Rusty-machine is a machine learning library written **entirely** in Rust.

It focuses on the following:

- Works out-of-the-box without relying on external dependencies.
- Simple and easy to understand API.
- Extensible and easy to configure.

We solve these problems *with data*:

- Predicting rent increases, using rent prices and other facts about the residence.
- Predicting whether an image contains a cat or a dog, using *labelled* pictures of cats and dogs.
- Understanding hand written digits, using many examples of hand written digits.

**Model**: An object that transforms *inputs* into *outputs* based on information in data.

**Train/Fit**: Teaching a model how it should transform *inputs* using data.

**Predict**: Feeding *inputs* into a model to receive *outputs*.

To predict rent increases we may use a *Linear Regression* **Model**. We'd **train**
the model on some rent prices and facts about the residence. Then we'd **predict** the rent of unlisted places.

There are many, many models to choose from.

There are many, many ways to use each model.

```
pub trait Model<T, U> {
    fn train(&mut self, inputs: &T, targets: &U);
    fn predict(&self, inputs: &T) -> U;
}
```
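As a toy illustration of this trait (a hedged sketch, not rusty-machine's actual code: `TinyLinReg` is invented here, and the trait is repeated so the snippet compiles on its own), here is a one-feature linear model fitting `y = w * x` by least squares:

```rust
pub trait Model<T, U> {
    fn train(&mut self, inputs: &T, targets: &U);
    fn predict(&self, inputs: &T) -> U;
}

/// One-feature linear model with no intercept (hypothetical example).
struct TinyLinReg {
    w: f64,
}

impl Model<Vec<f64>, Vec<f64>> for TinyLinReg {
    /// Closed-form least squares: w = sum(x*y) / sum(x*x).
    fn train(&mut self, inputs: &Vec<f64>, targets: &Vec<f64>) {
        let xy: f64 = inputs.iter().zip(targets).map(|(x, y)| x * y).sum();
        let xx: f64 = inputs.iter().map(|x| x * x).sum();
        self.w = xy / xx;
    }

    /// Apply the learned weight to each input.
    fn predict(&self, inputs: &Vec<f64>) -> Vec<f64> {
        inputs.iter().map(|x| self.w * x).collect()
    }
}

fn main() {
    let mut model = TinyLinReg { w: 0.0 };
    model.train(&vec![1.0, 2.0, 3.0], &vec![2.0, 4.0, 6.0]);
    // The data lie on y = 2x, so the model learns w = 2.
    assert_eq!(model.predict(&vec![5.0]), vec![10.0]);
}
```

Callers only ever see `train` and `predict`; everything else is an implementation detail of the model.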

K-Means, for example, is a model for *clustering*.

The API for other models aims to be as simple as that one. However...

Machine learning is complicated.

Rusty-machine aims for ease of use.

```
/// A Support Vector Machine
pub struct SVM<K: Kernel> {
    ker: K,
    // Some other fields
    /* ... */
}
```

```
pub trait Kernel {
    /// The kernel function.
    ///
    /// Takes two equal length slices and returns a scalar.
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64;
}
```
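A hedged sketch of one possible implementor (the `Linear` type is invented here for illustration, not taken from rusty-machine): the plain linear kernel, k(x1, x2) = ⟨x1, x2⟩.

```rust
pub trait Kernel {
    /// Takes two equal length slices and returns a scalar.
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64;
}

/// The linear kernel: the dot product of the two inputs.
struct Linear;

impl Kernel for Linear {
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64 {
        x1.iter().zip(x2).map(|(a, b)| a * b).sum()
    }
}

fn main() {
    // 1*3 + 2*4 = 11
    assert_eq!(Linear.kernel(&[1.0, 2.0], &[3.0, 4.0]), 11.0);
}
```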

```
pub struct KernelSum<T, U>
    where T: Kernel,
          U: Kernel
{
    k1: T,
    k2: U,
}
```

```
/// Computes the sum of the two associated kernels.
impl<T, U> Kernel for KernelSum<T, U>
    where T: Kernel,
          U: Kernel
{
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64 {
        self.k1.kernel(x1, x2) + self.k2.kernel(x1, x2)
    }
}
```

```
let poly_ker = kernel::Polynomial::new(...);
let hypert_ker = kernel::HyperTan::new(...);
let sum_kernel = poly_ker + hypert_ker;
let mut model = SVM::new(sum_kernel);
```

We use traits to define common components, e.g. *Kernels*.

These components can be swapped in and out of models.

New models can easily make use of these common components.
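The `poly_ker + hypert_ker` expression above works because `+` can be overloaded via `std::ops::Add` to build a `KernelSum`. A hedged, self-contained sketch of the idea (the `Constant` kernel is invented for illustration, and rusty-machine's actual `Add` implementations may be structured differently):

```rust
use std::ops::Add;

pub trait Kernel {
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64;
}

/// The sum of two kernels, itself a kernel.
pub struct KernelSum<T: Kernel, U: Kernel> {
    k1: T,
    k2: U,
}

impl<T: Kernel, U: Kernel> Kernel for KernelSum<T, U> {
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64 {
        self.k1.kernel(x1, x2) + self.k2.kernel(x1, x2)
    }
}

/// A toy kernel that ignores its inputs, used so `+` has something to combine.
struct Constant(f64);

impl Kernel for Constant {
    fn kernel(&self, _x1: &[f64], _x2: &[f64]) -> f64 {
        self.0
    }
}

/// Overloading `+` so that `k1 + k2` builds a KernelSum.
impl<U: Kernel> Add<U> for Constant {
    type Output = KernelSum<Constant, U>;
    fn add(self, other: U) -> KernelSum<Constant, U> {
        KernelSum { k1: self, k2: other }
    }
}

fn main() {
    let sum = Constant(1.5) + Constant(2.0);
    // 1.5 + 2.0 = 3.5 regardless of the inputs.
    assert_eq!(sum.kernel(&[0.0], &[0.0]), 3.5);
}
```

Because the result of `+` implements `Kernel` too, composed kernels drop straight into any model that is generic over `K: Kernel`.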

We use Gradient Descent to minimize a *cost* function.

```
/// Trait for gradient descent algorithms. (Some things omitted)
pub trait OptimAlgorithm<M: Optimizable> {
    /// Return the optimized parameters using gradient optimization.
    fn optimize(&self, model: &M, ...) -> Vec<f64>;
}
```

The **Optimizable** trait is implemented by a model which is differentiable.

Define the model.

```
/// Cost function is: f(x) = (x-c)^2
struct XSqModel {
    c: f64,
}
```

You can think of this model as *learning* the value **c**.

Implement **Optimizable** for model.

```
/// Cost function is: f(x) = (x-c)^2
struct XSqModel {
    c: f64,
}

impl Optimizable for XSqModel {
    /// 'params' here is 'x'
    fn compute_grad(&self, params: &[f64], ...) -> Vec<f64> {
        vec![2f64 * (params[0] - self.c)]
    }
}
```

Use an **OptimAlgorithm** to compute the optimized parameters.

```
/// Cost function is: f(x) = (x-c)^2
struct XSqModel {
    c: f64,
}

impl Optimizable for XSqModel {
    fn compute_grad(&self, params: &[f64], ...) -> Vec<f64> {
        vec![2f64 * (params[0] - self.c)]
    }
}

let x_sq = XSqModel { c: 1.0 };
let x_start = vec![30.0];
let gd = GradientDesc::default();
let optimal = gd.optimize(&x_sq, &x_start, ...);
```
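To make the idea concrete, here is a hedged, self-contained sketch of what `GradientDesc` does under the hood (a plain fixed-step gradient descent loop, not rusty-machine's actual implementation; the `optimize` helper and its learning-rate and iteration parameters are invented here):

```rust
/// Cost function is: f(x) = (x-c)^2
struct XSqModel {
    c: f64,
}

impl XSqModel {
    /// Gradient of (x - c)^2 with respect to x.
    fn compute_grad(&self, params: &[f64]) -> Vec<f64> {
        vec![2f64 * (params[0] - self.c)]
    }
}

/// Repeatedly step against the gradient with a fixed learning rate.
fn optimize(model: &XSqModel, start: &[f64], lr: f64, iters: usize) -> Vec<f64> {
    let mut params = start.to_vec();
    for _ in 0..iters {
        let grad = model.compute_grad(&params);
        for (p, g) in params.iter_mut().zip(&grad) {
            *p -= lr * g;
        }
    }
    params
}

fn main() {
    let x_sq = XSqModel { c: 1.0 };
    let optimal = optimize(&x_sq, &[30.0], 0.1, 200);
    // Each step shrinks (x - c) by a factor of 0.8, so x converges to c = 1.
    assert!((optimal[0] - 1.0).abs() < 1e-6);
}
```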

Ease of use

- Trait system is amazing.
- Error handling is amazing.
- Performance-focused code*.

* Rusty-machine needs some work, but the future looks bright!

What's next:

- Optimizing and stabilizing existing models.
- Providing optional use of BLAS/LAPACK/CUDA/etc.
- Addressing the lack of tooling.

From Scikit-learn's FAQs.