Rusty-machine is a machine learning library written entirely in Rust.
It focuses on the following:
Works out-of-the-box without relying on external dependencies.
Simple and easy to understand API.
Extensible and easy to configure.
Another machine learning library?
Machine Learning
"Field of study that gives computers the ability to learn without being explicitly programmed." - Arthur Samuel
How do machines learn?
With data.
Some examples
Predicting rent increase
Predicting whether an image contains a cat or a dog
Understanding handwritten digits
Data set might be:
rent prices and other facts about the residence.
labelled pictures of cats and dogs.
many examples of handwritten digits.
Some terminology
Model : An object that transforms inputs into outputs based on information in data.
Train/Fit : Teaching a model how it should transform inputs using data.
Predict : Feeding inputs into a model to receive outputs.
To predict rent increases we may use a Linear Regression model. We'd train the model on some rent prices and facts about the residences. Then we'd predict the rent of unlisted places.
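In rusty-machine that flow might look like the sketch below. The features, values, and matrix shapes are made up for illustration, and in recent versions of the library train and predict return a LearningResult you would unwrap:

use rusty_machine::learning::lin_reg::LinRegressor;
use rusty_machine::learning::SupModel;
use rusty_machine::linalg::{Matrix, Vector};

// Each row is a residence: [square metres, years since last increase]
let inputs = Matrix::new(3, 2, vec![50.0, 1.0,
                                    75.0, 2.0,
                                    90.0, 1.0]);
// The known rents for those residences
let targets = Vector::new(vec![1200.0, 1650.0, 1800.0]);

// Train a linear regression model on the data
let mut model = LinRegressor::default();
model.train(&inputs, &targets);

// Predict the rent of an unlisted place
let unlisted = Matrix::new(1, 2, vec![80.0, 1.0]);
let rent = model.predict(&unlisted);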
Before we go any further we should see an example.
K-Means
A model for clustering.
Using a K-Means Model
// ... Get the data samples

// Create a new model with 2 clusters
let mut model = KMeansClassifier::new(2);

// Train the model
model.train(&samples);

// Predict which cluster each point belongs to
let clusters: Vector<usize> = model.predict(&samples);
1. Get some initial guesses for the centroids (cluster centers).
2. Assign each point to the centroid it is closest to.
3. Update each centroid by taking the average of all points assigned to it.
4. Repeat steps 2 and 3 until convergence (see the sketch below).
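A minimal sketch of those steps on 1-D data (illustrative only: naive initialization, a fixed iteration count instead of a real convergence check, and not rusty-machine's actual implementation):

/// A minimal k-means on 1-D data. Returns the final centroids.
fn kmeans(samples: &[f64], k: usize, iters: usize) -> Vec<f64> {
    // 1. Initial guesses: take the first k points as centroids.
    let mut centroids: Vec<f64> = samples[..k].to_vec();
    for _ in 0..iters {
        // 2. Assign each point to the closest centroid.
        let assignments: Vec<usize> = samples.iter().map(|&x| {
            (0..k).min_by(|&a, &b| {
                (x - centroids[a]).abs()
                    .partial_cmp(&(x - centroids[b]).abs())
                    .unwrap()
            }).unwrap()
        }).collect();
        // 3. Move each centroid to the mean of its assigned points.
        for c in 0..k {
            let members: Vec<f64> = samples.iter().zip(&assignments)
                .filter(|&(_, &a)| a == c)
                .map(|(&x, _)| x)
                .collect();
            if !members.is_empty() {
                centroids[c] = members.iter().sum::<f64>() / members.len() as f64;
            }
        }
    }
    centroids
}

let samples = [1.0, 1.2, 0.8, 8.0, 8.2, 7.9];
let centroids = kmeans(&samples, 2, 10); // settles near [1.0, 8.03]

On this sample data the two centroids settle near 1.0 and 8.0, one per cluster.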
K-Means Classification
Simple but complicated
The APIs for the other models aim to be as simple as that one. However...
Machine learning is complicated.
Rusty-machine aims for ease of use.
How does rusty-machine (try to) keep things simple?
Using traits
A clean, simple model API
Extensibility at the user level
Reusable components within the library
Extensibility
We use traits to define parts of the models.
Rusty-machine provides common defaults, but users can write their own implementations and plug them in.
Extensibility Example
Support Vector Machine
/// A Support Vector Machine
pub struct SVM<K: Kernel> {
    ker: K,
    /// Some other fields
    /* ... */
}

pub trait Kernel {
    /// The kernel function.
    ///
    /// Takes two equal length slices and returns a scalar.
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64;
}
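Since Kernel is just a trait, users can plug in their own. Here is a hypothetical dot-product kernel (rusty-machine ships its own kernels; this one is purely illustrative):

/// A user-defined kernel: the plain dot product.
struct DotProduct;

impl Kernel for DotProduct {
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64 {
        x1.iter().zip(x2).map(|(a, b)| a * b).sum()
    }
}

// It now works anywhere a Kernel is expected:
// let model = SVM::new(DotProduct);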
Combining kernels
K1(x1, x2) + K2(x1, x2) = K(x1, x2)
pub struct KernelSum<T, U>
    where T: Kernel,
          U: Kernel
{
    k1: T,
    k2: U,
}

/// Computes the sum of the two associated kernels.
impl<T, U> Kernel for KernelSum<T, U>
    where T: Kernel,
          U: Kernel
{
    fn kernel(&self, x1: &[f64], x2: &[f64]) -> f64 {
        self.k1.kernel(x1, x2) + self.k2.kernel(x1, x2)
    }
}
Combining kernels
K1(x1, x2) + K2(x1, x2) = K(x1, x2)
let poly_ker = kernel::Polynomial::new(...);
let hypert_ker = kernel::HyperTan::new(...);
let sum_kernel = poly_ker + hypert_ker;
let mut model = SVM::new(sum_kernel);
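The `+` works because the kernel types overload Rust's Add operator to build a KernelSum. A sketch of that wiring for one kernel type (the library automates this across its kernels; details may differ):

use std::ops::Add;

impl<U: Kernel> Add<U> for kernel::Polynomial {
    type Output = KernelSum<kernel::Polynomial, U>;

    fn add(self, other: U) -> Self::Output {
        KernelSum { k1: self, k2: other }
    }
}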
Reusability
We use traits to define common components, e.g. Kernels.
These components can be swapped in and out of models.
New models can easily make use of these common components.
Reusability Example
Gradient Descent Solvers
We use Gradient Descent to minimize a cost function.
All Gradient Descent Solvers implement this trait.
/// Trait for gradient descent algorithms. (Some things omitted)
pub trait OptimAlgorithm<M: Optimizable> {
    /// Return the optimized parameters using gradient optimization.
    fn optimize(&self, model: &M, ...) -> Vec<f64>;
}
The Optimizable trait is implemented by a model which is differentiable.
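For reference, Optimizable looks roughly like this (a sketch from the library docs; the slides that follow elide the inputs, targets, and the returned cost value):

/// Trait for models that can compute the cost and its gradient
/// with respect to the model parameters.
pub trait Optimizable {
    type Inputs;
    type Targets;

    /// Compute the gradient at `params`, returning (cost, gradient).
    fn compute_grad(&self,
                    params: &[f64],
                    inputs: &Self::Inputs,
                    targets: &Self::Targets)
                    -> (f64, Vec<f64>);
}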
Creating a new model
With gradient descent optimization
Define the model.
/// Cost function is: f(x) = (x - c)^2
struct XSqModel {
    c: f64,
}
You can think of this model as learning the value c.
Creating a new model
With gradient descent optimization
Implement Optimizable for model.
/// Cost function is: f(x) = (x - c)^2
struct XSqModel {
    c: f64,
}
impl Optimizable for XSqModel {
    /// 'params' here is 'x'
    fn compute_grad(&self, params: &[f64], ...) -> Vec<f64> {
        vec![2f64 * (params[0] - self.c)]
    }
}
Creating a new model
With gradient descent optimization
Use an OptimAlgorithm to compute the optimized parameters.
/// Cost function is: f(x) = (x - c)^2
struct XSqModel {
    c: f64,
}
impl Optimizable for XSqModel {
    fn compute_grad(&self, params: &[f64], ...) -> Vec<f64> {
        vec![2f64 * (params[0] - self.c)]
    }
}
let x_sq = XSqModel { c : 1.0 };
let x_start = vec![30.0];
let gd = GradientDesc::default();
let optimal = gd.optimize(&x_sq, &x_start, ...);
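Here optimal should come out close to vec![1.0], the value of c. To make the mechanics concrete, here is a self-contained sketch of what a basic solver does internally (the fixed step size and iteration count are assumptions; this is not GradientDesc's actual implementation):

/// Simplified stand-in for the Optimizable trait.
trait Differentiable {
    fn grad(&self, params: &[f64]) -> Vec<f64>;
}

/// Cost function is: f(x) = (x - c)^2
struct XSq { c: f64 }

impl Differentiable for XSq {
    fn grad(&self, params: &[f64]) -> Vec<f64> {
        vec![2.0 * (params[0] - self.c)]
    }
}

/// Step the parameters against the gradient a fixed number of times.
fn descend<M: Differentiable>(model: &M, start: &[f64],
                              lr: f64, iters: usize) -> Vec<f64> {
    let mut params = start.to_vec();
    for _ in 0..iters {
        let grad = model.grad(&params);
        for (p, g) in params.iter_mut().zip(&grad) {
            *p -= lr * g;
        }
    }
    params
}

let optimal = descend(&XSq { c: 1.0 }, &[30.0], 0.1, 200);
println!("{:?}", optimal); // ~[1.0]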
Looking ahead, Python bindings are especially exciting as they would give us access to lots of existing tooling.