Machine Learning with Numl

I recently came across a really good library called Numl. Numl is a .NET library that aims to make machine learning much more accessible to developers by abstracting away the complex parts. What is machine learning, I hear you ask? The Numl site describes it as follows:

The purpose of machine learning is to find (and exploit) patterns in data. Traditionally developers, when faced with a problem, develop an algorithm and write code. Certain classes of problems, however, do not lend themselves to this approach. With machine learning, the developer instead supplies relevant data to the machine and allows the computer to create the appropriate algorithm.

Numl does this via supervised and unsupervised learning, which the site describes as follows:

  • Supervised Learning: Supervised learning is the branch of machine learning that deals primarily with prediction. Given examples, supervised learning algorithms create models that generalize the decision-making process. In essence, the machine learns from the past in order to accurately predict the future.
  • Unsupervised Learning: Unsupervised learning is the branch of machine learning that strives to understand the structure of data. This data, unlike supervised learning, does not have a predefined outcome that requires prediction but is vast enough to require a principled approach to either visual or physical compression.

Supervised learning is the part that I find most interesting and useful for my purposes. With supervised learning you provide a labelled set of examples, which can be supplied as lists of objects, data tables and so on. These data sets contain examples of how decisions were made previously.

This collection of objects is then converted into a matrix and a vector. Each matrix column represents one feature used to make decisions, while each row is the numerical representation of one object. The vector is a list of answers, one for each matrix row. The Descriptor object is a mapping between object properties and their corresponding numerical representation. The computer then uses the previously generated data to train a model. The shape of the model depends entirely on the learning algorithm chosen. It can be a single vector, a tree, or even a collection of points. This model will then predict the target label given a new object of the same shape used during training.
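To make the matrix-and-vector idea concrete, here is a minimal sketch in plain C#. It does not use Numl and is not how the Descriptor is actually implemented; the `Observation`, `Sky` and `Encoder` names are hypothetical, and the encoding (enum ordinal, bool as 0 or 1) is just one simple choice I am assuming for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical example types; not part of Numl.
public enum Sky { Sunny, Overcast, Rainy }

public class Observation
{
    public Sky Sky { get; set; }      // feature
    public bool Windy { get; set; }   // feature
    public bool Play { get; set; }    // label (the "answer")
}

public static class Encoder
{
    // One row per object, one column per feature.
    // Assumed encoding: enum ordinal for Sky, 1.0/0.0 for Windy.
    public static double[][] ToMatrix(Observation[] items) =>
        items.Select(o => new[] { (double)(int)o.Sky, o.Windy ? 1.0 : 0.0 })
             .ToArray();

    // One answer per matrix row, in the same order as the rows.
    public static double[] ToVector(Observation[] items) =>
        items.Select(o => o.Play ? 1.0 : 0.0).ToArray();
}
```

With this encoding, a rainy, windy day on which nobody played becomes the matrix row [2, 1] with the answer 0 in the vector.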

Supervised Learning – Simple Example

Numl is very easy to set up in Visual Studio. The library is distributed via NuGet, so from the NuGet console you just need to type:

    Install-Package numl

I will quickly run through the simple example used on the Numl site. It is a very simplistic example, but it will give you a good idea of what to expect. We want to determine whether, under certain weather conditions, it is OK to play a game of tennis. First, let's look at some sample data.

This is just static, hard-coded data; normally you would use a larger data set from a database or a similar source.

    public enum Outlook
    {
        Sunny,
        Overcast,
        Rainy
    }

    public enum Temperature
    {
        Low,
        High
    }

    public class Tennis
    {
        [Feature]
        public Outlook Outlook { get; set; }
        [Feature]
        public Temperature Temperature { get; set; }
        [Feature]
        public bool Windy { get; set; }
        [Label]
        public bool Play { get; set; }

        public static Tennis[] GetData()
        {
            return new Tennis[]  {
                new Tennis { Play = true, Outlook=Outlook.Sunny, Temperature = Temperature.Low, Windy=true},
                new Tennis { Play = false, Outlook=Outlook.Sunny, Temperature = Temperature.High, Windy=true},
                new Tennis { Play = false, Outlook=Outlook.Sunny, Temperature = Temperature.High, Windy=false},
                new Tennis { Play = true, Outlook=Outlook.Overcast, Temperature = Temperature.Low, Windy=true},
                new Tennis { Play = true, Outlook=Outlook.Overcast, Temperature = Temperature.High, Windy= false},
                new Tennis { Play = true, Outlook=Outlook.Overcast, Temperature = Temperature.Low, Windy=false},
                new Tennis { Play = false, Outlook=Outlook.Rainy, Temperature = Temperature.Low, Windy=true},
                new Tennis { Play = true, Outlook=Outlook.Rainy, Temperature = Temperature.Low, Windy=false}
            };
        }
    }

In this example we have a Tennis class that contains some properties. Outlook, Temperature and Windy are the features used to train the model. The property called Play is the label, that is, the outcome, which in this case is whether to play a game of tennis or not.

If we look at the first example in the GetData method, we will play a game of tennis if the outlook is sunny and the temperature is low, even though it is windy.

Once we have our sample data, we need to train the model. This is done with only a few lines of code.

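    Tennis[] data = Tennis.GetData();
    // Map the Tennis properties to their numerical representation.
    var d = Descriptor.Create<Tennis>();
    var g = new DecisionTreeGenerator(d);
    g.SetHint(false);
    // Train on 80% of the data, test on 20%, and keep the best of 1000 runs.
    var model = Learner.Learn(data, 0.80, 1000, g);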

In this case the Learner uses 80% of the data to train the model and 20% to test the model. The learner also runs the generator 1000 times and returns the most accurate model.
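The split-and-repeat idea described above can be sketched in plain C# without Numl. This is only my own illustration of the general technique (shuffle, hold out a test portion, score, keep the best run), not the library's implementation; `HoldOut`, the `train` delegate standing in for a generator, and the `label` delegate are all hypothetical names.

```csharp
using System;
using System.Linq;

public static class HoldOut
{
    // Shuffles the examples, trains on the first trainFraction of them,
    // scores the model on the remainder, and keeps the most accurate
    // model seen over `repeats` runs.
    public static (Func<T, bool> Model, double Accuracy) Learn<T>(
        T[] data, double trainFraction, int repeats,
        Func<T[], Func<T, bool>> train,   // hypothetical stand-in for a generator
        Func<T, bool> label,              // reads the known answer from an example
        Random rng)
    {
        Func<T, bool> best = null;
        double bestAccuracy = -1.0;
        int cut = (int)(data.Length * trainFraction);

        for (int i = 0; i < repeats; i++)
        {
            // Sorting by a random key gives a simple shuffle.
            T[] shuffled = data.OrderBy(_ => rng.Next()).ToArray();
            T[] trainSet = shuffled.Take(cut).ToArray();
            T[] testSet = shuffled.Skip(cut).ToArray();

            Func<T, bool> model = train(trainSet);

            // Accuracy = fraction of held-out examples predicted correctly.
            double accuracy = testSet.Length == 0
                ? 0.0
                : testSet.Count(x => model(x) == label(x)) / (double)testSet.Length;

            if (accuracy > bestAccuracy)
            {
                best = model;
                bestAccuracy = accuracy;
            }
        }

        return (best, bestAccuracy);
    }
}
```

In this sketch, eight examples and a fraction of 0.80 give six training examples and two test examples per run, because the cut point is rounded down. With so few test examples the accuracy figure is very coarse, which is one reason a larger data set helps.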

Now that the model has been trained, we can start to make predictions with it. In the example on the site, they check whether it is OK to play a game of tennis when the outlook is overcast, the temperature is low, and it is windy.

    Tennis t = new Tennis
    {
        Outlook = Outlook.Overcast,
        Temperature = Temperature.Low,
        Windy = true
    };

    model.Predict(t);

In this example the expected prediction is that Play is true.
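To see why true is a sensible answer, it helps to picture what a decision tree over the sample data could look like. The rules below are written by hand to agree with all eight sample rows; they are not output from Numl, and the tree the library actually learns may split the data differently. The enums are repeated so the block compiles on its own, and `HandWrittenTree` is a hypothetical name.

```csharp
using System;

public enum Outlook { Sunny, Overcast, Rainy }
public enum Temperature { Low, High }

public static class HandWrittenTree
{
    // Hand-derived rules that agree with every row of the sample data.
    // This is an illustration only, not a tree produced by Numl.
    public static bool Play(Outlook outlook, Temperature temperature, bool windy)
    {
        switch (outlook)
        {
            case Outlook.Overcast:
                // Every overcast row in the sample data has Play = true.
                return true;
            case Outlook.Sunny:
                // Sunny rows split cleanly on temperature.
                return temperature == Temperature.Low;
            default:
                // Rainy rows split cleanly on wind.
                return !windy;
        }
    }
}
```

Calling `HandWrittenTree.Play(Outlook.Overcast, Temperature.Low, true)` follows the overcast branch and returns true, matching the expected prediction.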

Using the supervised machine learning algorithms is primarily a two-step process. The first step is to instantiate a generator object. The generator object then produces the actual model.

The actual supervised algorithms are executed in the generator object. Some of these models are computationally expensive and take some time to complete. Each generator class is paired with a model that represents the output of the machine learning algorithm. Although these vary in size and functionality, their ultimate goal is to predict based upon the learned model.

Currently this library contains the following algorithms for supervised learning:

  • Perceptron
  • Kernelized Perceptron
  • K Nearest Neighbors
  • Decision Trees
  • Naive Bayes
  • Neural Networks

I think this library is fantastic and it certainly has many uses. Numl makes what is normally a very complicated discipline much simpler to digest. Whilst the library is very easy to use, you will need to experiment with generating different kinds of models and testing the outcomes. The more data you can train the system with, the better.

5 thoughts on “Machine Learning with Numl”

  1. qwertie

    I have occasionally seen articles about particular machine learning algorithms, which can typically be summed up as “okay so you define a bunch of input neurons, hidden neurons and output neurons and now you’ve got an AI! Congratulations!” and I am left with the distinct impression they forgot to tell me something important, like how to wire up the neurons or why they work the way they do…

    What I have not seen yet out on the web is an explanation of how to select a machine-learning algorithm based on the nature of the input data and the desired outcome. I also have not seen a guidebook for figuring out what kinds of things are best solved with machine learning versus what kinds of things are best solved by traditional algorithms and heuristics. As an experienced non-AI programmer, every problem to me looks like it should be solved with an algorithm. Using AI simply does not occur to me.

    Personally I’m interested in natural language processing. I wonder where NLP and AI intersect. I know I should just go back to university for it, but my local uni has no experts in NLP. *rambling detected. killing process.*
