• December 24, 2020

Discover how machine learning works

Author: Francesca Morpurgo

Machine Learning (ML for short) is one of the Artificial Intelligence research fields getting the most attention at the moment, probably because it delivers really effective results in very interesting areas, for example customer profiling, user behavior and intent to buy, and hyper-targeted advertising (for an example of the many possibilities offered by Machine Learning and AI, have a look at DataLit, our AI solution for Publishers). But how does machine learning work? Let’s look at it in some detail!

How machine learning works compared with how human learning works

We are all familiar with how human learning happens: we basically learn from experience, bringing order to a mass of raw data and drawing generalizations from it, in a process that involves endless trial and error. Machine Learning tries to simulate human learning in one respect: the ability to make guesses or predictions when presented with previously unseen data, on the basis of past experience (historical data) and of the generalizations drawn from it.

Of course, to make sense of data we use the theories we have previously learned or tacitly internalized. Machine Learning works in the same way: it uses an algorithm chosen because it suits the data set and the business problem it has to address; the algorithm is the theory that the computer uses to learn (to systematize, order and analyze the data) and to develop a model that it will use to predict future outputs.

The model that the computer builds is a mathematical model, a probabilistic function that captures how inputs and outputs correlate and can therefore forecast outputs for inputs that are not in the original data set.
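To make the idea of a model as a function concrete, here is a minimal sketch in Python. It uses a straight line fitted by least squares as the simplest possible stand-in for the function described above; the ad-spend and sales figures are invented for illustration and do not come from any real data set.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: ad spend (inputs) and sales (outputs).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(ad_spend, sales)


def predict(x):
    """The "model": a function from an input to a forecast output."""
    return slope * x + intercept


# 6.0 is not in the original data set, yet the model can forecast for it.
print(predict(6.0))
```

The fitted line is only as good as the assumption that inputs and outputs are roughly linearly related; real models are usually more flexible, but the principle of learning a function from known pairs is the same.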

Kinds of Machine Learning

There are four kinds of Machine Learning:

  1. Supervised Machine Learning
  2. Unsupervised Machine Learning
  3. Self-supervised Machine Learning
  4. Reinforcement Machine Learning

The kind you choose depends mainly on the data you have and on the results you need. 

Supervised Machine Learning

In Supervised Machine Learning you pre-label the data with which you train the algorithm, just as you would explain to a child what a tree is by showing them different examples of trees and non-trees. In this way the algorithm learns how to classify new instances of data on the basis of the labels you provided.
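As a rough illustration of learning from labels, the sketch below classifies a new example by copying the label of the closest labeled example (a 1-nearest-neighbour rule, chosen here only because it is short). The features, height and trunk diameter, and all the numbers are hypothetical.

```python
import math

# Hypothetical labeled examples: (height in metres, trunk diameter in cm) -> label.
labeled = [
    ((12.0, 40.0), "tree"),
    ((8.0, 25.0), "tree"),
    ((15.0, 60.0), "tree"),
    ((0.5, 1.0), "not-tree"),
    ((1.2, 2.0), "not-tree"),
    ((0.3, 0.5), "not-tree"),
]


def classify(point, examples):
    """Return the label of the closest labeled example (1-nearest-neighbour)."""
    _, label = min(examples, key=lambda ex: math.dist(point, ex[0]))
    return label


print(classify((10.0, 35.0), labeled))  # a tall plant with a thick trunk
print(classify((0.8, 1.5), labeled))    # a small plant with a thin stem
```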

Unsupervised Machine Learning

In Unsupervised Machine Learning you provide only the data, without labels; the algorithm finds regularities or patterns in the data by itself, then uses these findings to develop a model.
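One way to picture pattern finding without labels is clustering. The sketch below is a simplified one-dimensional k-means, used here as one example of an unsupervised technique; the monthly spend figures and the choice of two starting centers are assumptions made for the illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group values around the nearest center, then move each center to its group's mean."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data: monthly spend per customer.
spend = [10, 12, 11, 95, 100, 105]

# Start from the smallest and largest value; no labels are given.
centers, clusters = kmeans_1d(spend, [min(spend), max(spend)])
print(centers)
print(clusters)
```

Nobody told the algorithm that there are "low spenders" and "high spenders"; the two groups emerge from the data alone.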

Self-Supervised Machine Learning

In this kind of machine learning the labels are discovered from context by the algorithm itself (using embedded metadata, domain knowledge, or other strategies) and then used to develop the model.
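A small sketch of labels coming from the data itself: in the toy text below, each word serves as an input and the word that follows it serves as its label, so no human labeling is needed. Next-word prediction is just one assumed strategy among those mentioned above, and the sentence is invented.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# The labels are derived from context: input = a word, label = the next word.
pairs = list(zip(corpus, corpus[1:]))

next_word_counts = defaultdict(Counter)
for word, following in pairs:
    next_word_counts[word][following] += 1


def predict_next(word):
    """Return the word that most often followed `word`, or None if it was never seen."""
    counts = next_word_counts.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


print(predict_next("the"))
print(predict_next("dog"))
```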

Reinforcement Machine Learning

In Reinforcement Machine Learning you give the algorithm feedback (a “reward”) whenever it gets something right. Over many attempts it learns to prefer the choices that earn the reward.
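The reward idea can be sketched with a tiny agent that repeatedly chooses between two options and keeps a running estimate of the reward each one brings. The two banners, their fixed rewards and the exploration rate are all hypothetical; this is an epsilon-greedy rule, one of the simplest reinforcement strategies, not a full reinforcement learning system.

```python
import random

# Hypothetical environment: showing banner "B" earns a reward, banner "A" does not.
REWARDS = {"A": 0.0, "B": 1.0}


def run_agent(steps=100, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    values = {action: 0.0 for action in REWARDS}  # estimated reward per action
    counts = {action: 0 for action in REWARDS}    # how often each action was tried
    actions = list(REWARDS)
    for step in range(steps):
        if step < len(actions):
            action = actions[step]                 # try every action once
        elif rng.random() < epsilon:
            action = rng.choice(actions)           # explore occasionally
        else:
            action = max(actions, key=values.get)  # exploit the best estimate
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


values, counts = run_agent()
print(values)
print(counts)
```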

How does Machine Learning work: the Machine Learning process

But in practical terms, how does Machine Learning work? We can identify seven main steps:

First step: identify and define the business problem

If you want your Machine Learning to be effective, you first have to define your business problem clearly, namely the question or questions you want answered. If you underestimate the importance of this step, you are going to be disappointed by the results you obtain.

Second step: identify and structure the data set

One of the factors behind the recent boom in Machine Learning is the availability of huge amounts of data (the big data revolution), coming not only from the internet but also from the internet of things and from many other sources (e.g. big ecommerce companies). Once you have chosen your data set you have to split it into two distinct sets, the training set and the control set (often called the test set). You are going to use the first one to train your algorithm and the second one to check whether the results the trained model produces are correct. Then you have to clean up the data (for example, there may be duplicates or missing values). Also, depending on the type of machine learning you are going to use, you may want to structure or label your data. Obviously, the better your data, the better your Machine Learning model will work.
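The cleaning and splitting described above might look roughly like the sketch below. The customer rows, the 20% control fraction and the fixed random seed are assumptions for the example. The sketch cleans first and splits afterwards, which is one common ordering because it guarantees that both sets are cleaned in the same way.

```python
import random


def clean(rows):
    """Drop rows with missing values and exact duplicates, keeping the original order."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row):
            continue  # missing value
        if row in seen:
            continue  # duplicate
        seen.add(row)
        cleaned.append(row)
    return cleaned


def split(rows, control_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and return (training set, control set)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_control = max(1, int(len(shuffled) * control_fraction))
    return shuffled[n_control:], shuffled[:n_control]


# Hypothetical raw data: (age, gender, yearly income).
raw = [
    (25, "F", 32000), (31, "M", 41000), (47, "F", 58000), (52, "M", 61000),
    (38, "F", 45000), (29, "M", None), (25, "F", 32000), (44, "M", 52000),
    (60, "F", 39000), (35, "M", 47000), (23, "F", 28000), (55, "M", 66000),
]

cleaned = clean(raw)
train, control = split(cleaned)
print(len(raw), len(cleaned), len(train), len(control))
```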

Third step: decide which machine learning algorithm best fits your problem and your data set

“Machine Learning” essentially means finding a probabilistic function that accurately describes the relationship between known inputs and known outputs, so that when presented with new inputs it can forecast the likely outputs with minimal error. This is done by an algorithm that processes the data you feed it and finds patterns and relationships in the data. You will need to spend some time experimenting with algorithms to choose the best one for your problem and your data.
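Experimenting with algorithms can be as simple as fitting each candidate on the same training data and comparing their errors on data held back for checking. The sketch below compares two deliberately basic candidates, a constant predictor and a least-squares line, using mean absolute error; the candidates, the error measure and the numbers are all assumptions for illustration.

```python
def mean_absolute_error(predict, data):
    """Average distance between forecasts and known outputs."""
    return sum(abs(predict(x) - y) for x, y in data) / len(data)


def fit_constant(train):
    """Candidate 1: always predict the average output seen in training."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Candidate 2: a least-squares straight line."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in train)
        / sum((x - mean_x) ** 2 for x, _ in train)
    )
    return lambda x: mean_y + slope * (x - mean_x)


# Hypothetical (input, output) pairs.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
control = [(6, 12.0), (7, 13.9)]

candidates = {"constant": fit_constant, "line": fit_line}
errors = {
    name: mean_absolute_error(fit(train), control)
    for name, fit in candidates.items()
}
best = min(errors, key=errors.get)
print(errors)
print(best)
```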

Fourth step: train the algorithm and develop the Machine Learning model

Once you have the data set and the algorithm, it’s time to launch the program, and this is where the actual “learning” happens: using the training set, which contains pairs of inputs and outputs (e.g. customers of a certain age, gender and income, and the products they buy), the algorithm will “learn” how the two relate and, by developing a Machine Learning model, become able to forecast probable future outcomes with a certain degree of accuracy (e.g. what a certain kind of customer is likely to buy). Then you have to check the model against your control set, to see whether the forecasts are sufficiently close to the real results. If they are, you can proceed to test your model against real, previously unseen data.
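A minimal sketch of this train-then-check loop, using the customer example from the paragraph above. The classifier is a nearest-centroid rule picked for brevity; the customers, products and the 0.8 acceptance threshold are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(examples):
    """Learn one centroid (average feature vector) per output label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Forecast the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: math.dist(features, model[label]))


def accuracy(model, examples):
    """Share of examples for which the forecast matches the real output."""
    hits = sum(predict(model, features) == label for features, label in examples)
    return hits / len(examples)


# Hypothetical pairs: (age, income in thousands) -> product bought.
training_set = [
    ((22, 25), "headphones"), ((25, 30), "headphones"), ((28, 28), "headphones"),
    ((45, 70), "espresso machine"), ((50, 80), "espresso machine"),
    ((55, 75), "espresso machine"),
]
control_set = [
    ((24, 27), "headphones"),
    ((48, 72), "espresso machine"),
    ((30, 35), "headphones"),
]

ACCEPTABLE_ACCURACY = 0.8  # assumed threshold for "sufficiently close"

model = train_centroids(training_set)
score = accuracy(model, control_set)
print(score, score >= ACCEPTABLE_ACCURACY)
```

The control set is never shown to `train_centroids`, which is what makes the accuracy figure a fair check.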

Fifth step: test the Model against new data

At this point you have to try your model with new data and see if it delivers the expected results. Of course errors may happen, and you must be very careful here, because the computer only delivers outputs according to the function it discovered and doesn’t really “understand” them: it’s up to you to decide whether the results it gives you are as expected or not. If not, you should make adjustments.

Sixth step: if necessary, adjust the algorithm or the data

If you realize that your Machine Learning model is not delivering the expected results, the problem may lie in the algorithm or in the data you used for the training step. For example, you could provide it with more specific or more structured data.

Seventh step: retrain the algorithm periodically

Once you have a model that works, the job is not finished: you have to retrain the algorithm periodically with new data, so that it can keep its results up to date with the latest developments. Machine Learning, just like human learning, is a process that can never be considered truly finished.
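Periodic retraining could be organized along the lines of the sketch below: check how long ago the model was trained and, if enough time has passed, refit it on the old data plus the newly collected data. The 30-day interval, the dates, the figures and the trivial "predict the average" model are all assumptions for the example.

```python
from datetime import date, timedelta

RETRAIN_EVERY = timedelta(days=30)  # hypothetical schedule


def needs_retraining(last_trained, today, interval=RETRAIN_EVERY):
    """True once at least `interval` has passed since the last training run."""
    return today - last_trained >= interval


def fit_mean(observations):
    """Stand-in algorithm: the model simply predicts the average seen so far."""
    return sum(observations) / len(observations)


history = [100.0, 110.0, 90.0]   # data the current model was trained on
model = fit_mean(history)
last_trained = date(2020, 11, 1)

new_batch = [140.0, 160.0]       # observations collected since then
today = date(2020, 12, 24)

if needs_retraining(last_trained, today):
    history = history + new_batch  # fold the new data into the training data
    model = fit_mean(history)      # refit on old and new observations
    last_trained = today

print(model, last_trained)
```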

Machine Learning: a reality for your business

Of course the entire process is not easy to set up, even though it offers really interesting possibilities to anyone involved in the digital ecosystem. This is why, if you think your business could profit from Machine Learning, the best thing is to look for an AI solution, like DataLit, that offers ready-to-go solutions. For example, taking advantage of its advanced analytics system, DataLit.AI uses Machine Learning and AI technologies to hyper-profile users and translate the results into extremely targeted, and therefore effective, campaigns. It is just one of the many possibilities offered by DataLit AI technology.