sanchit.tiwari@metro-services.in

Machine Learning – Let us Make it simple…

Blog Post created by sanchit.tiwari@metro-services.in on Jun 28, 2016

When I talk with business stakeholders and start explaining our data science products, the first question I often get is “What is machine learning?”, and I try my best to explain it in simple terms. I felt I should write an article that can serve as a reference, not only for users of our data science products but also for anyone curious to know more about machine learning.

Way back in 1959, Arthur Samuel gave a very simple definition of machine learning: “[Machine Learning is the] field of study that gives computers the ability to learn without being explicitly programmed.” I like to use this line whenever I start talking about machine learning, as it sums up many things.

And more recently, in 1997, Tom Mitchell gave a “well-posed” definition that has proven more useful to engineering types: “A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.”

So if you want your program to predict, for example, which customers will churn from your business (task T), you can run it through a machine learning algorithm with data about past customers’ purchasing behavior (experience E) and, if it has successfully “learned”, it will then do better at predicting future churners (performance measure P).
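To make T, E and P concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the customer data is made up, the only feature is “months since last purchase”, and the “learning” is just picking a cut-off halfway between the longest-inactive customer who stayed and the shortest-inactive customer who churned. Real churn models are far richer, but the pattern is the same: more experience E, better performance P on task T.

```python
# Made-up history: (months since last purchase, did the customer churn?)
past_customers = [
    (1, False), (8, True), (3, False), (10, True), (6, False), (7, True),
]

# Made-up "future" customers used only to measure performance P.
new_customers = [
    (1, False), (3, False), (5, False), (6, False),
    (7, True), (8, True), (9, True), (11, True),
]

def learn_threshold(examples):
    """Experience E: pick a cut-off halfway between stayers and churners."""
    longest_stayer = max(m for m, churned in examples if not churned)
    shortest_churner = min(m for m, churned in examples if churned)
    return (longest_stayer + shortest_churner) / 2

def accuracy(threshold, customers):
    """Performance P: share of customers whose churn is predicted correctly."""
    correct = sum((months > threshold) == churned for months, churned in customers)
    return correct / len(customers)

# Task T: predict churn. Train on 2, 4 and then 6 past customers.
scores = [accuracy(learn_threshold(past_customers[:n]), new_customers)
          for n in (2, 4, 6)]
print(scores)  # [0.75, 0.875, 1.0]
```

With this toy data the accuracy rises as the program sees more past customers, which is exactly what Mitchell’s definition calls learning.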

This sounds very simple, but machine learning is used to solve complex real-life problems in almost every field. If you feel there is a field where machine learning is not or cannot be applied, just let me know; I would be curious to learn more about it. To make your life easier, here are a few questions where machine learning has been applied with great success: “Which product to recommend?”, “Which customer will respond to this campaign?”, “Who is committing fraud?”, “Is this cancer?”, “What is the market value of this house?”, “Which of these people are good friends with each other?”, “Will this rocket engine explode on take off?”, “Will this person like this movie?”, “Who is this?”, “What did you say?”, and “How do you fly this thing?”.

Hope this helps. :-) Machine learning solves problems that cannot be solved by numerical means alone, and that is why I say it can be applied to any field.

Machine learning tasks can be classified in several categories, the main ones are:

  • supervised ML.
  • unsupervised ML.
  • reinforcement learning.

Now let me explain in simple words the kind of problems each category deals with. Supervised ML relies on data where the true label (class) of each example is known. As a simple example, suppose we want to teach a computer to distinguish pictures of different colors. We can download pictures of different colors together with their labels: a red picture has the label “red”, a yellow picture has the label “yellow”, and so on. Because we know the true color label of every picture, we can use this data to supervise our algorithm while it learns the right way to classify colors. Once the algorithm has learned how to classify pictures by color, we can give it a previously unseen picture and have it predict the label (red, yellow, etc.). Another example is labeling emails as spam or ham and then using the labeled data to supervise the algorithm so that it can classify new emails as spam or ham.
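The color example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each “picture” is reduced to a single made-up (red, green, blue) value, and the algorithm is a nearest-centroid classifier that I picked because it is easy to read, not because it is the best choice for real images.

```python
from math import dist

# Made-up labeled data: one (R, G, B) value per picture, plus its true label.
labeled_pictures = [
    ((250, 20, 30), "red"),
    ((230, 40, 25), "red"),
    ((200, 10, 10), "red"),
    ((250, 240, 30), "yellow"),
    ((240, 220, 10), "yellow"),
    ((255, 250, 60), "yellow"),
]

def train(examples):
    """Learn one 'typical' color (the average RGB value) per label."""
    sums, counts = {}, {}
    for rgb, label in examples:
        total = sums.setdefault(label, [0, 0, 0])
        for i, channel in enumerate(rgb):
            total[i] += channel
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in total)
            for label, total in sums.items()}

def predict(model, rgb):
    """Label an unseen picture with the closest typical color."""
    return min(model, key=lambda label: dist(model[label], rgb))

model = train(labeled_pictures)
print(predict(model, (245, 15, 20)))    # red
print(predict(model, (235, 230, 40)))   # yellow
```

The labels in `labeled_pictures` are what make this supervised: the algorithm is told the right answer for every training picture and only has to generalize to new ones.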

In unsupervised learning, on the other hand, we don’t have labels; we just pass the data to an algorithm, which divides it into groups based on some characteristic. For example, suppose that in the previous example we forgot to label the emails but still want to divide them into two groups. We can then use unsupervised ML to separate the emails into two groups based on some inherent features of the emails.
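Here is a small sketch of that idea, using a hand-written two-group version of K-means. The assumptions are mine: each email is described by two made-up features (number of links, number of exclamation marks), and the two starting centers are simply the first and last email in the list. Note that no label appears anywhere in the data.

```python
from math import dist

# Made-up, unlabeled emails: (number of links, number of exclamation marks).
emails = [(0, 1), (8, 9), (1, 0), (9, 7), (0, 0), (7, 8)]

def split_in_two(points, iterations=10):
    """Very small K-means with k = 2; returns a group number (0 or 1) per point."""
    centers = [points[0], points[-1]]   # arbitrary starting guesses
    groups = []
    for _ in range(iterations):
        # Step 1: put every email in the group whose center is closest.
        groups = [min((0, 1), key=lambda g: dist(p, centers[g])) for p in points]
        # Step 2: move each center to the average of its group.
        for g in (0, 1):
            members = [p for p, grp in zip(points, groups) if grp == g]
            if members:
                centers[g] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return groups

groups = split_in_two(emails)
print(groups)  # [0, 1, 0, 1, 0, 1]
```

The algorithm only tells us that the emails fall into group 0 and group 1. Deciding that one group looks like “spam” and the other like “ham” is still up to a human.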

Another well-known class of machine learning problems is called reinforcement learning. It can be easily illustrated by the example of learning to play chess. As input, the ML algorithm receives only the information about whether a game was won or lost. It does not have every move in the game labelled as successful or not, only the result of the whole game. The algorithm can therefore play a lot of games and each time give bigger “weights” to the moves that led to a win.
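Chess is far too big for a short example, so the sketch below swaps in a much smaller game as a stand-in: a pile of 10 stones, two players take 1 to 3 stones in turn, and whoever takes the last stone wins. The learner plays random games against a random opponent and is told only whether it won or lost each game. The “weight” of a move is simply the share of games won after making it. All names and numbers here are my own assumptions, and real reinforcement learning methods are considerably more refined.

```python
import random
from collections import defaultdict

def play_game(rng, start_pile=10):
    """Play one random game; return the learner's moves and whether it won."""
    pile = start_pile
    learner_moves = []            # (stones in pile, stones taken) by the learner
    learner_turn = True
    while True:
        take = rng.randint(1, min(3, pile))
        if learner_turn:
            learner_moves.append((pile, take))
        pile -= take
        if pile == 0:
            return learner_moves, learner_turn   # last stone taken wins
        learner_turn = not learner_turn

def train(n_games=20000, seed=0):
    """Weight of a move = fraction of games won in which that move was played."""
    rng = random.Random(seed)
    wins, plays = defaultdict(int), defaultdict(int)
    for _ in range(n_games):
        moves, won = play_game(rng)
        for move in moves:        # only the final result is credited to each move
            plays[move] += 1
            wins[move] += int(won)
    return {move: wins[move] / plays[move] for move in plays}

def best_move(weights, pile):
    """Pick the move with the biggest learned weight for this pile size."""
    options = range(1, min(3, pile) + 1)
    return max(options, key=lambda take: weights.get((pile, take), 0.0))

weights = train()
print(best_move(weights, 3))  # 3: taking all three stones wins on the spot
print(best_move(weights, 2))  # 2
```

Nobody ever told the program that taking the last stones is a good move; it worked that out from wins and losses alone.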

Under these main categories there are many algorithms, and I am listing some of them below, grouped by family. I am not going to explain any of them here, but you can always reach out to me if you want to learn more about a particular algorithm. I am listing them just for your reference, so that the next time you hear these names they do not sound alien to you. :-)

 

Decision Trees:-

  • CART
  • C4.5
  • C5.0
  • CHAID
  • Conditional Decision Tree

Clustering:-

  • Partitional (K-means, K-medoids)
  • Hierarchical Clustering
  • Density-based Clustering

Regression:-

  • Linear Regression
  • OLSR
  • Stepwise Regression
  • Logistic Regression

Ensemble:-

  • Bagging
  • Boosting
  • Random Forest
  • Gradient Boosting
  • AdaBoost
  • Gradient Boosting Regression Trees

Dimensionality Reduction:-

  • Principal Component Analysis (PCA)
  • Principal Component Regression (PCR)
  • Linear Discriminant Analysis (LDA)
  • RDA
  • FDA
  • MDA
  • Partial Least Squares Discriminant Analysis

Regularization:-

  • Lasso
  • Ridge
  • Elastic Net

Neural Network:-

  • Perceptron
  • Backpropagation

Deep Learning:-

  • DBN
  • DBM
  • CNN

Bayesian:-

  • Naive Bayes
  • Gaussian Naive Bayes
  • Bayesian Network

Instance Based:-

  • KNN
  • LVQ
  • LWL

This list is far from exhaustive; there are many more machine learning algorithms, and I could keep adding them as they come to mind. However, the idea of this article is not to list everything but to give you an idea of machine learning in a nutshell. If you are new to machine learning, this can serve as a starting point for a basic understanding; if you are experienced, take it as an example of how to educate your business stakeholders. Let me know your thoughts, comments, and suggestions on making machine learning simpler for the business.
