What is machine learning? Picture a machine learning algorithm (an algorithm is nothing but a set of rules or steps to achieve some outcome) being trained to play the classic Atari game Breakout. Ten minutes into training, it is clumsy and misses the ball. Give it some more time, and it plays better than a human! The difference here is that instead of programming a traditional if-then-else construct that explicitly directs the machine to take rules-based decisions, we create algorithms that allow it to learn how to perform a particular task well, getting better and better with each iteration.
Now the first thing to note is that machine learning is not new. Check out this 1959 definition by Arthur Samuel: “Field of study that gives computers the ability to learn without being explicitly programmed”.
Arthur Samuel is perhaps the father of machine learning. He certainly coined the term. The Samuel checkers-playing program is probably the world’s first self-learning program, and an early demonstration of a broader concept: artificial intelligence.
A very small history lesson
Reading Samuel’s 1959 quote, you may have guessed that machine learning and AI in general are oldish. Which period do you think marks the beginning of AI? The 1930s, ’40s, ’50s? Before reading on, take a guess.
The conception of AI is actually ancient. Thousands of years ago, the Greeks discussed the possibility of placing a mind inside a mechanical body. This automaton was exemplified by Talos, a legendary mechanical man created to protect the land from pirates and invaders. As early as the seventeenth century, intellectuals including Gottfried Leibniz, Thomas Hobbes and Rene Descartes imagined that all thought could be represented as mathematical symbols (an idea which is fundamental to neural networks).
Ancient and medieval roots aside, the true ‘modern’ birth of AI can be attributed to Alan Turing’s publication of ‘Computing Machinery and Intelligence’ in 1950. The paper gave rise to the famous Turing test. This test involves a computer, a human player and a human judge in a game. If the judge is unable to tell the human from the computer based on their interactions, the computer wins the game.
Over the next several decades, there were many milestones in the development of machine learning. These ranged from the conceptualization of the first neural network (called the Perceptron), an algorithm inspired by how neurons behave in the brain, to IBM Watson’s defeat of the world champions at Jeopardy! in 2011. This was a triumph of natural language processing and represented a significant leap forward for ‘cognitive’ technologies, as Jeopardy! is not a mathematically precise rules-based game like chess, where the number of possible moves is limited.
Besides these big milestones, think about how we encounter machine learning every day, without even realising it. Examples range from the more obvious virtual assistants like Siri and Cortana (notice how they seem to get smarter after every interaction with you) to chatbots that can easily be mistaken for human service agents, recommender systems on search engines and websites, your ride-sharing app telling you the ETA of your next ride, the autopilot on the next flight you take, and your robo vacuum cleaner. Machine learning, and in a broader sense AI, is creeping into virtually every aspect of our lives.
So what is the difference between AI, Machine Learning, and the even more mysterious Deep Learning?
You may have noticed that I use the term ‘machine learning’ instead of AI. Most experts agree that ‘true AI’ is still very far from reality. In other words, the Terminator is not going to be around blowing things up any time soon, unless of course it is sent from the very distant future :-)
As a rule of thumb, machine learning is a subset of AI, and deep learning, a highly specialised form of machine learning, is in turn a subset of machine learning.
Deep learning deals with learning data representations, rather than task-specific algorithms. Deep neural networks, deep belief networks and recurrent neural networks have shown remarkable results in computer vision, speech recognition, natural language processing, machine translation and bioinformatics. Deep learning, which requires greater computational power than simpler machine learning algorithms, has come to the fore largely because of advances in computational technology such as GPUs and CPU-GPU architectures.
Now that you have had a small taste of what machine learning is, along with some examples of it, let’s get you started on understanding your very first ML algorithm: linear regression.
Let’s take an example of housing prices in relation to the size in square feet of a house.
We have a table as follows:
Size in square feet (X) -> Price in $1000s (Y)
The goal of linear regression is to fit a straight line to this dataset that most accurately represents the relationship between X and Y, and then to use that line to predict the price of the house (‘Y’) given its size in square feet (‘X’).
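To make "fit a straight line, then predict" concrete, here is a minimal sketch in plain Python. It assumes the line is fitted with ordinary least squares, which is one common choice rather than something this post prescribes. The function names and the four data points are made up purely for illustration.

```python
def fit_line(sizes, prices):
    """Ordinary least squares for one feature: returns (intercept, slope)."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(prices) / n
    # How X and Y vary together, and how much X varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_price(size, intercept, slope):
    """Read the predicted price (in $1000s) off the fitted line."""
    return intercept + slope * size


# Hypothetical data, invented for this sketch (not from a real dataset).
example_sizes = [1000, 1500, 2000, 2500]   # square feet (X)
example_prices = [200, 290, 410, 500]      # price in $1000s (Y)

intercept, slope = fit_line(example_sizes, example_prices)
print("predicted price for 1800 sq ft:", predict_price(1800, intercept, slope))
```

Once the intercept and slope are known, predicting the price of a house the model has never seen is just a matter of plugging its size into the line.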
The linear regression hypothesis can be represented in many ways. One such way is:
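[A function that takes x and some parameters as its input to compute the value of Y]
h_B(x) = B0 + B1*x1 + B2*x2 + ... + Bn*xn
Here B0 through Bn are the weights or parameter values (B0 is the intercept, the predicted value when every feature is 0), whereas x1 through xn represent the features. In the housing example there is only one feature, the size, so the hypothesis reduces to h_B(x) = B0 + B1*x1, with B0 and B1 as the only two weights.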
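If it helps to see the hypothesis as code, here is a small sketch. It assumes the common convention in which B0 is an intercept added on its own and each remaining weight multiplies one feature. The function name `hypothesis` and the example numbers are hypothetical.

```python
def hypothesis(weights, features):
    """Compute B0 + B1*x1 + ... + Bn*xn.

    weights  = [B0, B1, ..., Bn]  (B0 is the intercept)
    features = [x1, ..., xn]
    """
    if len(weights) != len(features) + 1:
        raise ValueError("expected one more weight than features (the intercept B0)")
    intercept, coefficients = weights[0], weights[1:]
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# One feature (size in square feet), with made-up weights B0 = 50, B1 = 0.1.
print(hypothesis([50, 0.1], [2000]))
```

The same function works unchanged for several features: pass a longer list of weights and a matching list of feature values.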
Though the housing example above is a simple univariate linear regression (i.e. with just one variable to predict the price of the house: the size in square feet), once we use multiple variables (e.g. location of the house, age, and number of bedrooms), the weight assigned to each variable becomes more important. For example, how much weight (or importance) should be attributed to the size of the house, as compared to the location of the house, in determining its price?
The goal of training the model, in this case, is to minimize the distance between the actual house prices and the prices predicted by our algorithm, that is, to make the algorithm’s cost function as small as possible. A cost of exactly 0 would mean a perfect linear regression model, one whose line passes through every data point. With real data the best-fitting line usually still leaves some error, so in practice training stops when the cost can no longer be meaningfully reduced.
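Here is a rough sketch of what "training" can look like in code. The post does not say which cost function or optimizer it has in mind, so this assumes mean squared error as the cost and plain batch gradient descent as the training procedure. The data, the function names, the learning rate and the step count are all made up for illustration.

```python
def mean_squared_error(b0, b1, xs, ys):
    """Cost: average squared gap between predicted and actual prices."""
    return sum((b0 + b1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.1, steps=5000):
    """Batch gradient descent on the cost above, starting from b0 = b1 = 0."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [b0 + b1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. b0 and b1.
        grad_b0 = 2 * sum(errors) / n
        grad_b1 = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        # Step each weight a little in the direction that lowers the cost.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Hypothetical data: sizes in THOUSANDS of square feet, prices in $1000s.
sizes_k = [1.0, 1.5, 2.0, 2.5]
prices_k = [200, 290, 410, 500]

b0, b1 = train(sizes_k, prices_k)
print("cost before training:", mean_squared_error(0.0, 0.0, sizes_k, prices_k))
print("cost after training: ", mean_squared_error(b0, b1, sizes_k, prices_k))
```

Sizes are expressed in thousands of square feet here because gradient descent on raw square-foot values would need a much smaller learning rate to stay stable. On noisy data like this, the cost settles at a positive value; it only reaches exactly 0 when the points lie perfectly on a line.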
Too simple? Remember that the intent of machine learning, and particularly deep learning, is to enable machines to learn complex concepts by breaking them down into simpler ones. Companies like PayPal use linear regression, in combination with other algorithms, to detect fraud.