Machine Learning and Deep Learning - Simplified - a Lot
Machine learning, as the name implies, means getting a machine to derive outcomes from complex scenarios or environments (i.e. inputs) without a human instructing it how to do so.
What is learning and how do we learn?
Learning, as we recall it, is often linked with teaching. We assume that we were taught the alphabet by our teacher or mother, starting with “A for apple”. But did it stop there? If it had stopped there, we would have been stuck with only apple to represent the letter A.
We actually learned that the letter “A” is such a powerful element of written language after our eyes identified the shape of “A”, and after we started recognizing that the shape is present not just in the word “Apple” but in many other places.
We learned that in most of the places where “A” was present, that part of the word was pronounced in a similar way. We identified a pattern and we expected that the pattern would persist. We made a guess or a judgment about what the shape “A” would do, based on the patterns that we observed.
That is learning.
Learning and adaptation are capabilities that humans and other animals possess but machines did not. Machines always relied on humans to instruct them on how to behave, and yes, humans did get tired of that. So humans tried to get machines to learn by themselves, using algorithms designed to identify patterns in input data.
But what exactly does machine learning do? And under what circumstances? So how can we simplify this further?
A typical classroom scenario, a “judgmental” teacher!
Let’s take the example of a classroom of 10 students. These students are all sitting the same exam and studying the same curriculum. The teacher periodically assesses them via in-class tests and assignments. He also maintains an attendance register. He asks questions in class and keeps track of the students who tend to answer more. Based on all these inputs he makes some calculations in his brain (he doesn’t know that he runs them, he calls it judgment) and arrives at the following conclusions:
Mark will obviously get more than 80% - he makes a prediction
Zara, Mark and Jess are the stars; Toby, Kyle and Serena can be pushed towards a 70% score; I do not have much confidence in the others - he clusters/segments the students who show similar behavior
When the number of students is high, the lecturer can no longer do the same calculation in his head. How is he supposed to “remember” a wide array of data and sort and analyze it?
He was then promoted to Dean of the same university. Now he is responsible for thousands of students instead of just 10. Even though he runs the same tests and assignments, it is impossible for him to “judge” the students based on those outcomes in his head alone.
As the number of students increased by 100X, the amount of data associated with them grew by the same factor. So he builds a mechanism/model: he enters the test results and attendance details of each student, and the computer gives him the outputs he was looking for. This is called machine learning.
Machine learning deduces results from complex data sets, but with human assistance. How was the machine assisted by the human in this case? The lecturer knew which variables to look at and how many students to look at. He put the data through a high-level classification and cleaning step before sending it to the machine. The machine then did the difficult part: running the complex calculations and “remembering” them in order to derive the desired outcomes.
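To make the lecturer's model a little more concrete, here is a minimal sketch in Python. It assumes the lecturer has picked just two variables (average test score and attendance) and that the machine learns how much weight to give each one from past students. The student numbers, the function names and the choice of a simple weighted sum trained by gradient descent are all assumptions made for illustration; the story does not say which method he actually used.

```python
# Hypothetical past students: (test average, attendance rate, final exam score),
# all scaled to the range 0..1. The numbers are made up for illustration.
past_students = [
    (0.90, 0.95, 0.88),
    (0.85, 0.90, 0.84),
    (0.70, 0.80, 0.69),
    (0.60, 0.75, 0.61),
    (0.50, 0.60, 0.48),
    (0.40, 0.55, 0.41),
]


def predict(weights, test_avg, attendance):
    """Weighted sum of the two human-chosen variables plus a bias term."""
    w_test, w_att, bias = weights
    return w_test * test_avg + w_att * attendance + bias


def mean_squared_error(weights, rows):
    """Average squared gap between predicted and actual final scores."""
    return sum((predict(weights, t, a) - y) ** 2 for t, a, y in rows) / len(rows)


def fit(rows, steps=5000, learning_rate=0.1):
    """Plain gradient descent: nudge the weights to shrink the error."""
    weights = [0.0, 0.0, 0.0]
    n = len(rows)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for t, a, y in rows:
            error = predict(weights, t, a) - y
            grad[0] += 2 * error * t / n
            grad[1] += 2 * error * a / n
            grad[2] += 2 * error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad)]
    return weights


untrained = [0.0, 0.0, 0.0]
weights = fit(past_students)

# Predict the final score of a new (hypothetical) student.
new_student_score = predict(weights, 0.80, 0.85)
print("error before learning:", mean_squared_error(untrained, past_students))
print("error after learning: ", mean_squared_error(weights, past_students))
print("predicted score:      ", round(new_student_score, 2))
```

Notice where the human sits in this sketch: the lecturer decided that test average and attendance are the variables that matter, and the machine only worked out the weights.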
An education minister who tries not to “judge” the skill level of a fish by its ability to climb a tree
A few years down the line, the lecturer enters politics and is now the minister of education. He is a smart education minister and believes that a student’s capabilities cannot be measured by just a standard set of examinations. Hence he talks with the heads of all the universities and emphasizes the importance of offering university admission to those bright students who are stars “in their own way”. While some students are very strong in analytical skills, some are really good leaders. And there are artists, sportsmen and future entrepreneurs. He believes that all of them deserve an equal chance of entering university.
To identify these star students, he requests schools to start recording all kinds of student achievements, not just academic ones: their grades, performances in school concerts, sports events, leadership positions they held, and even informal instances where they voluntarily made an impact on the school and their fellow students.
He then asks a pool of scientists to derive the desired outcome (i.e. identifying star students) from these data. In addition to the data formally collected via schools, any identifiable online data produced by the students will also be considered in the process.
An enormous ocean of data
The data that was collected belonged to millions of students, and it included all kinds of data: their grades (under different curricula and subjects), sports achievements, arts, displays of leadership skills, their social media interactions, and any content that they published online.
These data are large in volume, unstructured in nature and practically unbounded in the number of variables/features.
It is humanly impossible to do that analysis, period. We as humans cannot remember that much.
Human-assisted machine learning would again take a lot of effort, because humans would have to identify and grade these data and link them to the final desired outcome.
If there were something that identified complex patterns like a human does, but had the memory of a computer, this task could be achieved.
Essentially, a machine with neurons. Since real neurons are living cells, scientists built artificial structures that mimic them, called “neural networks”. Such a machine was fed this ocean of data (commonly called “Big Data”), and it ran calculations using its “brain”: doing calculations, learning something at each step and applying it at the next step, whilst “identifying and remembering” all the patterns.
This is what you call “Deep Learning”.
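For the curious, here is a heavily simplified sketch in Python of what “learning something at each step and applying it at the next step” can look like. It builds a tiny neural network with one hidden layer of three artificial neurons and trains it with backpropagation. Everything specific here is an assumption for illustration: the six made-up students, the two input scores, the rule that a student is a star if either score is high, and the network size. A real deep learning system has many more layers and vastly more data.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs behave the same

# Hypothetical toy data: (academic score, non-academic score) -> star (1) or not (0).
# A student counts as a star if either score is high. The numbers are made up.
data = [
    ((0.9, 0.2), 1.0),
    ((0.1, 0.8), 1.0),
    ((0.8, 0.9), 1.0),
    ((0.2, 0.1), 0.0),
    ((0.3, 0.3), 0.0),
    ((0.1, 0.2), 0.0),
]

HIDDEN = 3  # number of artificial neurons in the hidden layer


def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


# Each hidden neuron has two input weights and a bias; the output neuron has
# one weight per hidden neuron and a bias. Weights start out random.
hidden_w = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(HIDDEN)]
hidden_b = [0.0] * HIDDEN
out_w = [random.uniform(-1, 1) for _ in range(HIDDEN)]
out_b = 0.0


def forward(x):
    """Run one student through the network; return hidden activations and output."""
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + b)
              for w, b in zip(hidden_w, hidden_b)]
    output = sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)
    return hidden, output


def loss():
    """Mean squared error over all students."""
    return sum((forward(x)[1] - y) ** 2 for x, y in data) / len(data)


def train(epochs=2000, lr=0.5):
    """Backpropagation: after each student, adjust every weight a little."""
    global out_b
    for _ in range(epochs):
        for x, y in data:
            hidden, output = forward(x)
            # Error signal at the output neuron.
            delta_out = (output - y) * output * (1 - output)
            for j in range(HIDDEN):
                # Share of the error attributed to hidden neuron j
                # (computed before out_w[j] is changed).
                delta_hidden = delta_out * out_w[j] * hidden[j] * (1 - hidden[j])
                out_w[j] -= lr * delta_out * hidden[j]
                hidden_w[j][0] -= lr * delta_hidden * x[0]
                hidden_w[j][1] -= lr * delta_hidden * x[1]
                hidden_b[j] -= lr * delta_hidden
            out_b -= lr * delta_out


initial_loss = loss()
train()
final_loss = loss()
print("error before training:", initial_loss)
print("error after training: ", final_loss)
```

Nobody told the network which combinations of scores matter. The hidden neurons settle on their own intermediate patterns while the error shrinks, which is the idea behind the “brain” in the story, only at toy scale.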
In both Machine Learning and Deep Learning:
There are algorithms which mimic human judgment and logic by “learning” from each experiment through identifying patterns
The machine is far superior to humans in terms of memory. Machines do indeed have perfect memory
However, the differences are:
Machine learning deals with large yet finite data sets and a finite number of features and variables. It requires human assistance to determine which variables will have an impact on the desired outcome
Deep learning deals with enormous, practically unbounded data sets which are unstructured in nature and have a large pool of features and variables. It identifies, sorts and weighs these data largely by itself, without a human choosing the features
What happened to the lecturer?
If we start a story, we better end it.
Yes, the lecturer (now the minister of education) was able to deduce some good results on student performance from his deep learning algorithms. But he let both systems (traditional and new) run in parallel for a few years.
He fed the results, performance and achievements of the students who entered university via the new system back into the deep learning model. He expects that, through this feedback loop, the model will be able to tune its results better, and that about 3 – 5 years down the line he can make a real change to the country’s education system.