Machine Learning 101
Computers have become ubiquitous in this day and age, permeating every facet of our lives. Being Turing complete, they can carry out any set of given instructions, unwaveringly. While this is perfect for tasks involving exact mathematical processes, say, calculating pi to over 22 trillion digits, they can't perform simple tasks a child could, such as distinguishing a cat from a bird or reading handwriting. Machine learning is the field of computer science that aims to achieve these feats by giving computers the ability to operate without explicit instruction.
There are a variety of approaches and algorithms being studied, from decision trees to inductive logic programming. One technique in particular, artificial neural networks, tries to emulate what some might say is the most advanced computer ever known: the human brain. Neural networks, as the name implies, are simply networks of interconnected nodes known as neurons. Each neuron processes multiple inputs and decides whether or not to pass a signal on to the neurons after it. The simplicity of individual neurons allows them to be interconnected into vast networks, amplifying their processing power with each layer.
As ever-larger artificial neural networks are built with more and more layers, their hidden layers grow deeper and deeper; this is where the term “Deep Learning” comes from.
From Biology to Bits
Biological neurons communicate through chemicals called neurotransmitters. These molecules bind with receptors on the cell's dendrites and alter the electric potential across its membrane by allowing ions to flow in and out of the cell. When a large enough electric potential builds up, a signal propagates through the neuron and down its axon, where the cell releases neurotransmitters across a synapse to the dendrites of the next neuron.
All of that is a little convoluted to simulate directly, but we can reduce the process to something mathematically modelable. Inputs can be anything from as complicated as one dimension of a dataset to as simple as a binary yes or no. An artificial neuron multiplies each input by a variable weight, then sums the results. Input weights are adjusted through a process called backpropagation (we'll get more into this later), which allows the network to evolve and produce "better" results (by whatever metric that's measured) over time. To determine whether or not the neuron will "fire," and what signal to pass to the next neuron, the sum of the weighted inputs is passed through an activation function. A biological neuron's activation function is simply a threshold that must be surpassed, but an artificial neuron's can be more complicated. One activation function commonly employed is the sigmoid function.
The sigmoid function, courtesy Wikimedia Commons.
The sigmoid function, σ(x) = 1 / (1 + e^(−x)), is an "S"-shaped curve that maps the weighted input sum to a value between 0 and 1 (which helps simplify calculations with normalized data).
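Putting those pieces together, here is a minimal sketch of a single artificial neuron in Python. The input values and weights are made-up placeholders, chosen only to illustrate the weighted-sum-then-activation computation described above:

```python
import math

def sigmoid(x):
    """The sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights):
    """One artificial neuron: multiply each input by its weight,
    sum the results, and pass the sum through the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum)

# Example with arbitrary values: three inputs, three weights.
print(neuron([0.5, 0.3, 0.2], [0.4, -0.6, 0.9]))  # ≈ 0.55
```

A real network simply wires many of these neurons together, feeding each layer's outputs in as the next layer's inputs.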
Making Mistakes
Much like how no one is born knowing how to ride a bike, artificial neural networks must be taught how to process inputs correctly. There are two schools of thought regarding learning: supervised and unsupervised. Backpropagation is a common training method that can be used in both settings, though it is generally considered a supervised learning technique. A two-part process, it begins with error propagation. The network's output is compared to a desired output, and the error in the output is calculated by a loss function. This error is propagated backward through the network to determine each neuron's contribution, and these error values are then used to calculate the gradient of the loss function. The gradient is the slope, or "steepness," of the loss function; the lowest point of the loss function is where the error is smallest. The goal of the second part of the process is to find the input weights that lead to this minimum. Optimization methods, such as gradient descent, use the gradient to calculate new input weights for the network. The weights are updated with these new, "optimized" values, and the whole process repeats.
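To make that two-part loop concrete, here is a toy sketch that trains a single sigmoid neuron by gradient descent. It assumes a squared-error loss and made-up inputs, target, and learning rate, so treat it as an illustration of the idea rather than a real training setup:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy training pair: made-up inputs and a desired output of 1.0.
inputs, target = [0.5, 0.3], 1.0
weights = [0.4, -0.6]        # arbitrary starting weights
learning_rate = 0.5

for step in range(100):
    # Forward pass: weighted sum, then activation.
    z = sum(x * w for x, w in zip(inputs, weights))
    output = sigmoid(z)

    # Loss function: squared error between output and target.
    error = output - target
    loss = 0.5 * error ** 2

    # Backward pass: the chain rule gives the gradient of the loss
    # with respect to each weight.
    d_loss_d_z = error * output * (1.0 - output)  # dL/dz
    gradient = [d_loss_d_z * x for x in inputs]   # dL/dw_i

    # Gradient descent: step each weight against its gradient.
    weights = [w - learning_rate * g for w, g in zip(weights, gradient)]

print(weights)  # the weights have shifted so the output is closer to 1.0
```

In a multi-layer network, the same chain-rule computation is applied layer by layer, working backward from the output; that is the "backward" in backpropagation.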
3-D plot of a neuron's error with two inputs, courtesy Wikimedia Commons.
For a step-by-step example of backpropagation, check out this tutorial.
The Tip of the Iceberg
Machine learning and artificial neural networks are exciting tools that are already being woven into the technology we use every day, and the scope of their use and abilities is only growing. While there are still many unique and perplexing challenges to overcome, one day our interactions with them may become as casual and universal as smartphones are today.