By : Abhishek Hingne | On : May 18, 2017
The perceptron is a mathematical model of a biological neuron. Just as the dendrites of a neuron receive signals from the axons of other neurons, a perceptron receives inputs, and these electrical signals are represented by numerical values. In a neuron, the signals are modulated at the synapses between the dendrites and the axons; in a perceptron, this is modeled by multiplying each input value by another value called a weight.
The weighted sum of the input values represents the total strength of the input. A biological neuron fires only if this strength reaches a certain threshold, which can be modeled by applying a step function to the weighted sum to get the required output. As with biological neurons, the output of a perceptron can be fed to other perceptrons.
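The steps described above (weight each input, sum the results, and compare the sum with a threshold) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `perceptron_output` and the example inputs, weights, and threshold are hypothetical and do not come from any particular library.

```python
def perceptron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Each input signal is scaled by its weight, mimicking modulation at a synapse.
    strength = sum(x * w for x, w in zip(inputs, weights))
    # Step function: the unit "fires" only when the strength reaches the threshold.
    return 1 if strength >= threshold else 0


# Hypothetical values, for illustration only.
fired = perceptron_output([1.0, 0.5], [0.75, -1.0], threshold=0.0)   # strength = 0.25
silent = perceptron_output([1.0, 0.5], [0.25, -1.0], threshold=0.0)  # strength = -0.25
```

Here `fired` is 1 because the weighted sum is above the threshold, while `silent` is 0 because it falls below it.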
The perceptron forms the basic building block of neural networks. Consider the case where we have labeled data points in a plane that should be separated into two groups. An elementary approach to this kind of problem is to draw a line that separates the two groups.
In this scenario, the separator line functions as a linear classifier. Data points on one side of the separator line belong to one class, whereas those on the other side belong to the other.
Figure 1: Schematic of the structure of a neuron (Source: Google Images)
This linear classifier can be represented mathematically as:
$$ f(x) = w \cdot x + b $$
where x is the input vector, w is the weight vector, and b is the bias.
For this particular case, the activation function H(x) of the perceptron, which produces the output, can be represented as:
$$ H(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases} $$
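As a rough illustration of a linear classifier in the plane, the sketch below labels a few points according to which side of a separator line they fall on. The separator (the line x1 + x2 - 1 = 0), the sample points, and the function name `classify` are hypothetical choices made for this example. Treating a point that lies exactly on the line as class 0 is also an assumption.

```python
def classify(x, w, b):
    """Linear classifier: return 1 if w . x + b is positive, else 0."""
    activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Assumption: points exactly on the separator (activation == 0) get class 0.
    return 1 if activation > 0 else 0


# Hypothetical separator line: x1 + x2 - 1 = 0.
w = [1.0, 1.0]
b = -1.0

# Hypothetical data points in the plane.
points = [(2.0, 2.0), (0.0, 0.0), (1.0, 0.5), (0.25, 0.25)]
labels = [classify(p, w, b) for p in points]
```

Points above the line (such as (2, 2)) receive label 1 and points below it (such as the origin) receive label 0.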
In the previous section, we learned the fundamental working of perceptrons. In this section, we will go through the geometrical interpretation of what happens when a perceptron learns. Before that, however, we need some understanding of weight space.
Weight space is a high-dimensional space in which each point corresponds to a particular setting of all the weights. Each dimension of this space represents one particular weight of the perceptron.
Each point in weight space therefore corresponds to a weight vector, and each training case (each data point) defines a hyperplane in this space. To get the correct result for a particular training case, the weight vector must lie on the correct side of that hyperplane. We will describe this with an illustrative example, as shown in the figure.
Figure 2: Schematic of a perceptron (Source: Google Images)
The training case we consider first defines a plane, shown by the black line, which passes through the origin and is perpendicular to the input vector for that training case. The plane passes through the origin because the bias is treated here as just another weight. This training case belongs to class 1, so the weight vector must lie on the same side of the hyperplane as the input vector to give the correct result.
For any weight vector like the green one, the angle between the input vector and the weight vector is less than 90 degrees, so the scalar product is positive and we get the correct result. Conversely, a weight vector like the red one makes an obtuse angle with the input vector, so the scalar product is negative and we get the wrong answer.
Now consider a training case that belongs to class 0, as shown in the figure. For any weight vector like the green one, the angle between the input vector and the weight vector is greater than 90 degrees, so the scalar product is negative and we get the correct result.
Conversely, a weight vector like the red one makes an acute angle with the input vector, so the scalar product is positive and the perceptron will classify the case as 1, which is wrong.
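The link between the angle and the sign of the scalar product can be checked numerically. The following is a minimal sketch, assuming the bias has been folded into the weights so that the output depends only on the sign of the scalar product. The vectors named `green` and `red` are hypothetical stand-ins for the weight vectors in the figure, not values taken from it.

```python
import math


def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def angle_degrees(u, v):
    """Angle between two non-zero vectors, in degrees."""
    cosine = dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def is_correct(weight, input_vec, target):
    """True if the sign of the scalar product yields the target class (1 or 0)."""
    predicted = 1 if dot(weight, input_vec) > 0 else 0
    return predicted == target


# Hypothetical vectors, for illustration only.
input_vec = [1.0, 2.0]
green = [2.0, 1.0]    # acute angle with input_vec, scalar product = 4
red = [-1.0, -1.0]    # obtuse angle with input_vec, scalar product = -3
```

With these values, `green` is a good weight vector when the training case belongs to class 1 and a bad one when it belongs to class 0, and the opposite holds for `red`.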
Figure 3: Geometrical representation of a perceptron (Source: Lecture Slides on Neural Networks by Geoffrey Hinton)
Now, let's put both of these training cases in one picture of weight space. It is clear from the figure that there is a cone of possible weight vectors, and any weight vector that lies inside this feasible cone will give the correct answer for both cases. Such a feasible cone does not always exist.
So, what the learning algorithm needs to do is consider one training case at a time and move the weight vector in a direction that will eventually bring it inside this cone. All of this rests on one very important assumption: that such a feasible cone exists.
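One common way to move the weight vector toward the feasible cone is the standard perceptron learning rule: leave the weights alone when the output is correct, add the input vector when the output is 0 but the target is 1, and subtract it when the output is 1 but the target is 0. The text above does not spell out an update rule, so this sketch is an assumption about the intended algorithm. The function names, the training cases, and the `max_epochs` cap are hypothetical, and the bias is assumed to be folded into the weights.

```python
def predict(weights, x):
    """Output 1 if the scalar product of weights and input is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def train_perceptron(cases, n_weights, max_epochs=100):
    """Cycle through the training cases, one at a time, nudging the weights.

    Returns (weights, converged). converged is False if no error-free pass
    was found within max_epochs, e.g. when no feasible cone exists.
    """
    weights = [0.0] * n_weights
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in cases:
            error = target - predict(weights, x)  # -1, 0, or +1
            if error != 0:
                # Add the input for a missed class 1, subtract it for a missed class 0.
                weights = [w + error * xi for w, xi in zip(weights, x)]
                mistakes += 1
        if mistakes == 0:
            return weights, True  # the weight vector now lies in the feasible cone
    return weights, False


# Hypothetical training cases: (input vector, target class).
cases = [([1.0, 2.0], 1), ([2.0, 1.0], 0)]
weights, converged = train_perceptron(cases, n_weights=2)
```

The `max_epochs` cap is there because, when no feasible cone exists (for example, the same input labeled with both classes), the updates would otherwise cycle forever.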
Figure 4: Geometrical representation of a perceptron (Source: Lecture Slides on Neural Networks by Geoffrey Hinton)