I recently started reading
Michael Nielsen's "Neural Networks and Deep Learning".
I have heard really good things about the website and Nielsen's teaching style. Here are my
solutions to exercise 1.
Sigmoid Neurons Simulating Perceptrons
Part 1
Question
Suppose we take all the weights and biases in a network of perceptrons, and multiply them by a
positive constant, c>0. Show that the behavior of the network doesn't change.
Solution
The definition of a perceptron is:
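$$
\text{output} =
\begin{cases}
0 & \text{if } \sum_j w_j x_j \le \text{threshold} \\
1 & \text{if } \sum_j w_j x_j > \text{threshold}
\end{cases}
$$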
or more nicely with the bias:
$$
\text{output} =
\begin{cases}
0 & \text{if } b + \sum_j w_j x_j \le 0 \\
1 & \text{if } b + \sum_j w_j x_j > 0
\end{cases}
$$
Now if we multiply the bias and weights by the constant c, the result is:
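$$
\begin{aligned}
\begin{cases}
0 & \text{if } cb + \sum_j c w_j x_j \le 0 \\
1 & \text{if } cb + \sum_j c w_j x_j > 0
\end{cases}
&=
\begin{cases}
0 & \text{if } cb + c \sum_j w_j x_j \le 0 \\
1 & \text{if } cb + c \sum_j w_j x_j > 0
\end{cases} \\
&=
\begin{cases}
0 & \text{if } c \left( b + \sum_j w_j x_j \right) \le 0 \\
1 & \text{if } c \left( b + \sum_j w_j x_j \right) > 0
\end{cases} \\
&=
\begin{cases}
0 & \text{if } b + \sum_j w_j x_j \le 0 \\
1 & \text{if } b + \sum_j w_j x_j > 0
\end{cases}
\end{aligned}
$$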
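In the last step, dividing both sides of each inequality by c preserves its direction because c>0.
We end up back where we started, so multiplying by a constant c>0 does not affect the output of
any perceptron, and therefore does not affect the behavior of the network.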
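As a quick numerical sanity check of the argument above, here is a minimal sketch. The weights and bias are hypothetical values chosen only for illustration, and the helper names `perceptron` and `scaled_outputs_match` are my own, not from the book. It compares a single perceptron with its scaled copy on every binary input.

```python
import itertools

def perceptron(weights, bias, inputs):
    """Return 1 if b + sum_j w_j x_j > 0, otherwise 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def scaled_outputs_match(weights, bias, c):
    """Compare the perceptron with its c-scaled copy on every binary input."""
    scaled_weights = [c * w for w in weights]
    scaled_bias = c * bias
    for inputs in itertools.product([0, 1], repeat=len(weights)):
        original = perceptron(weights, bias, inputs)
        scaled = perceptron(scaled_weights, scaled_bias, inputs)
        if original != scaled:
            return False
    return True

# Hypothetical example values (assumption, not taken from the exercise).
weights = [-2.0, -2.0]
bias = 3.0

# Every positive c should leave the outputs unchanged.
print(all(scaled_outputs_match(weights, bias, c) for c in (0.5, 1.0, 7.0, 1000.0)))

# A negative c flips the inequalities, so the outputs no longer match.
print(scaled_outputs_match(weights, bias, -1.0))
```

The negative case is included only to show why the exercise requires c>0.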
Part 2
Question
Suppose we have the same setup as the last problem - a network of perceptrons. Suppose also that the
overall input to the network of perceptrons has been chosen. We won't need the actual input value,
we just need the input to have been fixed. Suppose the weights and biases are such that
w⋅x+b≠0 for the input x to any particular perceptron in the network. Now replace all the
perceptrons in the network by sigmoid neurons, and multiply the weights and biases by a positive
constant c>0. Show that in the limit as c→∞ the behavior of this network of sigmoid
neurons is the same as the network of perceptrons. How can this fail when w⋅x+b=0 for
one of the perceptrons?
Solution
Sigmoid Neuron Definition
$$
\frac{1}{1 + \exp\left(-\sum_j w_j x_j - b\right)} = \frac{1}{1 + \exp(-w \cdot x - b)}
$$
Assumptions
Assume that x is a fixed, unknown input with w⋅x+b≠0 for every neuron in the network.
We want to show that, for c>0, the sigmoid network behaves like the perceptron network as c→∞.
By the way, exp(x) ≡ e^x.
The Cases
$$
\frac{1}{1 + \exp(c(-w \cdot x - b))}
$$
Let's assume that w⋅x+b<0:
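$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))}
= \frac{1}{1 + \exp\left(\lim_{c \to \infty} c(-w \cdot x - b)\right)}
= \frac{1}{1 + \exp(\infty)}
= \frac{1}{1 + \infty}
= 0
$$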
Let's assume that w⋅x+b>0:
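$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))}
= \frac{1}{1 + \exp\left(\lim_{c \to \infty} c(-w \cdot x - b)\right)}
= \frac{1}{1 + \exp(-\infty)}
= \frac{1}{1 + 0}
= 1
$$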
This is exactly the behavior we would expect with a perceptron!
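To see the two limits numerically, here is a small sketch under assumptions of my own: the weights, bias, and inputs are hypothetical illustration values, and `scaled_sigmoid_neuron` is a helper name I made up. The sigmoid is written in a numerically stable form so that large values of c do not overflow `math.exp`.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)

def scaled_sigmoid_neuron(weights, bias, inputs, c):
    """Output of a sigmoid neuron whose weights and bias are multiplied by c."""
    z = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(c * z)

# Hypothetical example values (assumption, not taken from the exercise).
weights = [-2.0, -2.0]
bias = 3.0

# (0, 0) gives w.x + b = 3 > 0; (1, 1) gives w.x + b = -1 < 0.
for inputs in [(0, 0), (1, 1)]:
    for c in (1, 10, 100, 1000):
        print(inputs, c, scaled_sigmoid_neuron(weights, bias, inputs, c))
```

As c grows, the printed values for the first input should approach 1 and those for the second should approach 0, matching the two cases above.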
How Can This Fail?
If w⋅x+b=0 is true then:
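$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))}
= \frac{1}{1 + \exp\left(\lim_{c \to \infty} c(0)\right)}
= \frac{1}{1 + \exp(0)}
= \frac{1}{1 + 1}
= \frac{1}{2}
$$
$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))} = \frac{1}{2} \neq 0
$$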
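This does not match the expected behavior of a perceptron, which outputs 0 when w⋅x+b=0.
The sigmoid neuron instead stays at 1/2 no matter how large c becomes.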
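The failure case can also be checked numerically. This is a minimal sketch that assumes only the sigmoid definition given earlier; the list of c values is arbitrary.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return ez / (1.0 + ez)

# When w.x + b = 0, scaling by c leaves the sigmoid's argument at 0.
z = 0.0
outputs = [sigmoid(c * z) for c in (1, 10, 1e3, 1e6)]
print(outputs)

# A perceptron outputs 0 here, because b + sum_j w_j x_j <= 0 holds with equality.
perceptron_output = 1 if z > 0 else 0
print(perceptron_output)
```

Every entry of `outputs` is one half, so no choice of c brings the sigmoid neuron to the perceptron's output of 0.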