Weights and Biases
Neural Net Problems - Exercise 1
March 19, 2020

I recently started reading Michael Nielsen's "Neural Networks and Deep Learning". I have heard really good things about the website and Nielsen's teaching style. Here are my solutions to exercise 1.

Sigmoid Neurons Simulating Perceptrons

Part 1

Question

Suppose we take all the weights and biases in a network of perceptrons, and multiply them by a positive constant, c>0. Show that the behavior of the network doesn't change.

Solution

The definition of a perceptron is:

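$$
\text{output} =
\begin{cases}
0 & \text{if } \sum_j w_j x_j \leq \text{threshold} \\
1 & \text{if } \sum_j w_j x_j > \text{threshold}
\end{cases}
$$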

or more nicely with the bias:

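$$
\text{output} =
\begin{cases}
0 & \text{if } b + \sum_j w_j x_j \leq 0 \\
1 & \text{if } b + \sum_j w_j x_j > 0
\end{cases}
$$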

Now if we multiply the weights and the bias by the constant c, the result is:

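$$
\begin{aligned}
\text{output}
&= \begin{cases}
0 & \text{if } cb + \sum_j c w_j x_j \leq 0 \\
1 & \text{if } cb + \sum_j c w_j x_j > 0
\end{cases} \\
&= \begin{cases}
0 & \text{if } cb + c \sum_j w_j x_j \leq 0 \\
1 & \text{if } cb + c \sum_j w_j x_j > 0
\end{cases} \\
&= \begin{cases}
0 & \text{if } c \left( b + \sum_j w_j x_j \right) \leq 0 \\
1 & \text{if } c \left( b + \sum_j w_j x_j \right) > 0
\end{cases} \\
&= \begin{cases}
0 & \text{if } b + \sum_j w_j x_j \leq 0 \\
1 & \text{if } b + \sum_j w_j x_j > 0
\end{cases}
\end{aligned}
$$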

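The last step holds because c > 0: dividing both sides of each inequality by c does not change its direction. We end up back where we started, so every perceptron produces the same output as before, and multiplying all the weights and biases by a constant c > 0 does not affect the behavior of the network.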
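The argument can also be spot-checked numerically. The following is a minimal sketch, not part of the book's exercise: the function names `perceptron` and `scaled_outputs_match` are made up for illustration, and the weights, biases, and scale factors are random values chosen only for the test. It uses exact rational arithmetic so that floating-point rounding cannot affect the sign of the weighted sum, and it checks a single perceptron at a time. If every perceptron is unchanged, so is the network built from them.

```python
from fractions import Fraction
import random


def perceptron(weights, bias, inputs):
    """Return 1 if b + sum_j w_j x_j > 0, otherwise 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def scaled_outputs_match(trials=1000, seed=0):
    """Compare a perceptron with its copy scaled by a random c > 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 5)
        # Small integers keep the arithmetic exact and hit the
        # boundary case b + sum_j w_j x_j == 0 fairly often.
        weights = [rng.randint(-5, 5) for _ in range(n)]
        bias = rng.randint(-5, 5)
        inputs = [rng.randint(0, 1) for _ in range(n)]
        # A positive rational scale factor c.
        c = Fraction(rng.randint(1, 20), rng.randint(1, 20))
        scaled_weights = [c * w for w in weights]
        scaled_bias = c * bias
        if perceptron(weights, bias, inputs) != perceptron(
            scaled_weights, scaled_bias, inputs
        ):
            return False
    return True


print(scaled_outputs_match())
```

Trying a negative c in the same loop would flip the inequalities, which is why the question insists on c > 0.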

Part 2

Question

Suppose we have the same setup as the last problem - a network of perceptrons. Suppose also that the overall input to the network of perceptrons has been chosen. We won't need the actual input value, we just need the input to have been fixed. Suppose the weights and biases are such that w·x + b ≠ 0 for the input x to any particular perceptron in the network. Now replace all the perceptrons in the network by sigmoid neurons, and multiply the weights and biases by a positive constant c > 0. Show that in the limit as c → ∞ the behavior of this network of sigmoid neurons is exactly the same as the network of perceptrons. How can this fail when w·x + b = 0 for one of the perceptrons?

Solution

Sigmoid Neuron Definition
$$
\frac{1}{1 + \exp\left(-\sum_j w_j x_j - b\right)} = \frac{1}{1 + \exp(-w \cdot x - b)}
$$
Assumptions

Assume that x is a fixed but unknown input for which w·x + b ≠ 0.

We want to show that, after multiplying the weights and biases by c > 0, the sigmoid network behaves like the perceptron network as c → ∞.

By the way, exp(x) means e^x.

The Cases
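After multiplying the weights and the bias by c, the output of a sigmoid neuron becomes:

$$
\frac{1}{1 + \exp(c(-w \cdot x - b))}
$$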

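Let's assume that w·x + b < 0, so that the exponent c(−w·x − b) grows to +∞:

$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))}
= \frac{1}{1 + \exp\left(\lim_{c \to \infty} c(-w \cdot x - b)\right)}
= \frac{1}{1 + \exp(\infty)}
= \frac{1}{1 + \infty}
= 0
$$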

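Let's assume that w·x + b > 0, so that the exponent c(−w·x − b) falls to −∞:

$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))}
= \frac{1}{1 + \exp\left(\lim_{c \to \infty} c(-w \cdot x - b)\right)}
= \frac{1}{1 + \exp(-\infty)}
= \frac{1}{1 + 0}
= 1
$$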

This is exactly the behavior we would expect with a perceptron!
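To see the two limits numerically, here is a small sketch. The helper names `sigmoid`, `step`, and `scaled_sigmoid` are chosen for this example, and z stands in for the fixed value of w·x + b, with 0.5 and -0.5 picked arbitrarily. The sigmoid is written in a numerically stable form, because calling `math.exp` directly on a large positive argument raises `OverflowError`.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, exp(z) is small, so this form cannot overflow.
    e = math.exp(z)
    return e / (1.0 + e)


def step(z):
    """Perceptron output for a weighted input z = w.x + b."""
    return 1 if z > 0 else 0


def scaled_sigmoid(z, c):
    """Sigmoid neuron output after scaling weights and bias by c."""
    return sigmoid(c * z)


for z in (-0.5, 0.5):
    for c in (1, 10, 100, 1000):
        print(f"z={z:+.1f} c={c:<5} sigmoid(c*z)={scaled_sigmoid(z, c):.6f} step(z)={step(z)}")
```

As c grows, the printed sigmoid values should move toward the step value for the same z, from below 0.5 toward 0 when z is negative and from above 0.5 toward 1 when z is positive.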

How Can This Fail?

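If w·x + b = 0, then:

$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))}
= \frac{1}{1 + \exp\left(\lim_{c \to \infty} c \cdot 0\right)}
= \frac{1}{1 + \exp(0)}
= \frac{1}{1 + 1}
= \frac{1}{2}
$$

$$
\lim_{c \to \infty} \frac{1}{1 + \exp(c(-w \cdot x - b))} = \frac{1}{2} \neq 0
$$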

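This does not match the expected behavior of a perceptron, which outputs 0 when w·x + b = 0 (the "≤ 0" case of its definition). The sigmoid neuron's output stays at 1/2 no matter how large c becomes.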
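The failure case is easy to confirm with a few lines of code. This is only an illustrative sketch: `sigmoid` and `perceptron_output` are names made up here, and the values of c are arbitrary. With w·x + b = 0 the scaled input c·0 is always 0, so the sigmoid output never leaves 0.5.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def perceptron_output(z):
    """Perceptron output for a weighted input z = w.x + b."""
    return 1 if z > 0 else 0


z = 0.0  # the problematic case: w.x + b = 0
for c in (1, 10, 1000, 10**6):
    print(f"c={c:<8} sigmoid(c*z)={sigmoid(c * z)} perceptron={perceptron_output(z)}")
```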