Equations of Backpropagation
Neural Net Problems - Exercise 6
April 14, 2020

Here are my solutions to exercise 6.

The Backpropagation Algorithm

Part 1 - Backpropagation with a Single Modified Neuron

Question

Suppose we modify a single neuron in a feedforward network so that the output from the neuron is given by $f(\sum_j w_j x_j + b)$, where $f$ is some function other than the sigmoid. How should we modify the backpropagation algorithm in this case?

Solution

We first have to calculate the derivative $f'$ of the function $f$, since the error vectors $\delta^L$ and $\delta^l$ need it: in the component of the error that belongs to the modified neuron, $f'(z_j^l)$ takes the place of $\sigma'(z_j^l)$. Other than that, the algorithm does not need any tweaking. You might expect more changes because $\delta_j^l$ depends on $f$, but we defined $\delta_j^l = \frac{\partial C}{\partial z_j^l}$ rather than $\frac{\partial C}{\partial a_j^l}$, so the activation function enters the backward pass only through that single derivative factor. That choice makes our lives easier in this particular case!
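To make this concrete, here is a minimal sketch in plain Python. Everything in it is my own assumption for illustration: a hypothetical 2-2-1 network with made-up weights, a quadratic cost, and $\tanh$ standing in for $f$ on one hidden neuron. The only change to the backward pass is that each hidden neuron is paired with its own derivative, and a finite-difference check on the biases (using $\frac{\partial C}{\partial b_j^l} = \delta_j^l$) is included to see whether the result agrees.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def tanh_prime(z):
    return 1.0 - math.tanh(z) ** 2

# Hypothetical 2-2-1 network. Hidden neuron 0 is a normal sigmoid neuron,
# hidden neuron 1 is the "modified" neuron with f = tanh (an assumption).
HIDDEN_ACT = [(sigmoid, sigmoid_prime), (math.tanh, tanh_prime)]
W1 = [[0.3, -0.2], [0.5, 0.1]]  # W1[j][k]: weight from input k to hidden neuron j
B1 = [0.1, -0.4]
W2 = [0.7, -0.6]                # weights from the hidden neurons to the output neuron
B2 = 0.2

def forward(x, b1):
    z1 = [sum(w * xk for w, xk in zip(W1[j], x)) + b1[j] for j in range(2)]
    a1 = [HIDDEN_ACT[j][0](z1[j]) for j in range(2)]
    z2 = sum(w * a for w, a in zip(W2, a1)) + B2
    return z1, a1, z2, sigmoid(z2)

def cost(x, y, b1):
    # Quadratic cost for a single training example
    return 0.5 * (forward(x, b1)[3] - y) ** 2

def hidden_deltas(x, y):
    z1, a1, z2, a2 = forward(x, B1)
    # Output error: the output neuron is still a sigmoid
    delta_out = (a2 - y) * sigmoid_prime(z2)
    # Hidden error: each neuron uses its own derivative in place of sigma'
    return [W2[j] * delta_out * HIDDEN_ACT[j][1](z1[j]) for j in range(2)]

x, y = [1.0, 0.5], 1.0
deltas = hidden_deltas(x, y)

# Central differences on the hidden biases, since dC/db_j equals delta_j
eps = 1e-6
numeric = []
for j in range(2):
    up = list(B1)
    up[j] += eps
    down = list(B1)
    down[j] -= eps
    numeric.append((cost(x, y, up) - cost(x, y, down)) / (2 * eps))

print(deltas)
print(numeric)
```

Swapping `tanh` for any other differentiable function only requires changing the pair stored in `HIDDEN_ACT`; the rest of the backward pass stays the same.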

From Michael Nielsen:

You might wonder why the demon is changing the weighted input $z_j^l$. Surely it'd be more natural to imagine the demon changing the output activation $a_j^l$, with the result that we'd be using $\frac{\partial C}{\partial a_j^l}$ as our measure of error. In fact, if you do this things work out quite similarly to the discussion below. But it turns out to make the presentation of backpropagation a little more algebraically complicated. So we'll stick with $\delta_j^l = \frac{\partial C}{\partial z_j^l}$ as our measure of error.

Part 2 - Backpropagation with Linear Neurons

Question

Suppose we replace the usual non-linear $\sigma$ function with $\sigma(z) = z$ throughout the network. Rewrite the backpropagation algorithm for this case.

Solution

Since $\sigma(z) = z$, we have $\sigma'(z) = 1$, so it follows that $\delta^L = \nabla_a C \circ \sigma'(z^L) = \nabla_a C$. Also, $\delta^l = ((w^{l+1})^T \delta^{l+1}) \circ \sigma'(z^l) = (w^{l+1})^T \delta^{l+1}$.
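Putting the two simplified equations together, the whole algorithm can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: a quadratic cost, so that $\nabla_a C = a^L - y$, nested lists for the weight matrices, and hypothetical helper names (`matvec`, `transpose_matvec`, `backprop_linear`). The gradient equations $\frac{\partial C}{\partial b_j^l} = \delta_j^l$ and $\frac{\partial C}{\partial w_{jk}^l} = a_k^{l-1} \delta_j^l$ are used unchanged.

```python
def matvec(w, v):
    # (w v)_j = sum_k w[j][k] * v[k]
    return [sum(wjk * vk for wjk, vk in zip(row, v)) for row in w]

def transpose_matvec(w, v):
    # (w^T v)_k = sum_j w[j][k] * v[j]
    return [sum(w[j][k] * v[j] for j in range(len(w))) for k in range(len(w[0]))]

def forward(weights, biases, x):
    activations = [x]
    for w, b in zip(weights, biases):
        z = [s + bj for s, bj in zip(matvec(w, activations[-1]), b)]
        activations.append(z)  # sigma(z) = z, so a = z
    return activations

def backprop_linear(weights, biases, x, y):
    activations = forward(weights, biases, x)
    # delta^L = grad_a C, which is a^L - y for the (assumed) quadratic cost
    delta = [a - t for a, t in zip(activations[-1], y)]
    grads_w, grads_b = [], []
    for l in range(len(weights) - 1, -1, -1):
        a_prev = activations[l]
        grads_b.insert(0, delta)                                        # dC/db = delta
        grads_w.insert(0, [[dj * ak for ak in a_prev] for dj in delta])  # dC/dw_jk = a_k * delta_j
        if l > 0:
            # delta^l = (w^{l+1})^T delta^{l+1}, with no sigma' factor
            delta = transpose_matvec(weights[l], delta)
    return grads_w, grads_b

# Made-up 2-2-1 example
weights = [[[1.0, 2.0], [0.0, -1.0]], [[0.5, 1.5]]]
biases = [[0.0, 1.0], [-0.5]]
grads_w, grads_b = backprop_linear(weights, biases, [1.0, 2.0], [1.0])
print(grads_w)
print(grads_b)
```

Note that the backward pass never touches the weighted inputs $z^l$: with $\sigma'(z) = 1$ everywhere, the error is simply pushed back through the transposed weight matrices.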