Here are my solutions to
exercise 4.
Implementing Our Network to Classify Digits
Part 1
Question
Write out $a' = \sigma(wa + b)$ in component form, and verify that it gives the same result as the rule
$\frac{1}{1+\exp(-\sum_j w_j x_j - b)}$ for computing the output of a sigmoid neuron.
Solution
I should note that I have not yet taken linear algebra in college, so I have very limited
experience with it.
Let us say that layer 2 has 2 neurons and layer 1 has 3 neurons.
The weights from layer 1 to layer 2 can be expressed as the following matrix ($w_{ji}$, where $j$ is the
neuron in the second layer and $i$ is the neuron in the first layer):

$$w = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix} \qquad a = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \qquad b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$

$$wa = \begin{bmatrix} w_{11}a_1 + w_{12}a_2 + w_{13}a_3 \\ w_{21}a_1 + w_{22}a_2 + w_{23}a_3 \end{bmatrix}$$

$$wa + b = \begin{bmatrix} (w_{11}a_1 + w_{12}a_2 + w_{13}a_3) + b_1 \\ (w_{21}a_1 + w_{22}a_2 + w_{23}a_3) + b_2 \end{bmatrix}$$

$$a' = \sigma(wa + b) = \begin{bmatrix} \sigma((w_{11}a_1 + w_{12}a_2 + w_{13}a_3) + b_1) \\ \sigma((w_{21}a_1 + w_{22}a_2 + w_{23}a_3) + b_2) \end{bmatrix}$$

This is exactly the same as $\frac{1}{1+\exp(-\sum_j w_j x_j - b)}$, but computed for both neurons
at once with matrices!
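We can also check this claim numerically. The sketch below uses made-up weights, activations, and biases (any values would do) and compares the matrix form against the per-neuron sigmoid rule:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up example values: 3 neurons in layer 1, 2 neurons in layer 2.
w = np.array([[0.2, -0.5, 0.1],    # row j holds the weights w_j1, w_j2, w_j3
              [0.7,  0.3, -0.4]])
a = np.array([0.5, 0.9, -0.2])
b = np.array([0.1, -0.3])

# Matrix form: a' = sigma(wa + b), both outputs computed at once.
a_prime = sigmoid(w @ a + b)

# Component form: the sigmoid-neuron rule 1 / (1 + exp(-sum_j w_j a_j - b)),
# applied to each layer-2 neuron separately.
a_component = np.array([
    1.0 / (1.0 + np.exp(-sum(w[j, i] * a[i] for i in range(3)) - b[j]))
    for j in range(2)
])

print(np.allclose(a_prime, a_component))  # prints True
```

The two vectors agree, which is exactly the claim: the matrix expression is just the per-neuron rule stacked row by row.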
Say we wanted to compute the output of the first sigmoid neuron in layer 2 (so $w_j = w_{1j}$ and $b = b_1$):

$$a'_1 = \sigma\Big(\sum_j w_j a_j + b\Big) = \sigma((w_1 a_1 + w_2 a_2 + w_3 a_3) + b)$$

Also, I wanted to note that I found this website called the
ml cheatsheet, and it has been
really useful in describing the mathematical concepts.
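The single-neuron computation can be sketched the same way; the values of `w1`, `a`, and `b1` below are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up values: weights into the first layer-2 neuron, layer-1 activations, bias.
w1 = np.array([0.2, -0.5, 0.1])   # w_1, w_2, w_3
a = np.array([0.5, 0.9, -0.2])    # a_1, a_2, a_3
b1 = 0.1

# a'_1 = sigma(sum_j w_j a_j + b) as a dot product...
out_dot = sigmoid(np.dot(w1, a) + b1)

# ...and written out term by term.
out_terms = sigmoid((w1[0] * a[0] + w1[1] * a[1] + w1[2] * a[2]) + b1)

print(np.isclose(out_dot, out_terms))  # prints True
```

The dot product is just the sum $\sum_j w_j a_j$, so the two expressions are the same computation.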
The header image was taken from
Khan Academy.