Matrix Product
Neural Net Problems - Exercise 4
March 26, 2020

Here are my solutions to exercise 4.

Implementing Our Network to Classify Digits

Part 1


Write out $a'=\sigma(wa+b)$ in component form, and verify that it gives the same result as the rule, $\frac{1}{1+\exp(- \sum_j w_j x_j - b)}$, for computing the output of a sigmoid neuron.


Let it be stated that I have not yet taken linear algebra at college, so I have very limited experience with it.

Let us say that layer 2 has $2$ nodes and layer 1 has $3$ nodes.

The weights from layer 1 to layer 2 can be expressed as the following ($w_{ji}$, where $j$ is the neuron in the second layer and $i$ is the neuron in the first layer):

$$ w = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix}, \qquad a = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} $$

$$ \begin{gathered} wa = \begin{bmatrix} w_{11} a_1 + w_{12} a_2 + w_{13} a_3 \\ w_{21} a_1 + w_{22} a_2 + w_{23} a_3 \end{bmatrix} \\ wa + b = \begin{bmatrix} (w_{11} a_1 + w_{12} a_2 + w_{13} a_3) + b_1 \\ (w_{21} a_1 + w_{22} a_2 + w_{23} a_3) + b_2 \end{bmatrix} \\ a' = \sigma(wa + b) = \begin{bmatrix} \sigma((w_{11} a_1 + w_{12} a_2 + w_{13} a_3) + b_1) \\ \sigma((w_{21} a_1 + w_{22} a_2 + w_{23} a_3) + b_2) \end{bmatrix} \end{gathered} $$

This is exactly the same as $\frac{1}{1+\exp(- \sum_j w_j x_j - b)}$ but computed for both neurons at once with matrices!
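
Nothing here depends on the layer sizes being $2$ and $3$. As a sketch of the general pattern, assuming the same index convention as above ($j$ for the neuron in the second layer, $i$ for the neuron in the first layer), each component of $a'$ can be written as:

$$ a'_j = \sigma\left(\sum_i w_{ji} a_i + b_j\right) $$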

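Say we wanted to compute the output of the first sigmoid neuron in layer 2. Its weights are the first row of $w$, so $w_j = w_{1j}$; its inputs are $x_j = a_j$; and its bias is $b = b_1$.

$$ a_1' = \sigma\left(\sum_j w_{1j} a_j + b_1\right) = \sigma((w_{11} a_1 + w_{12} a_2 + w_{13} a_3) + b_1) $$

Since $\sigma(z) = \frac{1}{1+\exp(-z)}$, this equals $\frac{1}{1+\exp(- \sum_j w_{1j} a_j - b_1)}$, which is the single-neuron rule and also the first row of $\sigma(wa + b)$.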
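
To double-check the algebra numerically, here is a minimal sketch in plain Python. The function names (`sigmoid`, `layer_output`, `neuron_output`) and the numbers in `w`, `a`, and `b` are made up for illustration; they are not from the book. It computes the layer output once using the matrix form and once using the single-neuron rule, so the two can be compared.

```python
import math


def sigmoid(z):
    """Logistic function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(w, a, b):
    """Matrix form: sigmoid of (row j of w dotted with a, plus b[j]) for every j."""
    return [
        sigmoid(sum(w_ji * a_i for w_ji, a_i in zip(row, a)) + b_j)
        for row, b_j in zip(w, b)
    ]


def neuron_output(weights, x, bias):
    """Single-neuron rule: 1 / (1 + exp(-sum_j w_j x_j - b))."""
    weighted_sum = sum(w_j * x_j for w_j, x_j in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-weighted_sum - bias))


# Hypothetical example values: 3 neurons in layer 1, 2 neurons in layer 2.
w = [[0.2, -0.5, 0.1],
     [0.7, 0.3, -0.4]]
a = [1.0, 0.5, -1.5]
b = [0.1, -0.2]

# All of layer 2 at once, using the matrix form.
matrix_form = layer_output(w, a, b)

# One neuron at a time, using row j of w and entry j of b.
component_form = [neuron_output(w[j], a, b[j]) for j in range(len(b))]

print(matrix_form)
print(component_form)
```

The two printed lists should agree up to floating-point rounding, which is the same statement as the derivation above: row $j$ of $w$ and entry $j$ of $b$ play the roles of $w_j$ and $b$ in the single-neuron formula.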

Also, I wanted to note that I found a website called the ML Cheatsheet, and it has been really useful for explaining the mathematical concepts.

The header image was taken from Khan Academy.