Here are my solutions to exercise 4.

## Implementing Our Network to Classify Digits

### Part 1

#### Question

Write out $a'=\sigma(wa+b)$ in component form, and verify that it gives the same result as the rule, $\frac{1}{1+\exp(- \sum_j w_j x_j - b)}$, for computing the output of a sigmoid neuron.

#### Solution

A caveat up front: I have not yet taken linear algebra in college, so my experience with it is very limited.

Let us say that layer 2 has $2$ nodes and layer 1 has $3$ nodes.

The weights from layer 1 to layer 2 can be written as a matrix $w$ with entries $w_{ji}$, where $j$ indexes the neuron in the second layer and $i$ indexes the neuron in the first layer. Alongside it are the layer 1 activations $a$ and the layer 2 biases $b$:

$$
w = \begin{bmatrix}
w_{11} & w_{12} & w_{13}\\
w_{21} & w_{22} & w_{23}
\end{bmatrix}, \quad
a = \begin{bmatrix} a_1\\ a_2\\ a_3 \end{bmatrix}, \quad
b = \begin{bmatrix} b_1\\ b_2 \end{bmatrix}
$$

$$
\begin{gathered}
wa = \begin{bmatrix}
w_{11} a_1 + w_{12} a_2 + w_{13} a_3\\
w_{21} a_1 + w_{22} a_2 + w_{23} a_3
\end{bmatrix}\\
wa + b = \begin{bmatrix}
(w_{11} a_1 + w_{12} a_2 + w_{13} a_3) + b_1\\
(w_{21} a_1 + w_{22} a_2 + w_{23} a_3) + b_2
\end{bmatrix}\\
a' = \sigma(wa + b) = \begin{bmatrix}
\sigma((w_{11} a_1 + w_{12} a_2 + w_{13} a_3) + b_1)\\
\sigma((w_{21} a_1 + w_{22} a_2 + w_{23} a_3) + b_2)
\end{bmatrix}
\end{gathered}
$$

Each entry of $a'$ is exactly the single-neuron rule $\frac{1}{1+\exp(- \sum_j w_j x_j - b)}$, applied with that neuron's own weights and bias, but computed for both neurons at once with matrices!
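The $2 \times 3$ example is only one case. As a sketch of the general component form (my own notation, assuming the same $w_{ji}$ indexing and a first layer with any number of neurons), the $j$-th entry of $a'$ can be written as:

$$
a'_j = \sigma\left(\sum_i w_{ji} a_i + b_j\right) = \frac{1}{1+\exp\left(- \sum_i w_{ji} a_i - b_j\right)}
$$

Reading $w_{ji}$ as the weights $w_i$ of neuron $j$, $a_i$ as its inputs $x_i$, and $b_j$ as its bias $b$ gives back the single-neuron rule.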

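Say we wanted to compute the output of the first sigmoid neuron in layer 2 using the single-neuron rule. Its weights are the first row of $w$ (so $w_j = w_{1j}$), its inputs are the layer 1 activations (so $x_j = a_j$), and its bias is $b = b_1$:

$$ a_1' = \sigma\left(\sum_j w_{1j} a_j + b_1\right) = \sigma((w_{11} a_1 + w_{12} a_2 + w_{13} a_3) + b_1) $$

This matches the first entry of $\sigma(wa + b)$, and the same argument works for the second neuron.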
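To double-check the algebra numerically, here is a minimal sketch in plain Python (no NumPy, so the component form stays visible). The function names `sigmoid`, `layer_output`, and `neuron_output` and all the numbers are made up for illustration; they are not from the book's code.

```python
import math


def sigmoid(z):
    """The sigmoid function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(w, a, b):
    """Matrix rule a' = sigma(w a + b), written out with plain lists."""
    # Each row of w holds the weights going into one layer 2 neuron.
    wa = [sum(w_ji * a_i for w_ji, a_i in zip(row, a)) for row in w]
    return [sigmoid(wa_j + b_j) for wa_j, b_j in zip(wa, b)]


def neuron_output(weights, x, bias):
    """Single-neuron rule 1 / (1 + exp(-sum_j w_j x_j - b))."""
    weighted_sum = sum(w_j * x_j for w_j, x_j in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-weighted_sum - bias))


# Arbitrary illustrative values: 3 neurons in layer 1, 2 neurons in layer 2.
w = [[0.2, -0.5, 1.0],
     [1.5, 0.3, -0.7]]
a = [0.6, 0.1, 0.9]
b = [0.4, -1.2]

matrix_result = layer_output(w, a, b)
per_neuron = [neuron_output(w[j], a, b[j]) for j in range(len(w))]

# Both approaches should agree for every neuron in layer 2.
agree = all(math.isclose(m, p) for m, p in zip(matrix_result, per_neuron))
print(agree)
```

If the derivation is right, `agree` should come out as `True` for any choice of weights, activations, and biases, not just the ones picked here.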

Also, I wanted to note that I found a website called the ML Cheatsheet, and it has been really useful for explaining the mathematical concepts.

The header image was taken from Khan Academy.