
Why would the dimension of $W^{[2]}$ be $(n^{[2]}, n^{[1]})$?

This is a simple linear equation: $z^{[n]} = W^{[n]}a^{[n-1]} + b^{[n]}$

There seems to be an error in the screenshot: the weight matrix $W$ should be transposed. Please correct me if I am wrong.

$W^{[2]}$ is the matrix of weights assigned to the neurons in layer 2

$n^{[1]}$ is the number of neurons in layer 1

Screenshot from Andrew Ng's Deep Learning Coursera course video:

[Screenshot: backpropagation algorithm]


1 Answer


There seems to be an error in the screenshot: the weight matrix $W$ should be transposed. Please correct me if I am wrong.

You are wrong.

Matrix multiplication works so that if you multiply two matrices together, $C = AB$, where $A$ is an $i \times j$ matrix and $B$ is a $j \times k$ matrix, then $C$ will be an $i \times k$ matrix. Note that $A$'s column count must equal $B$'s row count ($j$).
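As a quick sanity check of that shape rule, here is a minimal NumPy sketch; the sizes $i$, $j$, $k$ below are arbitrary example values, not from the course:

```python
import numpy as np

i, j, k = 4, 3, 2           # arbitrary example sizes

A = np.random.randn(i, j)   # shape (4, 3)
B = np.random.randn(j, k)   # shape (3, 2)

C = A @ B                   # inner dimensions (j) must match
print(C.shape)              # (4, 2), i.e. i x k
```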

In the neural network, $a^{[1]}$ is an $n^{[1]} \times 1$ matrix (column vector), and $z^{[2]}$ needs to be an $n^{[2]} \times 1$ matrix, to match the number of neurons in each layer.

Therefore $W^{[2]}$ has to have dimensions $n^{[2]} \times n^{[1]}$ in order to produce an $n^{[2]} \times 1$ matrix from the product $W^{[2]}a^{[1]}$.
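Putting that together, here is a minimal NumPy sketch of the layer-2 computation; the layer sizes are hypothetical, chosen only to show the shapes:

```python
import numpy as np

n1, n2 = 3, 5                   # hypothetical layer sizes

a1 = np.random.randn(n1, 1)     # activations from layer 1: (n1, 1)
W2 = np.random.randn(n2, n1)    # weights for layer 2:      (n2, n1)
b2 = np.random.randn(n2, 1)     # biases for layer 2:       (n2, 1)

z2 = W2 @ a1 + b2               # (n2, n1) @ (n1, 1) -> (n2, 1)
print(z2.shape)                 # (5, 1): one pre-activation per layer-2 neuron
```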

  • Neil, would you mind taking another look at https://datascience.stackexchange.com/questions/23486/proper-derivation-of-dz1-expression-for-backpropagation-algorithm ? – kevin Oct 04 '17 at 13:52
  • It's helpful to think of the weight matrix, W, as an adjacency matrix for a directed graph between layers. Therefore, as @Neil Slater says, it's an n[next layer] × n[current layer] matrix (see the sketch below). – steviesh May 04 '18 at 21:29
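To illustrate the adjacency-matrix view from the last comment, a small sketch with made-up layer sizes: row $i$ of $W^{[2]}$ holds the weights of all edges coming into neuron $i$ of layer 2 from the layer-1 neurons.

```python
import numpy as np

n1, n2 = 3, 5                    # hypothetical layer sizes

W2 = np.random.randn(n2, n1)     # rows index layer-2 neurons, columns index layer-1 neurons

# W2[i, j] is the weight on the directed edge from neuron j (layer 1)
# to neuron i (layer 2); row i collects all of neuron i's incoming weights.
i = 4
print(W2[i])        # the n1 incoming edge weights for layer-2 neuron i
print(W2[i].shape)  # (3,)
```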