
I want to implement a neural network from scratch to solve linear regression using backpropagation. I don't understand how to compute the gradient of the MSE cost function with respect to each weight.

The formulas I have say that:

for each weight $w_{jk}^l$, the gradient is $\frac{∂C}{∂z_j^l}a_k^{l-1}$, where $j, k$ index the rows and columns of the weight matrix $W^l$,
$l$ is the layer,
$z_j^l = \sum_{k=1}^m w_{jk}^l a_k^{l-1}+b_j^l$,
$m$ is the number of neurons in layer $l-1$,
$C$ is the cost function, in this case MSE,
$a_k^{l-1}$ is the activation (output) of neuron $k$ in layer $l-1$,
and for each bias $b_j^l$, gradient is $\frac{∂C}{∂z_j^l}$
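
To make my understanding concrete, here is a minimal NumPy sketch of how I read these formulas for the simplest case: a single linear layer with identity activation (so $\frac{∂C}{∂z_j^l} = \frac{∂C}{∂a_j^l}$), with a numerical gradient check. The variable names and shapes are my own choices, not from any reference:

```python
import numpy as np

# Toy setup: one linear layer, identity activation, MSE cost.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))   # inputs a^{l-1}: 3 features, 5 samples (columns)
y = rng.normal(size=(1, 5))   # targets
W = rng.normal(size=(1, 3))   # weight matrix W^l, entry W[j, k] = w_jk
b = np.zeros((1, 1))          # bias b^l

def forward(W, b, X):
    return W @ X + b          # z_j = sum_k w_jk a_k + b_j

def mse(a, y):
    return np.mean((a - y) ** 2)

# delta = dC/dz. With identity activation, dC/dz = dC/da = 2(a - y)/n.
a = forward(W, b, X)
n = y.shape[1]
delta = 2 * (a - y) / n

dW = delta @ X.T                          # dC/dw_jk = (dC/dz_j) * a_k^{l-1}
db = delta.sum(axis=1, keepdims=True)     # dC/db_j  = dC/dz_j

# Numerical check on one weight to confirm the formula.
eps = 1e-6
W_pert = W.copy()
W_pert[0, 0] += eps
num_grad = (mse(forward(W_pert, b, X), y) - mse(a, y)) / eps
print(abs(num_grad - dW[0, 0]) < 1e-4)  # → True
```

The gradient over a batch is the sum of the per-sample outer products $\frac{∂C}{∂z_j^l} a_k^{l-1}$, which is exactly what the matrix product `delta @ X.T` computes.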
