I have some problems computing the derivative of the sum-of-squares error in a backpropagation neural network. For example, we have a neural network as in the picture. For drawing simplicity, I've dropped the sample indices.

Conventions:

  1. x - input from the data set.
  2. W - weight matrix.
  3. v - vector of the product W*x.
  4. F - vector of activation functions.
  5. y - vector of activated data.
  6. D - vector of answers (targets).
  7. e - error signal.
  8. A lower index on a variable denotes its dimensionality (e.g. NxN).
  9. A higher [index] denotes the layer number.

[Figure: structure of the neural network]

The Jacobian is defined as:

\begin{pmatrix}\frac{\partial f_1}{\partial x_1}&\dots&\frac{\partial f_1}{\partial x_N}\\\vdots&\ddots&\vdots\\\frac{\partial f_M}{\partial x_1}&\dots&\frac{\partial f_M}{\partial x_N}\end{pmatrix}
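
Since F applies a scalar activation componentwise (an assumption from the picture), its Jacobian is the diagonal matrix

\frac{\partial y}{\partial v}=\operatorname{diag}\big(F'(v)\big)=\begin{pmatrix}f'(v_1)&&0\\&\ddots&\\0&&f'(v_N)\end{pmatrix}

which is used below for the factor \frac{\partial y^{[2]}}{\partial v^{[2]}}.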

Let's have a look at the second layer and find the derivative using the chain rule.

The optimization error is the sum of squared errors:

E=\sum_{i=1}^{N}\big(D_i-y_i^{[2]}\big)^2=e^\top e,\qquad e=D-y^{[2]}

The chain rule for the second layer is:

\frac{\partial E}{\partial W^{[2]}}=\frac{\partial E}{\partial e}\,\frac{\partial e}{\partial y^{[2]}}\,\frac{\partial y^{[2]}}{\partial v^{[2]}}\,\frac{\partial v^{[2]}}{\partial W^{[2]}}
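
Here the second-layer quantities are composed as follows (I take x^{[2]}=y^{[1]}, the output of the first layer, as in the picture):

v^{[2]}=W^{[2]}x^{[2]},\qquad y^{[2]}=F\big(v^{[2]}\big),\qquad e=D-y^{[2]},\qquad E=e^\top e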

I'm going to differentiate according to the rules found here: https://fmin.xyz/docs/theory/Matrix_calculus/

\frac{\partial E}{\partial e}=2e^\top - 1xN dimension

\frac{\partial e}{\partial y^{[2]}}=-I_N - NxN matrix

\frac{\partial y^{[2]}}{\partial v^{[2]}}=\operatorname{diag}\big(F'(v^{[2]})\big) - NxN matrix (look at the Jacobian above)

\frac{\partial v^{[2]}}{\partial W^{[2]}}=\big(x^{[2]}\big)^\top\otimes I_N - NxN^2 tensor (Kronecker) product in general; for N=2 this is the 2x4 case (try it yourself for 2x2, worked out below)
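
Working the 2x2 case explicitly (I use column-major vectorization of the weight matrix, stacking its columns; the layout choice is my own assumption):

v=Wx,\qquad \operatorname{vec}(W)=\begin{pmatrix}w_{11}\\w_{21}\\w_{12}\\w_{22}\end{pmatrix},\qquad \frac{\partial v}{\partial \operatorname{vec}(W)}=x^\top\otimes I_2=\begin{pmatrix}x_1&0&x_2&0\\0&x_1&0&x_2\end{pmatrix}

Multiplying the last matrix by \operatorname{vec}(W) indeed gives back v=Wx.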

So the differential is:

\frac{\partial E}{\partial W^{[2]}}=2e^\top\big(-I_N\big)\operatorname{diag}\big(F'(v^{[2]})\big)\Big(\big(x^{[2]}\big)^\top\otimes I_N\Big)
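
By the Kronecker mixed-product rule (A\otimes B)(C\otimes D)=AC\otimes BD, this collapses to a single 1xN^2 row vector:

\frac{\partial E}{\partial W^{[2]}}=-2\,\big(x^{[2]}\big)^\top\otimes\big(e\odot F'(v^{[2]})\big)^\top

where \odot is the elementwise product.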

(You can verify the equations with this service, http://www.matrixcalculus.org/, or by hand in the 2x2 case.)

Let's look at the dimensions: (1xN) x (NxN) x (NxN) x (NxN^2) = (1xN^2)
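
A minimal numpy sketch of this dimension check (the sigmoid activation and the column-major vec layout are arbitrary choices, just for illustration):

    import numpy as np

    N = 3
    rng = np.random.default_rng(0)

    x = rng.standard_normal(N)        # layer input x^[2]
    W = rng.standard_normal((N, N))   # weight matrix W^[2]
    D = rng.standard_normal(N)        # vector of answers

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    v = W @ x       # product W*x, shape (N,)
    y = sigmoid(v)  # activated data, shape (N,)
    e = D - y       # error signal, shape (N,)

    dE_de = (2 * e).reshape(1, N)                # 1 x N
    de_dy = -np.eye(N)                           # N x N
    dy_dv = np.diag(y * (1.0 - y))               # N x N (diagonal Jacobian)
    dv_dW = np.kron(x.reshape(1, N), np.eye(N))  # N x N^2, i.e. x^T (kron) I_N

    grad = dE_de @ de_dy @ dy_dv @ dv_dW
    print(grad.shape)  # (1, 9) = (1, N^2): a row vector
    print(W.shape)     # (3, 3) = (N, N): does not match the update below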

When we try to use gradient descent (with learning rate \eta):

W^{[2]}\leftarrow W^{[2]}-\eta\,\frac{\partial E}{\partial W^{[2]}}

  • The dimensions will not be the same: (NxN) = (NxN) - (1xN^2)

Therefore, we cannot update the coefficients this way.

Where did I make a mistake? Any comments are welcome. Thanks!
