
We first have a D-dimensional weight vector $w$ and D-dimensional predictor vectors $x$, both indexed by $j$. There are $N$ observations, each D-dimensional. $t$ holds our targets, i.e., the ground-truth values. We then derive the cost function as follows:

$$\varepsilon = \frac{1}{2N}\sum_{n=1}^{N}\left(\sum_{j=1}^{D} w_j\, x_j^{(n)} - t^{(n)}\right)^2$$

We then compute the partial derivative of $\varepsilon$ with respect to $w_j$:

$$\frac{\partial \varepsilon}{\partial w_j} = \frac{1}{N}\sum_{n=1}^{N}\left(\sum_{j'=1}^{D} w_{j'}\, x_{j'}^{(n)} - t^{(n)}\right) x_j^{(n)}$$

I'm confused about where the $j'$ comes from and what it represents.
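To make the formula concrete, here is a small numerical sketch. It assumes the usual squared-error cost $\varepsilon = \frac{1}{2N}\sum_n (\sum_j w_j x_j^{(n)} - t^{(n)})^2$ (the handout's constant factor may differ), and random toy data of my own choosing. It computes the gradient with the double loop over $j$ and the dummy index $j'$, then checks it against a finite-difference estimate:

```python
import numpy as np

# Toy data (hypothetical, not from the handout): N observations, D features.
rng = np.random.default_rng(1)
N, D = 6, 4
X = rng.normal(size=(N, D))   # row n is the predictor vector x^(n)
t = rng.normal(size=N)
w = rng.normal(size=D)

def eps(w):
    # Assumed cost: (1/(2N)) * sum_n (prediction_n - t_n)^2
    return 0.5 * np.mean((X @ w - t) ** 2)

# d eps / d w_j = (1/N) sum_n ( sum_{j'} w_{j'} x_{j'}^(n) - t^(n) ) x_j^(n).
# j' is only a dummy summation index: the prediction sum over features had to
# be renamed from j to j' so it doesn't clash with the j we differentiate by.
grad = np.zeros(D)
for j in range(D):
    for n in range(N):
        pred_n = sum(w[jp] * X[n, jp] for jp in range(D))  # jp plays j'
        grad[j] += (pred_n - t[n]) * X[n, j] / N

# Central finite-difference check of the analytic gradient.
h = 1e-6
fd = np.array([(eps(w + h * np.eye(D)[j]) - eps(w - h * np.eye(D)[j])) / (2 * h)
               for j in range(D)])
print(np.allclose(grad, fd, atol=1e-5))  # True
```

The key point the loop makes explicit: for a fixed $j$, the prediction still sums over *all* features, and that inner sum needs its own index name, $j'$.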

We then write it as:

$$\frac{\partial \varepsilon}{\partial w_j} = \sum_{j'=1}^{D} A_{jj'}\, w_{j'} - c_j, \qquad A_{jj'} = \frac{1}{N}\sum_{n=1}^{N} x_j^{(n)} x_{j'}^{(n)}, \qquad c_j = \frac{1}{N}\sum_{n=1}^{N} t^{(n)} x_j^{(n)}$$

Then, we vectorize it as:

$$\nabla_{\mathbf{w}}\,\varepsilon = \mathbf{A}\mathbf{w} - \mathbf{c}, \qquad \mathbf{A} = \frac{1}{N}\mathbf{X}^\top \mathbf{X}, \qquad \mathbf{c} = \frac{1}{N}\mathbf{X}^\top \mathbf{t}$$

I'm confused about the derivation of the vectorized $A$ from $A_{jj'}$, likely because I don't know what $j'$ is. How would the vectorization go, in terms of steps and intuition?
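For the vectorization itself, a quick numerical sketch may help. It assumes $A_{jj'} = \frac{1}{N}\sum_n x_j^{(n)} x_{j'}^{(n)}$ with the rows of $\mathbf{X}$ holding the observations (an assumption about the handout's conventions), and checks that the elementwise definition matches $\mathbf{A} = \frac{1}{N}\mathbf{X}^\top\mathbf{X}$:

```python
import numpy as np

# Hypothetical small problem sizes, not taken from the handout.
rng = np.random.default_rng(0)
N, D = 5, 3
X = rng.normal(size=(N, D))   # row n holds the predictor vector x^(n)

# Elementwise definition: A_{jj'} = (1/N) sum_n x_j^(n) * x_{j'}^(n).
# j indexes the row of A; j' is a second, independent index over the same
# D features and picks out the column.
A_elem = np.zeros((D, D))
for j in range(D):
    for jp in range(D):               # jp plays the role of j'
        A_elem[j, jp] = np.mean(X[:, j] * X[:, jp])

# Vectorized form: each entry of X^T X is exactly the sum over n above.
A_vec = X.T @ X / N

print(np.allclose(A_elem, A_vec))  # True
```

The intuition: $(\mathbf{X}^\top\mathbf{X})_{jj'}$ is the dot product of column $j$ of $\mathbf{X}$ with column $j'$, which is precisely the sum over $n$ in the definition of $A_{jj'}$; dividing by $N$ gives the average.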

Here is the link to the handout.

Osama Rizwan
user2793618
  • Have you reached out to your professor(s) and/or teaching assistants? Also, do be careful with posting course material as some universities have policies that prohibit unauthorized dissemination of such content. Considering that this is publicly accessible, I assume that linking to this is safe, but figured I'd mention it just in case. – Ben Sep 23 '20 at 00:51

0 Answers