Questions tagged [derivation]

20 questions
3
votes
1 answer

Is it valid to use numpy.gradient to find slope of line as well as slope of curve at any point?

What is the difference between the slope of a line and the slope of a curve? Is it valid to use numpy.gradient to find the slope of a line and the slope of a curve at any point? # slope of line at any point: tanθ = (y2 - y1)/(x2 - x1) # slope of…
star
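A minimal sketch of the distinction this question asks about, assuming evenly spaced sample points and made-up data (the arrays and the helper `two_point_slope` are illustrative, not taken from the question). For a straight line, the two-point formula and `numpy.gradient` should agree at every sample; for a curve, `numpy.gradient` returns a finite-difference estimate of the local slope, which changes from point to point.

```python
import numpy as np

def two_point_slope(x1, y1, x2, y2):
    """Slope of the straight line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)

# Illustrative data: evenly spaced x values with spacing 0.5.
x = np.linspace(0.0, 4.0, 9)
line = 3.0 * x + 1.0   # straight line, constant slope 3
curve = x ** 2         # curve, analytic slope 2x

# A line has a single slope, so any two points give the same answer.
line_slope = two_point_slope(x[0], line[0], x[-1], line[-1])

# numpy.gradient estimates dy/dx at every sample using central differences
# in the interior and one-sided differences at the two end points.
line_grad = np.gradient(line, x)
curve_grad = np.gradient(curve, x)

print("line slope (two-point):", line_slope)
print("line slope (np.gradient):", line_grad)
print("curve slope (np.gradient):", curve_grad)
print("curve slope (analytic 2x):", 2.0 * x)
```

In this sketch the interior estimates for the curve match the analytic slope, while the two end points are less accurate because they fall back to one-sided differences; passing `edge_order=2` to `numpy.gradient` is one way to tighten those.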
3
votes
1 answer

Maximum Entropy Policy Gradient Derivation

I am reading through the paper Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review by Sergey Levine. I am having difficulty understanding this part of the derivation of Maximum Entropy Policy Gradients (Section…
2
votes
0 answers

Deriving vectorized form of linear regression

We first have a $D$-dimensional weight vector $w$ and a $D$-dimensional predictor vector $x$, both indexed by $j$. There are $N$ observations, each $D$-dimensional. $t$ denotes our targets, i.e., the ground-truth values. We then derive the cost…
2
votes
1 answer

1st order Taylor Series derivative calculation for autoregressive model

I wrote a blog post where I calculated the Taylor Series of an autoregressive function. It is not strictly the Taylor Series, but some variant (I guess). I'm mostly concerned about whether the derivatives look okay. I noticed I made a mistake and…
2
votes
1 answer

Doubt in Derivation of Backpropagation

I was going through the derivation of the backpropagation algorithm provided in this document (adding just for reference). I have a doubt about one specific point in this derivation. The derivation goes as follows: Notation: The subscript $k$ denotes the…
1
vote
1 answer

SVM - Making sense of distance derivation

I am studying the math behind SVM. The following question is about a small but important detail in the SVM derivation. The question: Why is the distance between the hyperplane $w*x+b=0$ and a data point (in vector form) $p$, $d = \frac{w * p +…
Alan Yue
1
vote
1 answer

How is this score function estimator derived?

In this paper they have this equation, where they use the score function estimator to estimate the gradient of an expectation. How did they derive this?
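The excerpt does not reproduce the paper's equation, so the following is only the generic score function (log-derivative) identity, not that paper's specific result. It assumes a density $p_\theta(x)$ that is differentiable in $\theta$, a function $f$ that does not depend on $\theta$, and that the gradient and the integral can be interchanged; the symbols are illustrative.

```latex
\begin{aligned}
\nabla_\theta \, \mathbb{E}_{x \sim p_\theta}\left[ f(x) \right]
  &= \nabla_\theta \int f(x) \, p_\theta(x) \, dx \\
  &= \int f(x) \, \nabla_\theta p_\theta(x) \, dx \\
  &= \int f(x) \, p_\theta(x) \, \nabla_\theta \log p_\theta(x) \, dx \\
  &= \mathbb{E}_{x \sim p_\theta}\left[ f(x) \, \nabla_\theta \log p_\theta(x) \right]
\end{aligned}
```

The third line uses $\nabla_\theta p_\theta(x) = p_\theta(x) \, \nabla_\theta \log p_\theta(x)$, which is what lets the gradient be written as an expectation that can be estimated from samples.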
1
vote
1 answer

Derivative of Loss wrt bias term

I read this and found something ambiguous. I am trying to understand how to calculate the derivative of the loss w.r.t. the bias. In this question, we have this definition: np.sum(dz2,axis=0,keepdims=True) Then in Casper's comment, he said that the derivative…
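A minimal sketch of why the bias gradient is a sum over the batch axis, assuming a layer of the form `z2 = a1 @ W2 + b2` with a squared-error loss. Only `dz2` and the `np.sum(...)` expression come from the question; the shapes, the loss, and the names `a1`, `W2`, `b2`, and `target` are assumptions made for illustration. Because the same bias row is broadcast to every sample, each sample contributes its own `dz2` row to the bias gradient, and those contributions add up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: 5 samples, 3 hidden units, 2 outputs.
a1 = rng.normal(size=(5, 3))      # activations feeding the layer
W2 = rng.normal(size=(3, 2))      # layer weights
b2 = rng.normal(size=(1, 2))      # bias, broadcast over the 5 samples
target = rng.normal(size=(5, 2))  # made-up targets

def loss(b):
    """Squared-error loss as a function of the bias only."""
    z2 = a1 @ W2 + b
    return 0.5 * np.sum((z2 - target) ** 2)

# Analytic gradient: dL/dz2 has one row per sample; summing the rows
# gives dL/db2 with the same shape as b2.
z2 = a1 @ W2 + b2
dz2 = z2 - target
db2 = np.sum(dz2, axis=0, keepdims=True)

# Central finite differences as an independent check.
eps = 1e-6
db2_numeric = np.zeros_like(b2)
for j in range(b2.shape[1]):
    step = np.zeros_like(b2)
    step[0, j] = eps
    db2_numeric[0, j] = (loss(b2 + step) - loss(b2 - step)) / (2 * eps)

print("analytic:", db2)
print("numeric: ", db2_numeric)
```

`keepdims=True` keeps the result at shape `(1, 2)`, matching `b2`, so it can be used directly in an update such as `b2 -= learning_rate * db2`.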
1
vote
0 answers

Backpropagation through time derivation issue

I read several posts about BPTT for RNNs, but I am a bit confused about one step in the derivation. Given $$h_t=f(b+Wh_{t-1}+Ux_t)$$ when we compute $\frac{\partial h_t}{\partial W}$, does anyone know why it is simply $$\frac{\partial…
username123
1
vote
1 answer

A Derivation in Combinatory Categorial Grammar

I am reading about CCG on page 23 of Speech and Language Processing. There is a derivation as follows: (VP/PP)/NP , VP\((VP/PP)/NP) => VP? Can anyone explain this please? This makes sense if VP\((VP/PP)/NP) is equivalent to (VP\(VP/PP))/NP and…
chikitin
1
vote
0 answers

How to compute the backpropagation gradient according to the chain rule using vector/matrix differentials?

I have some problems computing the derivative of the sum-of-squares error in a backprop neural network. For example, we have a neural network as in the picture. For drawing simplicity, I've dropped the sample indexes. Conventions: x - data_set input. W - is…
1
vote
0 answers

Adding a group specific penalty to binary cross-entropy

I want to implement a custom Keras loss function that consists of plain binary cross-entropy plus a penalty that increases the loss for false negatives from one class (each observation can belong to one of two classes, privileged and unprivileged)…
Tim
0
votes
1 answer

Loss function for points inside polygon

I am trying to optimize some parameters that are used to transform 2D points from one place to another (you may think of them as rotation and translation parameters for simplicity). The parameters are considered optimal if the transformed points lie inside a…
0
votes
1 answer

Problem with a math formula in Weight Uncertainty in Neural Networks

I am studying the paper https://arxiv.org/pdf/1505.05424.pdf and there is a formula on page 4 that I don't get: I don't understand how they obtain this formula. Moreover, with the chain rule, I get $\frac{\partial f(\mathrm w, \theta)}{\partial\mathrm w} =…
0
votes
0 answers

How to find the derivative of the hidden state of recurrent neural networks?

Recently I have been reading the following paper (link): Liu, Sifei, Jinshan Pan, and Ming-Hsuan Yang. “Learning Recursive Filters for Low-Level Vision via a Hybrid Neural Network.” In Computer Vision – ECCV 2016, edited by Bastian Leibe, Jiri Matas, Nicu…