How to choose an appropriate epsilon value when approximating gradients to check training?

Question

While approximating gradients, using the actual epsilon to shift the weights results in wildly large gradient approximations, as the "width" of the approximation triangle is disproportionately small. In Andrew Ng's course, he uses 0.01, but I suppose that is for example purposes only.

This makes me wonder: is there a method to choose an appropriate epsilon value for gradient approximation based on, e.g., the current error value of the network?
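To make the problem concrete, here is a minimal sketch of a two-sided (central) difference check at several step sizes. It assumes "actual epsilon" means the machine epsilon of a 64-bit float; the functions `f` and `f_prime` are hypothetical stand-ins for a network's loss and its analytic gradient, not anything from the course.

```python
import math
import sys

def central_difference(f, x, eps):
    """Two-sided estimate of f'(x) using step size eps."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def f(w):
    # Hypothetical scalar "loss" standing in for a network's error.
    return math.sin(w) + 0.5 * w * w

def f_prime(w):
    # Analytic derivative of f, used as the reference value.
    return math.cos(w) + w

w = 1.3
errors = {}
for eps in (1e-2, 1e-4, 1e-6, 1e-10, sys.float_info.epsilon):
    approx = central_difference(f, w, eps)
    errors[eps] = abs(approx - f_prime(w))
    print(f"eps={eps:.1e}  approx={approx:.10f}  abs error={errors[eps]:.2e}")
```

In this toy setting the error shrinks as the step gets smaller, then grows again once the step is so small that the difference `f(x + eps) - f(x - eps)` is dominated by floating-point rounding.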

score 1 · Accepted Answer · answered Apr 24 '22 at 20:50

It sounds like the epsilon value is a hyperparameter and the error value is an evaluation metric. Given that, cross-validation can be used to find the epsilon value that minimizes the error value.
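As a rough illustration of treating epsilon as something to search over, the sketch below runs a plain grid sweep and keeps the candidate with the smallest score. Two assumptions are made here and are not part of the answer: the score is the relative difference between the numerical and analytic gradients (rather than the network's error), and a simple sweep stands in for full cross-validation. The loss and all names are hypothetical.

```python
import math

def loss(w):
    # Hypothetical two-parameter loss, chosen only for illustration.
    return math.log(1.0 + math.exp(w[0] - 2.0 * w[1])) + 0.1 * (w[0] ** 2 + w[1] ** 2)

def analytic_grad(w):
    # Exact gradient of the loss above, used as the reference.
    s = 1.0 / (1.0 + math.exp(-(w[0] - 2.0 * w[1])))  # sigmoid of the logit
    return [s + 0.2 * w[0], -2.0 * s + 0.2 * w[1]]

def numeric_grad(f, w, eps):
    # Central difference, perturbing one coordinate at a time.
    grad = []
    for i in range(len(w)):
        plus = list(w)
        plus[i] += eps
        minus = list(w)
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2.0 * eps))
    return grad

def relative_error(a, b):
    # ||a - b|| / (||a|| + ||b||), a common gradient-check score.
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    norm = math.sqrt(sum(x * x for x in a)) + math.sqrt(sum(y * y for y in b))
    return diff / norm

w = [0.4, -0.7]
candidates = [10.0 ** -k for k in range(1, 13)]  # 1e-1 down to 1e-12
scores = {eps: relative_error(numeric_grad(loss, w, eps), analytic_grad(w))
          for eps in candidates}
best_eps = min(scores, key=scores.get)
print(f"best eps: {best_eps:.0e}, relative error: {scores[best_eps]:.2e}")
```

The selected value depends on the function and on the scale of the weights, so a sweep like this would need to be repeated for the actual network rather than reused from the toy case.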

Brian Spiering
