Just to summarize "Understanding dropout and gradient descent" and https://stats.stackexchange.com/questions/207481/dropout-backpropagation-implementation:
Suppose I need to implement inverted dropout in my CNN. During the feedforward phase, every neuron output in the dropout layer is multiplied by mask/p, where mask is 0 or 1 and p is the retain rate. Should I apply the same operation (including the division by p) during the backpropagation phase? I suppose the answer is yes (see the second link above), but I need to know for sure.
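To make the question concrete, here is a minimal NumPy sketch of what I have in mind (function names and signatures are just for illustration). The backward pass reuses the same scaled mask from the forward pass, i.e. it also includes the division by p, which is exactly the point I want to confirm:

    import numpy as np

    def dropout_forward(x, p, train=True):
        """Inverted dropout forward pass.
        p is the retain probability; the mask is scaled by 1/p here,
        so no rescaling is needed at test time."""
        if not train:
            return x, None
        mask = (np.random.rand(*x.shape) < p) / p  # entries are 0 or 1/p
        return x * mask, mask

    def dropout_backward(dout, mask):
        """Backward pass: multiply the upstream gradient by the same
        scaled mask used in the forward pass (division by p included)."""
        return dout * mask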