
I have learned from some examples that there is a regularization option for ANNs (concretely, in the Keras implementation). As far as I know, regularization is in general a kind of "penalty" on parameters that limits model complexity and prevents overfitting.

Accordingly, the W_regularizer and b_regularizer options in Keras are for regularizing the weight and bias parameters, unless I am mistaken. But what is activity_regularizer for? How is it related to weight/bias regularization? And more generally: what is good practice for using all these regularization possibilities (apart from blind brute-force tuning)? Because my ANNs/CNNs produce very low overfitting measured on the validation set, it seems to me that regularization is not a really useful tool with neural nets.


1 Answer


The activity_regularizer is used to control the output of a neural network; it tends to make the output smaller. Suppose the loss function is given as:

loss function = DataLoss + regularizationLoss

Then for a weight regularizer, regularizationLoss = f(weights in the network), whereas for an activity regularizer, regularizationLoss = f(outputs predicted by the network). Activity regularizers are generally used when you are quite aware of the distribution of the test dataset.
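
As a concrete illustration, here is a minimal sketch of how the two kinds of regularizer are attached to a layer in Keras. It uses the Keras 2 argument names kernel_regularizer and bias_regularizer (the successors of W_regularizer and b_regularizer) alongside activity_regularizer; the layer sizes and penalty strengths are arbitrary placeholders, not recommended values.

```python
from keras.models import Sequential
from keras.layers import Dense
from keras import regularizers

model = Sequential([
    # kernel_regularizer / bias_regularizer add f(weights) and f(biases)
    # to the loss; activity_regularizer adds f(this layer's outputs).
    Dense(64, activation='relu', input_shape=(20,),
          kernel_regularizer=regularizers.l2(0.01),
          bias_regularizer=regularizers.l2(0.01),
          activity_regularizer=regularizers.l1(0.001)),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
```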

For your second argument, I would say you are pretty wrong. ANNs as well as CNNs can both suffer from overfitting. To prevent the model from overfitting, we generally use a number of regularization techniques, among which Dropout is quite popular, as in the sketch below.
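
For example, a minimal sketch of using Dropout as a regularizer in Keras (the rate of 0.5 and the layer sizes are arbitrary choices for illustration):

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential([
    Dense(128, activation='relu', input_shape=(20,)),
    Dropout(0.5),  # randomly drops 50% of this layer's activations during training
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')
```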

enterML
    activity regularizer can be applied to any layer's output . . . so it works when *not* applied to network output (last layer), but to hidden layers instead. – Neil Slater Nov 17 '16 at 20:55