
I have a few basic questions about tracking losses during training.

  1. If I am using mini-batch training, should I validate after each batch update or after I have seen the entire dataset?
  2. What should be the condition to stop the training to prevent overfitting? Do you save the model at that point?
  3. When I use mini-batch training, the losses fluctuate a lot depending on the random choice of training data, and sometimes the validation loss is lower than the training loss. Is this normal? I suspect my confusion here may be resolved by the answer to question 1.
David Masip
pg2455

1 Answer

  1. Both approaches can be done. I recommend validating after every batch when you are just playing around with your gradient method, to see whether the validation accuracy goes up or down and to figure out how everything is going. Once training is stable, validating once per epoch is usually enough and much cheaper.

  2. In this setting you can adopt early stopping, although there are many other ways to prevent overfitting. Early stopping checks the validation accuracy at the end of each epoch and saves the model if it is the best so far. If the validation accuracy does not improve within the next $n$ epochs (where $n$, often called the patience, is a parameter you choose), you stop the gradient method and keep the last model you saved.

  3. Yes, this is normal. Mini-batch losses fluctuate because each batch is a small random sample of the data, and validation loss can sometimes be lower than training loss, for instance when regularization such as dropout is active during training but switched off at validation time. In that case you can safely say that you are not overfitting.
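The early-stopping logic in point 2 can be sketched as follows. This is a minimal illustration, not a full training loop: the model placeholder, the hard-coded accuracy sequence, and the `patience` parameter are all made-up stand-ins for whatever your real loop computes.

```python
import copy

def train_with_early_stopping(model, val_accuracies, patience=3):
    """Stop when validation accuracy has not improved for `patience` epochs.

    `model` is any snapshot-able object; `val_accuracies` stands in for the
    per-epoch validation accuracy your real loop would compute.
    Returns (best saved model, best accuracy, number of epochs run).
    """
    best_acc = float("-inf")
    best_model = copy.deepcopy(model)
    epochs_without_improvement = 0

    for epoch, acc in enumerate(val_accuracies):
        # ... one epoch of mini-batch gradient updates would happen here ...
        if acc > best_acc:
            best_acc = acc                        # new best: save a snapshot
            best_model = copy.deepcopy(model)
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                             # keep the saved model, stop

    return best_model, best_acc, epoch + 1

# Toy run: accuracy peaks at 0.85 in epoch 3, never improves again,
# so with patience 3 training stops after epoch 6.
model = {"weights": None}  # placeholder for a real model
_, best, stopped_at = train_with_early_stopping(
    model, [0.70, 0.80, 0.85, 0.84, 0.85, 0.83], patience=3)
print(best, stopped_at)  # 0.85 6
```

In a real loop you would replace the accuracy list with an actual validation pass and save the snapshot to disk instead of deep-copying in memory.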

David Masip