
I recently realized that the Keras EarlyStopping callback keeps the last epoch's weights by default. If you want the best weights instead, you can pass the argument restore_best_weights=True, as stated for example in this answer or in the documentation.
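For reference, here is a minimal, self-contained sketch of how the flag is passed; the toy data, model, and patience value are just placeholders:

```python
import numpy as np
import tensorflow as tf

# Toy data and model, only to make the snippet runnable
x = np.random.rand(256, 8).astype("float32")
y = np.random.rand(256, 1).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Roll back to the weights of the best epoch instead of keeping the last one
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True,  # the default is False
)

model.fit(
    x, y,
    validation_split=0.2,
    epochs=100,
    callbacks=[early_stopping],
    verbose=0,
)
```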

I'm quite surprised by that, as I would assume one is usually only interested in the best model at the end of training. So why is the default restore_best_weights=False? Is there a practical reason that I am missing?

Luca Clissa
  • Memory: you do not need to keep the weights of previous iterations in memory if you simply return the last ones. – user2974951 Jan 20 '23 at 10:20
  • Thanks @user2974951, but I am not sure I get it. I think you would just need to keep the best model in memory (not all previous n iterations). Of course, this may still be too much for some applications, but it seems too weak a reason for a default value. – Luca Clissa Jan 20 '23 at 10:34

1 Answer


There may be two reasons:

  • Memory and speed. This is discussed in this Keras issue (note that there is a similar question on this very site that contains the answer in the comments); a rough sketch of what this weight tracking involves is shown after this list. This is the key paragraph of the discussion:

    If you want to restore the weights that are giving the best performance, you have to keep tracks on them and thus have to store them. This can be costly as you have to keep another entire model in memory and can make fitting the model slower as well.

  • Backward compatibility. Initially, there was no such flag, and the only implemented behavior was not to restore the best weights. When the flag was added, the sensible decision was to give it a default value that keeps old code behaving as before rather than silently changing it. The commit where the flag was introduced is this one. This is only speculation, as I am not the author.
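To make the memory point concrete, here is a rough sketch of what tracking the best weights involves (my own illustration, not the actual Keras implementation); the extra get_weights() copy kept alongside the live model is the overhead discussed above:

```python
import tensorflow as tf

class RestoreBestWeights(tf.keras.callbacks.Callback):
    """Illustrative callback: keep a full copy of the best weights seen
    so far and restore them when training ends."""

    def __init__(self, monitor="val_loss"):
        super().__init__()
        self.monitor = monitor
        self.best = float("inf")
        self.best_weights = None  # second copy of all weight tensors

    def on_epoch_end(self, epoch, logs=None):
        current = (logs or {}).get(self.monitor)
        if current is not None and current < self.best:
            self.best = current
            # get_weights() returns copies of every weight array, so this
            # roughly doubles the memory needed for the model's parameters
            self.best_weights = self.model.get_weights()

    def on_train_end(self, logs=None):
        if self.best_weights is not None:
            self.model.set_weights(self.best_weights)
```

For large models that second copy of every weight tensor is not negligible, which is presumably why the tracking is opt-in rather than the default.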

noe