29

What is the difference between fit() and fit_generator() in Keras?

When should I use fit() vs fit_generator()?

Stephen Rauch
N.IT

2 Answers

27

In Keras, fit() is very similar to sklearn's fit method: you pass an array of features as the x values and the targets as the y values. The whole dataset is passed to fit() at once, so use it when the entire dataset fits in memory (a small dataset).

In fit_generator(), you don't pass x and y directly; instead, they come from a generator that yields one batch at a time. The Keras documentation describes this argument as a generator or a keras.utils.Sequence instance, and notes that a Sequence avoids duplicate data when using multiprocessing. In practice, this is the method to use when the dataset is too large to load into memory at once.
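To make the difference concrete, here is a minimal sketch of the kind of batch generator you would hand to fit_generator(). It uses plain Python lists so it runs without Keras installed; the name `batch_generator` and the toy data are made up for illustration, and `model` in the trailing comments is a hypothetical compiled Keras model that is not defined here. In real use the generator would typically read each batch from disk and yield NumPy arrays.

```python
import math

def batch_generator(features, targets, batch_size):
    """Yield (x_batch, y_batch) tuples forever, one batch at a time."""
    n = len(features)
    while True:  # Keras expects a plain generator to loop indefinitely
        for start in range(0, n, batch_size):
            end = start + batch_size
            yield features[start:end], targets[start:end]

# Toy data standing in for a dataset that would not fit in memory (hypothetical values)
features = [[i, i + 1] for i in range(10)]
targets = [i % 2 for i in range(10)]

batch_size = 4
# Number of batches needed to see every sample once
steps_per_epoch = math.ceil(len(features) / batch_size)

gen = batch_generator(features, targets, batch_size)
first_epoch = [next(gen) for _ in range(steps_per_epoch)]

# In-memory case (whole dataset passed at once):
#   model.fit(x, y, batch_size=batch_size, epochs=10)
# Generator case (batches pulled on demand):
#   model.fit_generator(gen, steps_per_epoch=steps_per_epoch, epochs=10)
```

Because the generator never ends, steps_per_epoch is what tells Keras where one epoch stops and the next begins.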

Here is a link to understand more about this:

A thing you should know about Keras if you plan to train a deep learning model on a large dataset

For reference, you can check this book: https://github.com/hktxt/bookshelf/blob/master/Computer%20Science/Deep%20Learning%20with%20Python%2C%20Fran%C3%A7ois%20Chollet.pdf

Ankit Seth
  • Hi Ankit, your link http://www.deeplearningitalia.com/wp-content/uploads/2017/12/Dropbox_Chollet.pdf doesn’t work. Do you have a working link? – Chidu Murthy Dec 18 '18 at 19:30
  • @ChiduMurthy Thanks for the info. I have edited the link. – Ankit Seth Dec 19 '18 at 14:36
  • According to the documentation, we can also pass generators to the fit method. So I still don't understand why we need a separate fit_generator method. https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit – alyaxey Oct 27 '19 at 12:08
  • The `.fit_generator` method is currently in the process of deprecation. – Naveen Reddy Marthala Jul 04 '20 at 05:14
3

There is more to the difference between Keras fit and fit_generator than meets the eye. I had a dataset that the model learned perfectly using fit_generator. As the dataset wasn't too big, I decided to switch to fit instead of fit_generator. To my surprise, the learning curve was all over the place, and I had to start tuning from scratch. I guess the way gradients are updated in each function differs quite significantly. Beware.
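One possible explanation, which is an assumption and not something confirmed above, is a change in effective batch size when switching methods. fit() uses batch_size=32 when none is given, whereas with fit_generator the batch size is whatever the generator yields. A different batch size means a different number of gradient updates per epoch, which can change the learning curve noticeably. The sample count and the generator batch size below are hypothetical:

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of gradient updates needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

n_samples = 1000  # hypothetical dataset size

# fit() falls back to batch_size=32 when the argument is omitted
fit_updates = updates_per_epoch(n_samples, 32)

# Hypothetical generator that yields batches of 8 samples
generator_updates = updates_per_epoch(n_samples, 8)

print(fit_updates, generator_updates)
```

If this is the cause, passing the generator's batch size explicitly to fit() should bring the two setups closer. Differences in shuffling between the two paths are another thing worth checking.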

agcala