
I have been reading through Stanford's code examples for their Deep Learning course, and I see that they have computed num_steps = (params.train_size + params.batch_size - 1) // params.batch_size [github link].

Why isn't it num_steps = params.train_size // params.batch_size instead?

Henry

1 Answer


The double slash in Python is floor division, which rounds the quotient down to the nearest whole number. So if train_size is not an exact multiple of batch_size, train_size // batch_size leaves out the last batch, which is smaller than the batch size.

For example:

Given a dataset of 10,000 samples and batch size of 15:

# Original calculation:
# (10000 + 15 - 1) // 15 = floor(667.6) = 667

# Your calculation:
# 10000 // 15 = floor(666.667) = 666

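Your formula counts only the full batches; theirs counts every batch, including the final partial one. Adding batch_size - 1 before flooring turns the floor division into a ceiling division.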
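If you want to check this yourself, here is a minimal sketch comparing the two formulas on the numbers from the example. The function names are made up for illustration and are not from the Stanford code.

```python
import math


def num_full_batches(train_size, batch_size):
    # Floor division: counts only batches that contain exactly batch_size samples.
    return train_size // batch_size


def num_total_batches(train_size, batch_size):
    # Ceiling division using integers only: also counts a final partial batch.
    return (train_size + batch_size - 1) // batch_size


train_size, batch_size = 10_000, 15

full = num_full_batches(train_size, batch_size)
total = num_total_batches(train_size, batch_size)
leftover = train_size - full * batch_size  # samples that do not fill a whole batch

print(full, total, leftover)

# The integer trick gives the same result as rounding the true quotient up.
print(total == math.ceil(train_size / batch_size))
```

When train_size is an exact multiple of batch_size (9,990 and 15, say), both functions return the same number, so the extra term never adds a spurious step.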

Mark.F
  • Thanks Mark! Really appreciate your help. I unfortunately don't have enough rep to upvote. – Henry Feb 13 '19 at 14:07