
From the very first epoch of training my RNN, the loss is reported as nan:

Epoch 1/100 9787/9787 [==============================] - 22s 2ms/step - loss: nan

I have normalized the data.
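
Roughly, the normalisation step looks like the following sketch (min-max scaling to [0, 1]; sklearn's MinMaxScaler and the variable name training_set are illustrative assumptions, not taken from the original code):

from sklearn.preprocessing import MinMaxScaler

# Scale every value into the [0, 1] range: (x - min) / (max - min)
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)  # training_set: raw values, shape (n, 1)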

Example of my X_train (float64, shape (9787, 60, 1)):

    ...,
    [9.78344703e-01],
    [1.00000000e+00],
    [9.94293976e-01]]])

Example of my y_train (float64, shape (9787,)):

    array([6.59848480e-04, 6.98212803e-04, 6.90540626e-04, ...,
           1.00000000e+00, 9.94293976e-01, 9.95909540e-01])
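
These shapes come from sliding windows of 60 timesteps with a single feature; a minimal sketch of how such windows are typically built (training_set_scaled is the assumed name of the scaled 1-column array from the sketch above):

import numpy as np

# For each position i, use the previous 60 scaled values as the input window
# and the value at position i as the prediction target
X_train, y_train = [], []
for i in range(60, len(training_set_scaled)):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

# Reshape to (samples, timesteps, features), as expected by the LSTM layers
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))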

My RNN:

from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

# Initialising the RNN
regressor = Sequential()

# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))

# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)

– Erik Dz

2 Answers


This could be caused by exploding gradients; try using gradient clipping and see whether the loss is still displayed as nan. For example:

from keras import optimizers

optimizer = optimizers.Adam(clipvalue=0.5)
regressor.compile(optimizer=optimizer, loss='mean_squared_error')
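
With clipvalue=0.5, each individual gradient component is clipped to the range [-0.5, 0.5] before the weight update is applied. Keras optimizers also accept a clipnorm argument, which instead rescales the whole gradient vector whenever its L2 norm exceeds the given threshold.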

– Oxbowerce
  • That was exactly the problem: exploding gradients. It happened because I normalized the values with the min-max formula; my smallest value was 0.437 and my largest 478.23, so the values at the beginning were negligible. – Erik Dz Jan 26 '20 at 16:51
  • This solved my problem. I changed the learning rate, without clipping the values, and it worked for me. Your answer is a good guide to exploding gradients. – Kali Kimanzi Sep 28 '21 at 07:50

There might be a nan value present somewhere in your dataset. I ran the code above on another dataset and it executed without issue.
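
A quick way to rule this out is to count non-finite values directly; a minimal sketch, assuming the numpy arrays X_train and y_train from the question:

import numpy as np

# Count nan and inf entries in the training inputs and targets
print(np.isnan(X_train).sum(), np.isinf(X_train).sum())
print(np.isnan(y_train).sum(), np.isinf(y_train).sum())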

That said, I did not specify the input shape in the first LSTM layer; instead, I reshaped the input before initialising the RNN.

Check your dataset for any errors; the amendment below is also something you might consider.

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Dropout, Dense

# reshape input to be [samples, time steps, features]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))

# Initialising the RNN
regressor = tf.keras.Sequential()

# Adding the first LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a second LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a third LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))

# Adding a fourth LSTM layer and some Dropout regularisation
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))

# Adding the output layer
regressor.add(Dense(units = 1))

# Compiling the RNN
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')

# Fitting the RNN to the Training set
regressor.fit(X_train, y_train, epochs = 100, batch_size = 32)

– Michael Grogan