I'm currently working on this task from Kaggle.
I've normalized my data with MinMaxScaler and avoided the dummy variable trap by dropping one column from each set of dummy variables I created.
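In case it matters, this is roughly the preprocessing I did (a sketch, not my exact code; raw_train, raw_test, and cat_col are placeholders for my actual frames and columns):
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# drop_first=True drops one level per categorical to avoid the dummy variable trap
train = pd.get_dummies(raw_train, columns=['cat_col'], drop_first=True)
test = pd.get_dummies(raw_test, columns=['cat_col'], drop_first=True)

scaler = MinMaxScaler()                # scales every feature into [0, 1]
X_train = scaler.fit_transform(train)  # fit on the training data only
X_test = scaler.transform(test)        # reuse the training-set min/max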
Here is the first row of my training data:
array([0.45822785, 0.41137515, 0.41953444, 0.01045186, 0.00266027,
0.13333333, 0. , 0.02342393, 0.62156863, 0.16778523,
0.09375 , 0. , 1. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 1. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. ,
0. , 0. , 0. , 1. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. ,
0. , 0. , 1. , 1. , 0. ])
This is the model I'm using:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 25 epochs
early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=25)

# Adam with gradient clipping: every gradient element is clipped to [-1, 1]
optimizer = optimizers.Adam(clipvalue=1)

model = Sequential()
# Layer sizes follow the rule of thumb discussed here:
# https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw
model.add(Dense(units=85, activation='relu'))  # 85 matches the number of input features
model.add(Dropout(0.2))
model.add(Dense(units=42, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(units=42, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(units=1, activation='sigmoid'))

# For a binary classification problem
model.compile(loss='binary_crossentropy', optimizer=optimizer)

model.fit(x=X_train,
          y=y_train,
          epochs=600,
          validation_data=(X_test, y_test),
          verbose=1,
          callbacks=[early_stop])
As stated, I'm using clipvalue=1 with the Adam optimizer, which is what was recommended in this and this post.
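For reference, I also know about the clipnorm variant of the same idea, which clips the norm of each gradient tensor instead of its individual values (the learning_rate shown is just Adam's default, stated explicitly):
from tensorflow.keras import optimizers

# Alternative to clipvalue: clip each gradient tensor's norm to 1.0
optimizer = optimizers.Adam(learning_rate=0.001, clipnorm=1.0)
Either way, with clipvalue=1 in place I'd expect the gradients to stay bounded.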
Despite that, the output I'm getting is the following:
Epoch 1/600
8664/8664 [==============================] - 7s 810us/step - loss: nan - val_loss: nan
Epoch 2/600
8664/8664 [==============================] - 7s 819us/step - loss: nan - val_loss: nan
Epoch 3/600
8664/8664 [==============================] - 7s 818us/step - loss: nan - val_loss: nan
Epoch 4/600
8664/8664 [==============================] - 7s 824us/step - loss: nan - val_loss: nan
Epoch 5/600
8664/8664 [==============================] - 7s 805us/step - loss: nan - val_loss: nan
Epoch 6/600
8664/8664 [==============================] - 7s 800us/step - loss: nan - val_loss: nan
Epoch 7/600
8664/8664 [==============================] - 7s 833us/step - loss: nan - val_loss: nan
Epoch 8/600
8664/8664 [==============================] - 7s 821us/step - loss: nan - val_loss: nan
Epoch 9/600
8664/8664 [==============================] - 7s 806us/step - loss: nan - val_loss: nan
Epoch 10/600
8664/8664 [==============================] - 7s 813us/step - loss: nan - val_loss: nan
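Since the loss is already nan at the end of the first epoch, my next step is to rule out bad values in the data itself. A minimal sanity check I can run (assuming X_train, y_train, X_test, and y_test are NumPy arrays or can be converted to them):
import numpy as np

# nan loss on the very first epoch often means nan/inf is already in the inputs or targets
for name, arr in [('X_train', X_train), ('y_train', y_train),
                  ('X_test', X_test), ('y_test', y_test)]:
    arr = np.asarray(arr, dtype=np.float64)
    print(name, '-> NaN:', np.isnan(arr).any(), '| inf:', np.isinf(arr).any())
If these all come back False, the data itself is probably fine and the problem is somewhere else in my setup. What else could be causing the loss to be nan from the very first epoch?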