I'm currently working on this task from Kaggle.
I've normalized my data with MinMaxScaler and avoided the dummy variable trap by dropping one column from each set of dummy variables I created.
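In case it matters, this is roughly the preprocessing I did (a sketch, not my exact code; raw_train, raw_test, and cat_col are placeholders for my actual frames and columns):
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# drop_first=True drops one level per categorical to avoid the dummy variable trap
train = pd.get_dummies(raw_train, columns=['cat_col'], drop_first=True)
test = pd.get_dummies(raw_test, columns=['cat_col'], drop_first=True)

scaler = MinMaxScaler()                # scales every feature into [0, 1]
X_train = scaler.fit_transform(train)  # fit on the training data only
X_test = scaler.transform(test)        # reuse the training-set min/max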
Here is the first row of my training data:
array([0.45822785, 0.41137515, 0.41953444, 0.01045186, 0.00266027,
0.13333333, 0. , 0.02342393, 0.62156863, 0.16778523,
0.09375 , 0. , 1. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 1. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. ,
0. , 0. , 0. , 1. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 0. , 0. , 0. ,
0. , 0. , 1. , 0. , 0. ,
0. , 0. , 1. , 1. , 0. ])
This is the model I'm using:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import EarlyStopping

# Stop training once val_loss has not improved for 25 epochs
early_stop = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=25)

# Adam with gradient clipping: every gradient element is clipped to [-1, 1]
optimizer = optimizers.Adam(clipvalue=1)

model = Sequential()
# Layer sizes follow the rule of thumb discussed here:
# https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw
model.add(Dense(units=85, activation='relu'))  # 85 matches the number of input features
model.add(Dropout(0.2))
model.add(Dense(units=42, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(units=42, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(units=1, activation='sigmoid'))

# For a binary classification problem
model.compile(loss='binary_crossentropy', optimizer=optimizer)

model.fit(x=X_train,
          y=y_train,
          epochs=600,
          validation_data=(X_test, y_test),
          verbose=1,
          callbacks=[early_stop])
As stated, I'm using clipvalue=1 with the Adam optimizer, which is what was recommended in this and this post.
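For reference, I also know about the clipnorm variant of the same idea, which clips the norm of each gradient tensor instead of its individual values (the learning_rate shown is just Adam's default, stated explicitly):
from tensorflow.keras import optimizers

# Alternative to clipvalue: clip each gradient tensor's norm to 1.0
optimizer = optimizers.Adam(learning_rate=0.001, clipnorm=1.0)
Either way, with clipvalue=1 in place I'd expect the gradients to stay bounded.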
Despite that, the output I'm getting is the following:
Epoch 1/600
8664/8664 [==============================] - 7s 810us/step - loss: nan - val_loss: nan
Epoch 2/600
8664/8664 [==============================] - 7s 819us/step - loss: nan - val_loss: nan
Epoch 3/600
8664/8664 [==============================] - 7s 818us/step - loss: nan - val_loss: nan
Epoch 4/600
8664/8664 [==============================] - 7s 824us/step - loss: nan - val_loss: nan
Epoch 5/600
8664/8664 [==============================] - 7s 805us/step - loss: nan - val_loss: nan
Epoch 6/600
8664/8664 [==============================] - 7s 800us/step - loss: nan - val_loss: nan
Epoch 7/600
8664/8664 [==============================] - 7s 833us/step - loss: nan - val_loss: nan
Epoch 8/600
8664/8664 [==============================] - 7s 821us/step - loss: nan - val_loss: nan
Epoch 9/600
8664/8664 [==============================] - 7s 806us/step - loss: nan - val_loss: nan
Epoch 10/600
8664/8664 [==============================] - 7s 813us/step - loss: nan - val_loss: nan
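Since the loss is already nan at the end of the first epoch, my next step is to rule out bad values in the data itself. A minimal sanity check I can run (assuming X_train, y_train, X_test, and y_test are NumPy arrays or can be converted to them):
import numpy as np

# nan loss on the very first epoch often means nan/inf is already in the inputs or targets
for name, arr in [('X_train', X_train), ('y_train', y_train),
                  ('X_test', X_test), ('y_test', y_test)]:
    arr = np.asarray(arr, dtype=np.float64)
    print(name, '-> NaN:', np.isnan(arr).any(), '| inf:', np.isinf(arr).any())
If these all come back False, the data itself is probably fine and the problem is somewhere else in my setup. What else could be causing the loss to be nan from the very first epoch?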