
I am using an LSTM model to predict the next measurement of a sensor.

The dataset looks as follows: [image of the dataset]

There are approximately 13000 measurements.

My code for the LSTM looks as follows:

import numpy
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Turn the series into supervised samples: each X row holds `look_back`
# consecutive values, and y is the value that follows them.
def create_dataset(dataset, look_back=6):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i+look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)
# ----- seed, scale, and train/test split ---------------------------------
numpy.random.seed(7)

dataset = new_df.values
dataset = dataset.astype('float32')

scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)

train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
# ----- build the windowed samples -----------------------------------------
look_back = 6
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)

# Reshape to (samples, timesteps=1, features=look_back): the whole window
# is fed to the LSTM as a single timestep.
trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
# ----- define and train the model -----------------------------------------
model = Sequential()
model.add(LSTM(2, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
# ----- predict and invert the scaling -------------------------------------
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
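
For reference, the RMSE values quoted below are computed along these lines (a minimal sketch using scikit-learn's mean_squared_error; it assumes the inverse-transformed arrays from the code above):

import math
from sklearn.metrics import mean_squared_error

# trainY/testY have shape (1, n) after inverse_transform([...]),
# trainPredict/testPredict have shape (n, 1).
trainScore = math.sqrt(mean_squared_error(trainY[0], trainPredict[:, 0]))
testScore = math.sqrt(mean_squared_error(testY[0], testPredict[:, 0]))
print('Train RMSE: %.2f' % trainScore)
print('Test RMSE: %.2f' % testScore)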

I tried different learning rates (0.01, 0.001, 0.0001), more layers (1-3), more neurons (1-128), and more epochs (40-200), and I varied the training split (67-80%). I also changed the activation function on the Dense layer (sigmoid, relu), tried different optimizers (SGD, Adam, Adagrad), and added Dropout layers, but none of this decreased the RMSE.

The RMSE doesn't drop below 6.5; that lowest value came from the following parameters (a sketch of this configuration follows the list):

  • Train size = 0.8,
  • LSTM Layers = 2,
  • Epochs = 100,
  • Neurons = 32,
  • Look back = 10.
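
A minimal sketch of that configuration (how the two LSTM layers are stacked is my assumption; the first layer needs return_sequences=True so the second LSTM receives a sequence):

look_back = 10

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(1, look_back)))
model.add(LSTM(32))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)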

If anyone has any advice on how to get the RMSE to a lower value (as close to 1 as possible), I would love to hear it. I can share the dataset if needed for review.

Dominic

2 Answers


I suggest you automate the search for optimal hyperparameters with a standard strategy such as Bayesian optimization; Keras offers this through Keras Tuner, for example:

import os

from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from kerastuner.tuners import BayesianOptimization

n_input = 6
n_epochs = 50  # placeholder; set this to your own training budget

def build_model(hp):
    model = Sequential()
    model.add(LSTM(units=hp.Int('units', min_value=32,
                                max_value=512,
                                step=32),
                   activation='relu', input_shape=(n_input, 1)))
    # Give the Dense layer its own hyperparameter name; reusing 'units'
    # would force it to take the same value as the LSTM layer.
    model.add(Dense(units=hp.Int('dense_units', min_value=32,
                                 max_value=512,
                                 step=32), activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mse', metrics=['mse'], optimizer=keras.optimizers.Adam(
        hp.Choice('learning_rate',
                  values=[1e-2, 1e-3, 1e-4])))
    return model  # must be indented inside build_model

bayesian_opt_tuner = BayesianOptimization(
    build_model,
    objective='mse',  # 'val_mse' would rank trials on validation error instead
    max_trials=3,
    executions_per_trial=1,
    directory=os.path.normpath('C:/keras_tuning'),
    project_name='kerastuner_bayesian_poc',
    overwrite=True)

# train_x/train_y are your training arrays (trainX/trainY in the question)
bayesian_opt_tuner.search(train_x, train_y, epochs=n_epochs,
                          # validation_data=(X_test, y_test)
                          validation_split=0.2, verbose=1)


bayes_opt_model_best_model = bayesian_opt_tuner.get_best_models(num_models=1)
model = bayes_opt_model_best_model[0]
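
As a short follow-up (sketch), you can also inspect which hyperparameter values the search selected:

best_hps = bayesian_opt_tuner.get_best_hyperparameters(num_trials=1)[0]
print('LSTM units:', best_hps.get('units'))
print('Dense units:', best_hps.get('dense_units'))
print('learning rate:', best_hps.get('learning_rate'))
model.summary()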

You can find a similar question answered and validated here

German C M
  • Please explain what is happening in the model with the Bayesian optimizer. Why are the LSTM layers given min and max neuron values? – Golden Lion Sep 01 '22 at 14:27
  • The 'units' parameter of the LSTM layer is a hyperparameter to optimize, so you can give a range of values to try in the Bayesian optimization process – German C M Sep 01 '22 at 17:59

Your neuron count was too low. Try this model:

model = Sequential()
# Same architecture as in the question, but with 50 units instead of 2.
model.add(LSTM(units=50, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
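
To see whether the larger layer actually helps, check the test RMSE the same way as in the question (a sketch; it assumes testY is the inverse-transformed (1, n) array from the question's code):

testPredict = scaler.inverse_transform(model.predict(testX))
testScore = numpy.sqrt(numpy.mean((testY[0] - testPredict[:, 0]) ** 2))
print('Test RMSE: %.2f' % testScore)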
Golden Lion