
I have a dataset like the following one:

[image: sample of the dataset]

Each column is a different numerical feature. Each row represents a timestamp. I want to create an LSTM model that can predict future time-steps for all the features. For example, I want to use the first 2000 examples to train my model and the next 1000 to test it. The problem is that I do not know how to proceed.

Since this dataset has no y values, I was thinking of creating them by shifting the values at time t+1 to time t as new y columns. To explain: I would have a new column 14 whose value at timestamp 0 is 996.52, which is the value of feature 0 at time-step 1, and so on for all the time-steps and all the features.

The problem is that, after that, I do not know how to feed such a dataset to an LSTM in Keras to make multi-step predictions.

Stephen Rauch
Alex

1 Answer


LSTM inputs should be a tensor of shape (#samples, time_steps, #features). For your dataset, with a lag of one step, the training set gives #samples = 2000 - 1 = 1999 (the last row has no following row to serve as its target), time_steps = 1 (the lag you chose), and #features = 14. For testing/validation, #samples = 1000 - 1 = 999.

I suggest, however, that you take the autocorrelation of the dataset into account. It seems you chose a lag of 1 only to create the response variables, so check the temporal correlation and temporal structure of the data before settling on it. Like other deep neural networks, an LSTM needs a large dataset to train and test, so see whether you can increase the lag to give each sample more predictor data. For more information, have a look at this similar question: "Multi-dimentional and multivariate Time-Series forecast (RNN/LSTM) Keras".
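
As a rough illustration of the shapes described here, the following is a minimal sketch rather than a definitive implementation. The helper names make_windows and build_model, the random stand-in data, the layer size of 32 units, and the choice of a single LSTM layer followed by a Dense layer are all assumptions made for the example; none of them come from the question.

```python
import numpy as np


def make_windows(data, lag=1):
    """Slice a (n_rows, n_features) array into LSTM inputs and next-step targets.

    Returns X with shape (n_rows - lag, lag, n_features)
    and y with shape (n_rows - lag, n_features).
    """
    data = np.asarray(data, dtype="float32")
    n_samples = len(data) - lag
    # Each sample holds `lag` consecutive rows; its target is the row that follows.
    X = np.stack([data[i:i + lag] for i in range(n_samples)])
    y = data[lag:]
    return X, y


def build_model(lag, n_features, units=32):
    """Hypothetical small network: one LSTM layer, one Dense output per feature."""
    # Imported lazily so the windowing helper also works without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.Input(shape=(lag, n_features)),
        keras.layers.LSTM(units),
        keras.layers.Dense(n_features),
    ])
    model.compile(optimizer="adam", loss="mse")
    return model


# Random stand-in for the real 3000 x 14 table (assumption for the example).
rng = np.random.default_rng(0)
data = rng.normal(size=(3000, 14))
train, test = data[:2000], data[2000:]

X_train, y_train = make_windows(train, lag=1)
X_test, y_test = make_windows(test, lag=1)
print(X_train.shape, y_train.shape)  # (1999, 1, 14) (1999, 14)
print(X_test.shape, y_test.shape)    # (999, 1, 14) (999, 14)
```

With TensorFlow installed, build_model(1, 14) returns a compiled model that can be trained with model.fit(X_train, y_train, validation_data=(X_test, y_test)). Passing a larger lag to make_windows gives each sample more history at the cost of a few samples, which is one way to try the longer lag suggested above.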

Yabelo