I am trying to predict a time series from its previous values using an LSTM.

I get pretty good results when I compare the predicted time series with the test set (0.18% error).

What I am missing is how to forecast beyond the interval covered by the data.

I have to admit that I used a point-by-point prediction method that looks like this:

import numpy as np

def predict_point_by_point(model, data):
    # Predict one step ahead for each input window, then flatten to 1-D
    predicted = model.predict(data)
    predicted = np.reshape(predicted, (predicted.size,))
    return predicted

I then used it to override the predict function. Maybe the original function could have produced a future time series directly? Or maybe the point-by-point approach isn't that bad either?

My question is: how can I predict a specific interval of the time series (3 months, for example) without just referring to the test set?

Example: the test set starts on 01/01/2018 and ends on 01/12/2018, and I want to predict 4 months starting from 02/12/2018.

Thanks in advance for your help

Nour
  • Does your model only take the time as input, or also other variables that vary over time? If you predict solely on time (and its progression), you can simply create a new data set like the test set, with forward-looking time points instead of backward-looking ones, and predict on that (of course you won't have a test label to compare against, but that is no problem). If you have more input variables, you need a way to forecast or impute them, because to make a prediction your model needs all the inputs it was built on. – Fnguyen Sep 26 '19 at 11:05
  • Thank you, the model only takes the variation of an index over time and predicts the future variation. I'll try to follow your advice. When you say 'forward looking time points', will the data set be empty? – Nour Sep 26 '19 at 11:19
  • @Fnguyen, what I actually want is this: instead of passing the test set to the predict function, ```model.predict([x_test])```, which gives me a time series over the same time interval as x_test, I want to use another approach that gives me a **future** time series. Thanks a lot – Nour Sep 26 '19 at 12:31
  • To predict values you need a data set, let's call it "new", that is identical to your training or test set in terms of columns, data, etc. The only difference is that training/test contained data up until 01/12/2018, while "new" contains data starting from 02/12/2018, with only the label (the value to be predicted) missing. – Fnguyen Sep 26 '19 at 12:33
  • So I have to fill in the date column? – Nour Sep 26 '19 at 12:35
  • The date column and any other column used as a predictor in the original model. I'll post an answer that is hopefully clearer, with example data. – Fnguyen Sep 26 '19 at 12:36

1 Answer

Let's say you trained a forecasting model using the following base data:

time       | index_1 | index_2 | label
01/01/2018 | 80      | 70      | 1
01/02/2018 | 60      | 30      | 0
01/03/2018 | 75      | 90      | 1

You used time, index_1 and index_2 to predict label. Then you would simply need a dataset like this to predict 01/04/2018:

time       | index_1 | index_2
01/04/2018 | 60      | 75      

Using your model on this data set should predict the label-value.
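As a minimal sketch of that workflow: fit on rows that have a label, then predict on a row that has the same predictor columns but no label. `MeanThresholdModel` is a hypothetical stand-in written for this illustration, not a real library class and not the LSTM from the question, and it ignores the time column for simplicity.

```python
# Training rows from the example: (time, index_1, index_2, label)
train = [
    ("01/01/2018", 80, 70, 1),
    ("01/02/2018", 60, 30, 0),
    ("01/03/2018", 75, 90, 1),
]


class MeanThresholdModel:
    """Hypothetical stand-in for a trained model (illustration only)."""

    def fit(self, rows):
        # Collect the mean of the two indices per row, grouped by label
        means = {0: [], 1: []}
        for _, index_1, index_2, label in rows:
            means[label].append((index_1 + index_2) / 2)
        # Put the decision threshold halfway between the two classes
        self.threshold = (max(means[0]) + min(means[1])) / 2
        return self

    def predict(self, rows):
        # Rows to predict carry no label column: (time, index_1, index_2)
        return [
            int((index_1 + index_2) / 2 >= self.threshold)
            for _, index_1, index_2 in rows
        ]


model = MeanThresholdModel().fit(train)

# "New" data: same predictor columns, only the label is missing
new = [("01/04/2018", 60, 75)]
predictions = model.predict(new)
print(list(zip([row[0] for row in new], predictions)))
```

With a real model the shape of the call stays the same; only the object behind `predict` changes.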

In a time series this can be more complicated. Let's say what you actually want to predict is the label-value at time X from the index values at time X-4 months. In this case the data used to build the model should look like this:

time       | index_1_lag_4_months | index_2_lag_4_months | label
01/05/2018 | 80                   | 70                   | 1
01/06/2018 | 60                   | 30                   | 0
01/07/2018 | 75                   | 90                   | 1

This model would predict the label-value for 01/05/2018 based on the index values of 01/01/2018. To actually get a prediction we again give it a data set like this:

time       | index_1 | index_2
01/04/2018 | 60      | 75      

Only now the output would be the label-value for 01/08/2018 rather than for 01/04/2018.
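A rough sketch of how such a lagged data set could be assembled, assuming monthly observations dated on the first of the month (dd/mm/yyyy) and a four-month horizon. The helper `add_months` and the names `indices`, `labels`, `train_rows` and `future_rows` are made up for this illustration.

```python
from datetime import date

LAG_MONTHS = 4  # forecast horizon; assumption taken from the example


def add_months(d, months):
    """Shift a date by whole months (assumes the day exists in the target month)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


# Index values observed at each date: {date: (index_1, index_2)}
indices = {
    date(2018, 1, 1): (80, 70),
    date(2018, 2, 1): (60, 30),
    date(2018, 3, 1): (75, 90),
    date(2018, 4, 1): (60, 75),
}
# Labels observed so far: {date: label}
labels = {
    date(2018, 5, 1): 1,
    date(2018, 6, 1): 0,
    date(2018, 7, 1): 1,
}

# Training rows: the label at time X paired with the indices from X - 4 months
train_rows = []
for target, label in sorted(labels.items()):
    source = add_months(target, -LAG_MONTHS)
    if source in indices:
        train_rows.append((target, *indices[source], label))

# Forecast rows: index values whose target date has no label yet
future_rows = []
for source, values in sorted(indices.items()):
    target = add_months(source, LAG_MONTHS)
    if target not in labels:
        future_rows.append((target, *values))

for row in train_rows:
    print("train   ", row[0].strftime("%d/%m/%Y"), row[1:])
for row in future_rows:
    print("forecast", row[0].strftime("%d/%m/%Y"), row[1:])
```

The date attached to each forecast row is the date the prediction refers to, not the date the index values were observed. Here the values observed on 01/04/2018 produce a forecast for 01/08/2018.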

Fnguyen
  • What I actually want to predict is the index values for 01/04/2018 and after. Can I do this? ``` time | indice 01/04/2018 | 0 01/05/2018 | 0 01/06/2018 | 0 01/07/2018 | 0 ``` – Nour Sep 26 '19 at 12:50
  • @Nour, as I pointed out in my example, it depends on how you built your model. Does it predict the label value for time X based on the indices at time X, or does it predict based on the lagged indices of time X-4? – Fnguyen Sep 26 '19 at 12:52
  • It predicts based on time X-4, and it does not predict labels but values between 0 and 1 (vegetation indices) – Nour Sep 26 '19 at 12:54
  • @Nour, in this case a data set like the one in my last example should be the way to go. Given a data set of time/index_1/... and no vegetation variable, predict should forecast the vegetation four months in advance. – Fnguyen Sep 26 '19 at 13:02
  • great thanks for your priceless help =) – Nour Sep 26 '19 at 13:06