Let's say that I would like to predict the temperature tomorrow. I could train a model on a time-series dataset collected from a single location (for example, see this excellent walk-through: https://blogs.rstudio.com/tensorflow/posts/2017-12-20-time-series-forecasting-with-recurrent-neural-networks/).

However, let's say that I want to train a model that incorporates time series from multiple weather recording sites, and that the observations from different sites do not overlap in time. Ideally, there would be a way to train models using non-consecutive observations (e.g. those collected from different sites) that also enables us to quantify the influence of 'site' on our prediction.

Is this simply a case where one would train independent LSTM models for each site? Or are there alternative approaches whereby the total training data set can come from multiple observation sets (e.g. sites, non-consecutive observation blocks, etc.)?

2 Answers

3

If you are looking to predict multiple time series (which would be similar in nature, since each weather station in the area would record similar temperatures, even if they are not identical), using a separate LSTM model for each may prove quite time-consuming.

One approach you could take is one suggested in an excellent answer for another question on Cross Validated. Essentially, the author is describing a means for forecasting sales with LSTM whereby the model is trained on a mini-batch (or subset) of one series, and then a new series is selected. In this case, I would understand this to mean that a subset of data is incorporated from weather station 1, then another batch from weather station 2, etc.

This would have the advantage of essentially creating a unified training set that takes the characteristics of all weather stations into account. It makes full use of the available data and lets the network learn patterns from all stations, not just one or a select few.
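As a rough illustration of how batches from several stations can be pooled without chaining the stations end to end, here is a minimal sketch in Python with NumPy. It is not the code from the linked answer; the function names `make_windows` and `build_training_set`, the site names and the toy temperatures are all hypothetical. The idea is to cut each station's series into fixed-length windows separately, so no window ever spans two stations, and only then stack the windows. A one-hot site indicator is attached to every time step as one possible way to let the model see which station a window came from.

```python
import numpy as np

def make_windows(series, lookback, horizon=1):
    """Slice ONE site's series into (input window, target) pairs."""
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback - horizon + 1
    if n <= 0:
        # Series too short to yield even one window.
        return np.empty((0, lookback)), np.empty((0,))
    X = np.stack([series[i:i + lookback] for i in range(n)])
    # Target is the value `horizon` steps after the end of each window.
    y = series[lookback + horizon - 1: lookback + horizon - 1 + n]
    return X, y

def build_training_set(site_series, lookback, horizon=1):
    """Window each site separately, then pool the windows.

    Returns features of shape (samples, lookback, 1 + n_sites):
    channel 0 is temperature, the remaining channels are a one-hot
    site indicator repeated at every time step.
    """
    site_names = sorted(site_series)
    X_parts, y_parts, id_parts = [], [], []
    for idx, name in enumerate(site_names):
        X, y = make_windows(site_series[name], lookback, horizon)
        X_parts.append(X)
        y_parts.append(y)
        id_parts.append(np.full(len(y), idx))
    X = np.concatenate(X_parts)
    y = np.concatenate(y_parts)
    site_idx = np.concatenate(id_parts)
    onehot = np.eye(len(site_names))[site_idx]
    onehot_seq = np.repeat(onehot[:, None, :], lookback, axis=1)
    features = np.concatenate([X[:, :, None], onehot_seq], axis=2)
    return features, y, site_idx

# Hypothetical toy data: two sites recorded at different times.
site_series = {
    "site_a": [1, 2, 3, 4, 5],
    "site_b": [10, 20, 30, 40, 50],
}
features, targets, site_idx = build_training_set(site_series, lookback=3)
```

Each row of `features` is an independent training sample, so the rows can be shuffled or batched in any order. As long as the recurrent layer does not carry its hidden state from one sample to the next (the default, non-stateful behaviour in Keras), the model never sees _b1_ as following _a3_.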

One option is to try this method on a subset of data from a few weather stations at first, and then compare its prediction accuracy with that of separate LSTM models trained for each station. If the pooled model's predictions are more accurate, then it would make sense to go with that.

If you are using R, a package like forecastML is another option for forecasting across multiple series, and is worth considering if you are not tied to an LSTM model.

Michael Grogan
  • Thank you, both links are helpful. My read on the first link/suggestion is the same as yours---though I am not sure how one would combine observation subsets (mini-batches) from multiple stations/agents without insinuating continuity for the model. That is, how would you combine _a1, a2, a3_ with _b1, b2, b3_ into some sort of composite training set without creating a sequence _a1, a2, a3, b1, b2, b3_ that suggests _b1_ follows _a1_? Maybe I'm just not thinking of this the correct way. – CharismaticChromoFauna Jan 30 '20 at 03:06
  • Regardless, the second link employing multi-step-ahead forecasting seems really promising, and includes an approach to code the different 'origins'/'groups' from which the sequence data are derived. That may end up being the 'winner', though I'll wait to see if folks have any additional suggestions! Thanks again! – CharismaticChromoFauna Jan 30 '20 at 03:31
0

You can forecast using spatio-temporal data by combining Graph Convolution Networks with LSTM models!

The idea comes from a paper by Zhao et al. called "T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction" (https://ieeexplore.ieee.org/document/8809901), and is implemented in the StellarGraph module in Python (https://stellargraph.readthedocs.io/en/stable/demos/time-series/gcn-lstm-time-series.html).

The StellarGraph link above takes you to a demo, which makes the implementation relatively straightforward.
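To give a feel for what the graph-convolution part does before the recurrent layer takes over, here is a minimal NumPy sketch of a single graph-convolution step. It does not use the StellarGraph API, and it is not taken from the paper's code. The three-site graph, the random weights and the function names `normalized_adjacency` and `graph_conv` are all assumptions made for illustration. It uses the common symmetric normalisation, with self-loops added to the adjacency matrix and each entry scaled by the degrees of the two sites it connects.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2} for adjacency matrix A."""
    A_tilde = A + np.eye(A.shape[0])   # add self-loops
    degree = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(degree))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def graph_conv(A_hat, X, W):
    """One graph-convolution step with ReLU: relu(A_hat @ X @ W)."""
    return np.maximum(A_hat @ X @ W, 0.0)

# Hypothetical graph: three sites in a line, 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
A_hat = normalized_adjacency(A)

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))  # 3 sites, 4 most recent readings each
W = rng.normal(size=(4, 2))  # illustrative (untrained) weights
H = graph_conv(A_hat, X, W)  # per-site features mixed with neighbours
```

Each row of `H` blends a site's own recent readings with those of its neighbours, and a sequence of such outputs is what a recurrent layer would then consume. Note that this setup assumes the sites report at the same time steps, which may not hold if the recordings do not overlap in time.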

Ben