I am trying to predict future stock market values using a gradient boosted tree model. As far as I know, gradient boosted trees use the data in one row, and only that row, to predict the target variable for that row.
Therefore, I am thinking that setting up the training dataset like this would not cause data leakage?
Would this count as roll-forward partitioning in some sense, because for each row, the last year's worth of historical values are provided?
