Does it cause data leakage to train a bidirectional LSTM on data where a user can be a sample in the training data multiple times?
Each row is a snapshot at a different point in time for a given user. Their past N months of behavior are the features and their current month of behavior is the target.
Example data: the "Months Prior" columns are the only features. Both the features and the target are continuous values.
+------------------+---------+--------------------+----------------+----------------+----------------+-----------------+----------------+------------------------+
| Train Test Split | User Id | Current Month Date | 5 Months Prior | 4 Months Prior | 3 Months Prior | 2 Months Prior | 1 Month Prior | Target (Current Month) |
+------------------+---------+--------------------+----------------+----------------+----------------+-----------------+----------------+------------------------+
| test | 123 | June | 1 | 4 | 2 | 8 | 2 | 6 |
| test | 123 | May | 0 | 1 | 4 | 2 | 8 | 2 |
| training | 123 | April | 0 | 0 | 1 | 4 | 2 | 8 |
| training | 123 | March | 0 | 0 | 0 | 1 | 4 | 2 |
| training | 123 | Feb | 0 | 0 | 0 | 0 | 1 | 4 |
+------------------+---------+--------------------+----------------+----------------+----------------+-----------------+----------------+------------------------+
Would the bidirectional LSTM learn that some columns in the training data contain the target for other rows?
Example: April's "2 Months Prior" and "1 Month Prior" values, 4 and 2, match the pattern of March's "1 Month Prior" value and Target, 4 and 2.
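The overlap can be made concrete with a small sketch (values are read straight off the table above; numpy's `sliding_window_view` is just one way to build such rows, not necessarily how the real pipeline does it):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Underlying monthly series for user 123, read off the table
# (Jan..June), zero-padded for months before the data starts.
series = np.array([1, 4, 2, 8, 2, 6])
padded = np.concatenate([np.zeros(5, dtype=int), series])

# Each row: [5 Months Prior, ..., 1 Month Prior, Target]
windows = sliding_window_view(padded, 6)

# March's "1 Month Prior" and Target (4 and 2) reappear as
# April's "2 Months Prior" and "1 Month Prior".
print(windows[2])  # March row: [0 0 0 1 4 2]
print(windows[3])  # April row: [0 0 1 4 2 8]
```

So every training target is also a feature of the later overlapping rows; the question is whether the model can exploit that.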
Intuitively I don't think it would learn these relationships; I believe the same holds for other machine learning models, like tree models and linear regression. But I don't have enough knowledge of LSTMs to say for sure. I could verify by creating simulated data with a random number generator, but I'd rather understand the math/intuition.
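For what it's worth, the simulated-data check can be sketched in a few lines. With pure i.i.d. noise as the target, any test-set skill would indicate leakage from the overlapping rows; ordinary least squares stands in for the LSTM here just to keep it fast, and the sizes and seed are arbitrary:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)
n_users, n_months, n_lags = 200, 24, 5

# Build overlapping lag windows per user, then split by time:
# the last 2 windows of each user go to test, the rest to train.
X_tr, y_tr, X_te, y_te = [], [], [], []
for _ in range(n_users):
    series = rng.normal(size=n_months)          # no real signal at all
    win = sliding_window_view(series, n_lags + 1)  # [lags..., target]
    X_tr.append(win[:-2, :-1]); y_tr.append(win[:-2, -1])
    X_te.append(win[-2:, :-1]); y_te.append(win[-2:, -1])
X_tr = np.vstack(X_tr); y_tr = np.concatenate(y_tr)
X_te = np.vstack(X_te); y_te = np.concatenate(y_te)

# Fit OLS on the training windows, score R^2 on the held-out months.
A_tr = np.c_[np.ones(len(X_tr)), X_tr]
coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
pred = np.c_[np.ones(len(X_te)), X_te] @ coef
r2 = 1 - np.sum((y_te - pred) ** 2) / np.sum((y_te - y_te.mean()) ** 2)
print(f"test R^2 = {r2:.3f}")  # near zero: overlap alone creates no test skill
```

If the overlapping rows themselves leaked information into the test set, the test R^2 on noise would come out meaningfully above zero; near-zero here is consistent with a clean time-based split, where training rows never contain future (test) targets.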