
I am scraping data from a crypto site and want to use a neural network to make predictions from it. This is how I save the data:

enter image description here

There are a bunch of other features as well, such as open/high/low/close for each coin. The scraper pulls data from the crypto site at fixed intervals and stores it as shown in the picture above.
I want to train an LSTP model on the data once I have gathered enough. My question is: should I store the data for each coin separately, or is it fine to keep everything in one file, the way I store it in the picture above?
Related to this, a broader question: can we train a single network for the whole crypto market? Or, if we save the data for each coin separately, should we also train a separate network for each coin?

  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Jun 20 '23 at 13:46

2 Answers

  • First of all, if you are trying to predict the price of a particular coin over the next hours, days or weeks, then you need to separate those columns. Each crypto has its own market cap, its own price range, and so on. I have never heard of an LSTP model, so I am guessing you mean an LSTM model (just a guess; there might be an LSTP that I don't know about).
  • Second, you should know that an LSTM is a sequential model and needs a lot of data. Even with enough data, it can still run into issues such as vanishing gradients on long sequences. In my opinion, though, an LSTM could be a good starting point for you.
  • About your last question, I suggest training a separate LSTM model for each crypto. Otherwise the model will likely score poorly, because it won't learn much from the pooled data; if you feed it the data for each crypto separately, the model can be more reliable.
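As a rough illustration of the per-coin approach above, here is a minimal sketch that splits one combined table into a separate series per coin and then cuts each series into the fixed-length windows a sequence model such as an LSTM is usually trained on. The row layout `(timestamp, coin, close)`, the sample values, and the helper names `split_by_coin` and `make_windows` are all assumptions made for this example, not the asker's actual format.

```python
from collections import defaultdict

# Hypothetical long-format rows: (timestamp, coin, close). Values are made up.
rows = [
    (1, "BTC", 100.0), (1, "ETH", 10.0),
    (2, "BTC", 102.0), (2, "ETH", 11.0),
    (3, "BTC", 101.0), (3, "ETH", 12.0),
    (4, "BTC", 105.0), (4, "ETH", 13.0),
]

def split_by_coin(rows):
    """Group close prices by coin, ordered by timestamp."""
    series = defaultdict(list)
    for _ts, coin, close in sorted(rows):
        series[coin].append(close)
    return dict(series)

def make_windows(values, window):
    """Return (inputs, targets): each input is `window` consecutive values,
    and each target is the value that immediately follows that input."""
    inputs, targets = [], []
    for i in range(len(values) - window):
        inputs.append(values[i:i + window])
        targets.append(values[i + window])
    return inputs, targets

per_coin = split_by_coin(rows)
btc_x, btc_y = make_windows(per_coin["BTC"], window=2)
```

Because the split happens at load time, the data can still live in one file with a coin column; the storage layout does not have to match the one-model-per-coin choice.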
Harshad Patil

Regarding the machine learning approach, I would suggest looking at the Kaggle G-Research competition: https://www.kaggle.com/competitions/g-research-crypto-forecasting

It will help with data engineering, target engineering and modelling.

Some takeaways:

  • Binance offers a good API that is helpful for getting prices.
  • The ML approach tries to estimate the return over a given period of time. It doesn't tell you other important things (how to select the time horizon, how to deal with fees, how to measure performance).
  • It will show you some feature engineering approaches (working with returns, Garman-Klass (GK) volatility estimation).
  • The general approach wasn't a time series model (LSTM) but feature engineering (using lagged features) with a tabular model (vanilla MLP / GBDT).
  • Regarding the discussion about using one model versus many, it wasn't entirely clear. Some used a general model, some used a general model plus a dummy variable for each coin, and some used a different model for each coin. In the end I remember people ensembling all those approaches. The data format might depend on which model you plan to use. If you go the tabular way, I would suggest one big file (or one big file per fold).
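To make the tabular route above more concrete, here is a minimal sketch of the ingredients mentioned: log returns, lagged-return features, a per-coin dummy variable, a forward-return target, and the per-bar Garman-Klass variance estimate. It is not code from the competition; the function names, the argument layout and the price values are assumptions chosen for illustration.

```python
import math

def log_returns(closes):
    """Log return between each pair of consecutive closes."""
    return [math.log(closes[i] / closes[i - 1]) for i in range(1, len(closes))]

def garman_klass_var(o, h, l, c):
    """Per-bar Garman-Klass variance estimate from open/high/low/close."""
    return (0.5 * math.log(h / l) ** 2
            - (2 * math.log(2) - 1) * math.log(c / o) ** 2)

def build_rows(coin, coins, closes, n_lags, horizon):
    """Build one tabular row per usable time step for a single coin.

    Features: the last `n_lags` log returns known at step t, followed by a
    one-hot dummy identifying the coin.
    Target: the log return from step t to step t + horizon.
    """
    rets = log_returns(closes)  # rets[j] spans closes[j] -> closes[j + 1]
    dummy = [1.0 if coin == name else 0.0 for name in coins]
    features, targets = [], []
    for t in range(n_lags, len(closes) - horizon):
        lags = rets[t - n_lags:t]  # only returns ending at or before step t
        features.append(lags + dummy)
        targets.append(math.log(closes[t + horizon] / closes[t]))
    return features, targets

# Hypothetical prices, made up for the example.
coins = ["BTC", "ETH"]
X, y = build_rows("ETH", coins, [100.0, 110.0, 121.0, 133.1], n_lags=2, horizon=1)
```

Rows built this way for every coin can be concatenated into one big table, which is what makes the single-file layout convenient for an MLP or GBDT; dropping the dummy columns and fitting per coin gives the separate-model variant.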

The problem with ML is that it won't answer very practical questions: how long do you hold? How do you balance a portfolio given the predictions? How do you account for fees?

Lucas Morin