Tips for scraping crypto data in the right way

Question

I am scraping data from a crypto site and want to use a neural network to make predictions from it. This is how I save the data:

There are a number of other features as well, such as open/high/low/close for each coin. The scraper polls a crypto site at fixed intervals and stores the results as in the picture above.
Once I have gathered enough data, I want to train an LSTP model on it. My question is: should I store the data for each coin separately, or is it fine to keep everything in one file, the way I store it in the picture above?
A broader, related question: can we train a single network for the whole crypto market? Or, if we store the data for each coin separately, should we also train a separate network for each coin?

Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. — Community, Jun 20 '23 at 13:46

score 1 · Answer 1 · answered Jun 21 '23 at 03:59

First of all, if you are trying to predict the price of a particular coin over the next hours, days, or weeks, then you need to separate those columns by coin. Each crypto has its own market cap, its own price range, and so on. I have never heard of an LSTP model, so I am guessing that you mean an LSTM model (just a guess; there might be an LSTP model that I don't know about).
Second, you should know that an LSTM is a sequential model and needs a lot of data. Even with enough data, it still has issues such as vanishing gradients. Still, in my opinion an LSTM could be a good starting point for you.
As for your last question, I suggest training a separate LSTM model for each crypto. Otherwise the model will likely score poorly, because it won't learn much from the pooled data; if you feed it the data for each crypto separately, the model can be more reliable.
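As a rough illustration of the per-coin approach, the sketch below keeps all coins in one long-format collection, splits it into one series per coin, and then builds sliding windows of the kind a sequence model such as an LSTM would consume. The field names `timestamp`, `coin`, and `close`, the sample values, and the window length are assumptions made for the example, not the asker's actual schema.

```python
from collections import defaultdict


def split_by_coin(rows):
    """Group long-format rows (one row per coin per timestamp) into per-coin close series."""
    series = defaultdict(list)
    # Sort by time first so each coin's series is in chronological order.
    for row in sorted(rows, key=lambda r: r["timestamp"]):
        series[row["coin"]].append(row["close"])
    return dict(series)


def make_windows(values, window):
    """Build (past `window` values, next value) pairs for a sequence model."""
    return [
        (values[i:i + window], values[i + window])
        for i in range(len(values) - window)
    ]


# Hypothetical rows, as they might sit in a single combined file.
rows = [
    {"timestamp": 2, "coin": "BTC", "close": 101.0},
    {"timestamp": 1, "coin": "BTC", "close": 100.0},
    {"timestamp": 1, "coin": "ETH", "close": 10.0},
    {"timestamp": 3, "coin": "BTC", "close": 103.0},
    {"timestamp": 2, "coin": "ETH", "close": 11.0},
    {"timestamp": 3, "coin": "ETH", "close": 12.0},
]

per_coin = split_by_coin(rows)
btc_pairs = make_windows(per_coin["BTC"], window=2)
```

Storing one combined file and splitting at load time like this keeps both options open: the same data can feed one model per coin or a single pooled model.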

Lucas Morin · Answer 2 · 2023-06-22T14:32:39.263

Regarding the machine learning approach, I would suggest looking at the Kaggle G-Research crypto forecasting competition: https://www.kaggle.com/competitions/g-research-crypto-forecasting

It will help with data engineering, target engineering, and modelling.

Some takeaways:

Binance offers a good API that is helpful for getting prices.
The ML approach tries to estimate the return over a given period of time. It doesn't answer other significant questions (how to select the time horizon, how to deal with fees, how to measure performance).
It will give you some feature engineering approaches (working with returns, GK volatility estimation).
The general approach wasn't time series models (LSTM) but feature engineering (using lagged features) and tabular models (vanilla MLP / GBDT).
Regarding the discussion about using one model versus many, it wasn't entirely clear. Some used a general model, some used a general model plus a dummy variable for each coin, and some used a different model for each coin. In the end, I remember people ensembling all of those approaches. The data format might depend on which model you plan to use. If you go the tabular way, I would suggest one big file (or one big file per fold).
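The feature engineering ideas listed above can be sketched in a few lines. This is a minimal illustration, assuming "GK" refers to the Garman-Klass volatility estimator and using its common single-bar form; the function names and the sample prices are hypothetical, and the resulting rows would be fed to whatever tabular model you choose.

```python
import math


def log_returns(closes):
    """Log return between consecutive closing prices."""
    return [math.log(closes[i] / closes[i - 1]) for i in range(1, len(closes))]


def garman_klass_var(open_, high, low, close):
    """Single-bar Garman-Klass variance estimate from OHLC prices."""
    return (0.5 * math.log(high / low) ** 2
            - (2 * math.log(2) - 1) * math.log(close / open_) ** 2)


def lagged_rows(values, n_lags):
    """Tabular rows: features are the previous n_lags values, target is the current one."""
    rows = []
    for t in range(n_lags, len(values)):
        features = [values[t - k] for k in range(1, n_lags + 1)]
        rows.append((features, values[t]))
    return rows


# Hypothetical closing prices for one coin.
closes = [100.0, 101.0, 103.0, 102.0, 104.0]
returns = log_returns(closes)
table = lagged_rows(returns, n_lags=2)
```

For a single general model across coins, one option mentioned above is to append a per-coin dummy (one-hot) column to each feature row before stacking all coins into one table.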

The problem with ML is that it won't answer very practical questions: how long do you hold? How do you balance a portfolio given the predictions? How do you account for fees?
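To make the fee question concrete, here is one simple way to see how fees eat into a predicted move. It is only a sketch under stated assumptions: a single long round trip, a proportional fee charged on both the buy and the sell, and a hypothetical fee rate; real exchange fee schedules differ.

```python
def net_return(entry_price, exit_price, fee_rate):
    """Return of a long round trip with a proportional fee on both legs."""
    cost = entry_price * (1 + fee_rate)      # fee paid on top of the purchase
    proceeds = exit_price * (1 - fee_rate)   # fee deducted from the sale
    return proceeds / cost - 1


# Hypothetical trade: buy at 100, sell at 101, 0.1% fee per leg.
gross = net_return(100.0, 101.0, fee_rate=0.0)
net = net_return(100.0, 101.0, fee_rate=0.001)
```

Comparing `gross` and `net` shows that a predicted move has to exceed the round-trip cost before a trade is worth taking, which is one reason the choice of time horizon matters.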
