I have thought of a regression technique that I want to try on several datasets. I would like these datasets to have the following properties:

  • Be a tabular dataset (no images).
  • Have at least 20k rows, and ideally around 100k.
  • Have some categorical variables with many levels (at least one variable with 100 or more levels).
  • Ideally, the target should have long tails.
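
To make these criteria concrete, here is a rough sketch of the check I have in mind, in plain Python. The function names, the list-of-dicts input format, and the use of excess kurtosis above 3.0 as a stand-in for "long tails" are my own assumptions for illustration, not a standard definition.

```python
from collections import defaultdict


def excess_kurtosis(values):
    """Population excess kurtosis; about 0 for normally distributed data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0


def check_dataset(rows, target, min_rows=20_000, min_levels=100, min_kurtosis=3.0):
    """Check a table (list of dicts) against the properties listed above.

    String-valued columns are treated as categorical (an assumption).
    """
    levels = defaultdict(set)
    for row in rows:
        for col, value in row.items():
            if col != target and isinstance(value, str):
                levels[col].add(value)
    max_levels = max((len(s) for s in levels.values()), default=0)
    kurt = excess_kurtosis([row[target] for row in rows])
    return {
        "enough_rows": len(rows) >= min_rows,
        "high_cardinality": max_levels >= min_levels,
        "long_tailed_target": kurt > min_kurtosis,  # hypothetical threshold
    }
```

A dataset would be a good fit if all three entries of the returned dictionary are true.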

Does anyone know of any public datasets with these properties? I have found that the Stack Overflow Developer Survey works for me, but I'd like to have a few more datasets with a similar structure.

David Masip
