In my project I have more than 900 features, and I thought of using the Recursive Feature Elimination (RFE) algorithm to reduce the dimensionality of my problem (in order to improve the accuracy).

But I can't figure out how to choose the RFE parameters (the estimator and the number of features to select).

Should I use model selection techniques in this case as well? Do you have any advice?

1 Answer

In practice, there are two options for choosing feature selection parameters:

  • Select the number of features and other parameters arbitrarily and/or based on external constraints. It's common to choose these parameters simply based on how much memory is available and how long the training process might take.
  • Run a full hyper-parameter tuning process: try many different values by training a model for each one, then evaluate each model on a validation set or using cross-validation. At the end of the process, the parameters which achieved the highest performance are picked, and the final model is re-trained and then evaluated on a different (fresh) test set.

Needless to say, the second option requires a lot more time and/or computing power. Keep in mind that as soon as multiple trials are made with different parameters, in theory this requires a fresh test set for the final evaluation.
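As a rough illustration of the second option, here is a minimal sketch assuming scikit-learn. The synthetic dataset, the `LogisticRegression` estimator, the `step` value and the candidate values for `n_features_to_select` are all placeholders chosen for the example, not recommendations for your data. The test split is set aside before tuning so that it stays fresh for the final evaluation.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the real data (assumption): 300 samples, 50 features.
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=8, random_state=0
)

# Hold out a fresh test set that the tuning process never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# RFE sits inside the pipeline so that feature selection is redone
# on the training part of every cross-validation fold.
pipeline = Pipeline([
    ("rfe", RFE(estimator=LogisticRegression(max_iter=1000), step=5)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Candidate numbers of features to keep (illustrative values only).
param_grid = {"rfe__n_features_to_select": [5, 10, 20, 40]}

# 5-fold cross-validation on the training data for every candidate;
# the best pipeline is then re-trained on the whole training set.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_n = search.best_params_["rfe__n_features_to_select"]
test_accuracy = search.score(X_test, y_test)  # final, one-off evaluation
print(best_n, round(test_accuracy, 3))
```

The estimator used inside RFE could be tuned the same way by adding an `rfe__estimator` entry to the grid, at the cost of multiplying the number of models to train. With more than 900 features, a larger `step` reduces the number of fits per candidate.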

Erwan