I am new to machine learning and have just learned about feature selection. In my project, I have a dataset in which the majority class makes up 89% of the samples and the minority class 11%. The dataset has 24 features. I chose Recursive Feature Elimination with Cross-Validation (RFECV in the scikit-learn package) to find the optimal number of features. Because the dataset is imbalanced, I set the 'scoring' parameter to 'f1'. The estimator I used is a Random Forest classifier. After fitting the data, RFECV selected around 12 features with an F1 score of 0.94.
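For reference, here is a minimal sketch of the setup described above. It is not my actual code: the data is a synthetic stand-in generated with `make_classification`, and the sample count, forest size, number of folds, and random seeds are assumptions chosen only to keep the example small. The explicit `StratifiedKFold` is also an assumption, added so that each fold keeps roughly the same 89/11 class ratio.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the real data (assumption): 24 features,
# roughly 89% majority class (label 0) and 11% minority class (label 1).
X, y = make_classification(
    n_samples=500,
    n_features=24,
    n_informative=8,
    weights=[0.89, 0.11],
    random_state=0,
)

# Stratified folds preserve the class ratio in every train/test split.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# RFECV drops the least important feature at each step and scores each
# candidate feature count by cross-validated F1 on the positive class (label 1).
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=25, random_state=0),
    cv=cv,
    scoring="f1",
)
selector.fit(X, y)

print("Number of selected features:", selector.n_features_)
print("Selected feature mask:", selector.support_)
```

With `scoring="f1"` on binary labels, the F1 score is computed for the class labelled 1, so in this sketch it reflects performance on the minority class.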

Is using RFECV appropriate for imbalanced datasets?

Ethan
laguna
