What is the best way to categorize the approaches that have been developed to deal with the class imbalance problem?
The first article categorizes them into:
- Preprocessing: includes oversampling, undersampling and hybrid methods,
- Cost-sensitive learning: includes direct methods and meta-learning; the latter is further divided into thresholding and sampling,
- Ensemble techniques: includes cost-sensitive ensembles and data preprocessing in conjunction with ensemble learning.
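To make the preprocessing category concrete, here is a minimal sketch of how I understand plain random oversampling and undersampling. It uses only the Python standard library; the function names and the toy data are my own, not taken from the article, and real methods in this category (for example synthetic oversampling) are more involved.

```python
import random
from collections import Counter


def group_by_class(X, y):
    """Collect the rows of X under their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    groups = group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to match the smallest one."""
    rng = random.Random(seed)
    groups = group_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        X_out.extend(rng.sample(rows, target))
        y_out.extend([label] * target)
    return X_out, y_out


# Toy data (hypothetical): 8 majority rows, 2 minority rows.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
print(Counter(y_over))
print(Counter(y_under))
```

With this toy data, oversampling ends with 8 rows per class and undersampling with 2 rows per class. A hybrid method would, as far as I understand, combine both steps.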
The second classification:
- Data Pre-processing: includes distribution change and weighting the data space. One-class learning is treated as a form of distribution change.
- Special-purpose Learning Methods
- Prediction Post-processing: includes the threshold method and cost-sensitive post-processing
- Hybrid Methods
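As an illustration of what I take prediction post-processing to mean, here is a minimal sketch of the threshold method for a binary problem. The classifier is left untouched and only the decision threshold on its predicted probabilities is moved. The threshold formula assumes zero cost for correct predictions and well-calibrated probabilities; the cost values and probabilities are made up for the example.

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Threshold that minimizes expected cost when correct predictions cost nothing.

    Predicting positive is cheaper when (1 - p) * cost_fp <= p * cost_fn,
    which rearranges to p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_with_threshold(probs, threshold):
    """Label an example positive (1) when its predicted probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


# Hypothetical predicted probabilities of the minority (positive) class.
probs = [0.05, 0.25, 0.35, 0.6]

# Assumed costs: missing a minority example is four times worse than a false alarm.
threshold = cost_sensitive_threshold(cost_fp=1.0, cost_fn=4.0)

default_preds = predict_with_threshold(probs, 0.5)
adjusted_preds = predict_with_threshold(probs, threshold)
print(threshold, default_preds, adjusted_preds)
```

Lowering the threshold from 0.5 to 0.2 turns two more examples into positive predictions without retraining anything, which is why this looks like a separate category from the data-level methods.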
The third article:
- Data-level methods
- Algorithm-level methods
- Hybrid methods
The third classification also treats output adjustment as an independent approach.
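For contrast with the data-level methods, here is a minimal sketch of what I understand by an algorithm-level method: the data stay as they are and the loss function is reweighted per class. The weights use the inverse-frequency heuristic n / (k * n_c), where n is the number of samples, k the number of classes and n_c the size of class c. The function names and numbers are illustrative assumptions, not something taken from the articles.

```python
import math
from collections import Counter


def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}


def weighted_log_loss(y_true, p_pos, weights):
    """Binary log loss where each example is weighted by its class weight."""
    total, weight_sum = 0.0, 0.0
    for label, p in zip(y_true, p_pos):
        w = weights[label]
        p_label = p if label == 1 else 1.0 - p  # probability given to the true class
        total += -w * math.log(p_label)
        weight_sum += w
    return total / weight_sum


# Toy labels (hypothetical): 8 majority examples, 2 minority examples.
y = [0] * 8 + [1] * 2
weights = balanced_class_weights(y)

# A model that always outputs a low minority probability of 0.1.
p_pos = [0.1] * 10
plain = weighted_log_loss(y, p_pos, {0: 1.0, 1: 1.0})
weighted = weighted_log_loss(y, p_pos, weights)
print(weights, plain, weighted)
```

The weighted loss penalizes the always-majority model much more heavily than the unweighted one, so the learner itself is pushed toward the minority class rather than the data or the outputs being changed.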
Thanks in advance.