I have looked at the OOB (out-of-bag) score on my data and would like to use it as the metric in a grid search on a Random Forest classifier (multiclass), but it doesn't seem to be a recognised scorer for the scoring parameter. I do have oob_score=True set in the classifier.
I am currently using scoring='accuracy' but would like to change to the OOB score.
Ideas or comments welcome.
-
See also https://datascience.stackexchange.com/a/66238/55122 – Ben Reiniger Oct 02 '21 at 13:24
3 Answers
Check out the make_scorer function in sklearn: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.make_scorer.html
You may need to code the OOB component yourself; I haven't used that metric for scoring with sklearn before, so I don't know for sure.
Also, consider using random search rather than grid search: it is more practical when dealing with a large hyperparameter space, since it does not need to exhaustively search the dimensions of parameters that have no impact.
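Here is a minimal sketch of one way this could be wired up (the dataset, parameter grid and scorer name below are my own assumptions for illustration, not from the question or the answer). Rather than make_scorer, which wraps a metric of (y_true, y_pred) and never sees the fitted model, it passes a plain callable with the (estimator, X, y) signature to scoring, so the fitted forest's oob_score_ attribute can be returned directly:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    def oob_scorer(estimator, X, y):
        # Ignore the held-out fold; report the out-of-bag accuracy that the
        # forest computed on its own training data while fitting.
        return estimator.oob_score_

    # Toy multiclass data, purely for illustration.
    X, y = make_classification(n_samples=500, n_classes=3, n_informative=5,
                               random_state=0)

    search = GridSearchCV(
        RandomForestClassifier(oob_score=True, random_state=0),
        param_grid={"n_estimators": [100, 300], "max_depth": [None, 5]},
        scoring=oob_scorer,  # any (estimator, X, y) -> float callable works here
        cv=3,
    )
    search.fit(X, y)
    print(search.best_params_, search.best_score_)

The same callable can be passed to RandomizedSearchCV if you take up the random-search suggestion.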
-
Thanks for your response Jinglesting. I did try make_scorer, unsuccessfully, but will now try to code the OOB metric. – RDATA Aug 28 '18 at 14:34
-
No problem RDATA, if this answer is correct for you, please don't forget to mark it as such! – Jinglesting Aug 28 '18 at 15:18
Sorry - I know this is an old thread, but you can try calling model.oob_score_ after running your model fit. It should give the OOB accuracy.
I hope I'm phrasing this correctly.
You can call it as an attribute of the best estimator, i.e.
grid_model.best_estimator_.oob_score_
You can't really use grid_model.oob_score_, because GridSearchCV does not have such an attribute; but you can get the best instance of the random forest model and then read the attribute from it.
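A minimal sketch of what that looks like end to end (the data, parameter grid and the grid_model name are assumptions for illustration, not from the answer). With the default refit=True, GridSearchCV refits the best configuration on the full training data, so its oob_score_ is available afterwards:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    X, y = make_classification(n_samples=500, n_classes=3, n_informative=5,
                               random_state=0)

    grid_model = GridSearchCV(
        RandomForestClassifier(oob_score=True, random_state=0),
        param_grid={"n_estimators": [100, 300]},
        scoring="accuracy",  # tuning still driven by cross-validated accuracy
        cv=3,
    )
    grid_model.fit(X, y)

    # GridSearchCV itself has no oob_score_; read it from the refit best estimator.
    print(grid_model.best_estimator_.oob_score_)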