I was reading scikit-learn documentation for Extremely Randomized Trees and I found this warning:
Warning: Extra-trees should only be used within ensemble methods.
Why is that?
In a random forest, each tree considers only a random subset of features at each split. Extra-trees take this a step further: instead of searching for the best threshold on each candidate feature, they draw a threshold at random for each one and keep the best of those random candidates. The idea is that a forest (an ensemble of trees) with many trees that have each learned something different (because of the randomness in the features and thresholds at each split) will predict better than any single tree, since averaging cancels out much of the individual trees' error.
If you used random thresholds at each split in a single tree (which is what an extra-tree does), you would likely end up with a suboptimal tree that predicts poorly, because there is no ensemble to average out the noise introduced by those random choices.
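To see this for yourself, here is a minimal sketch that compares a single extra-tree, a single ordinary decision tree, and an extra-trees ensemble. The synthetic dataset, the 5-fold cross-validation, and the parameter values (`n_samples`, `n_estimators`, `random_state`, and so on) are my own assumptions chosen for illustration; they are not from the documentation, and the exact scores will depend on your data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier, ExtraTreeClassifier

# Synthetic data, assumed here purely for illustration.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=0
)

models = {
    # One tree whose split thresholds are drawn at random.
    "single extra-tree": ExtraTreeClassifier(random_state=0),
    # One tree that searches for the best threshold at each split.
    "single decision tree": DecisionTreeClassifier(random_state=0),
    # Many randomized trees whose predictions are averaged.
    "extra-trees ensemble": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

On data like this I would expect the ensemble to score clearly higher than the single extra-tree, which is the point of the warning, but treat that as something to check on your own data rather than a guarantee.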