I'm studying the performance of an AdaBoost model and wondering how it depends on the depth of the trees.

Here's the accuracy for the model with a depth of 1

[image: accuracy plot, depth 1]

and here with a depth of 3

[image: accuracy plot, depth 3]

To me, the lower plot looks better, but I suspect the upper one is actually better, as the training accuracy doesn't vanish (overfitting?). The question and answer in "Hyperparameter tunning for Random Forest- choose the best max depth" seem to support my assumption, though.

Ben

1 Answer

The training error shouldn't be too far from the test error; otherwise you are in a high-variance scenario, and the model may turn out to be overfitted once it is in production.

However, a higher variance can be normal when you increase the depth, although it shouldn't happen if you have enough data.

Consequently, if you don't have a lot of data, a depth of 1 seems better, and you should increase the number of training iterations to lower the error.
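
To see whether more iterations actually help at depth 1, you could track the accuracy after each boosting round. This is only a minimal sketch, assuming scikit-learn: the synthetic dataset, the 70/30 split and `n_estimators=200` are placeholders I made up, not your setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption); substitute your own X and y.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Depth-1 trees (stumps) as weak learners. The base estimator is passed
# positionally because its keyword name differs between scikit-learn versions.
model = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1), n_estimators=200, random_state=0
)
model.fit(X_train, y_train)

# staged_score yields the accuracy after each boosting iteration.
train_curve = list(model.staged_score(X_train, y_train))
test_curve = list(model.staged_score(X_test, y_test))

# Boosting can stop early, so only report iterations that were reached.
for n in (10, 50, 100, 200):
    if n <= len(test_curve):
        print(f"{n:3d} iterations: train={train_curve[n - 1]:.3f} "
              f"test={test_curve[n - 1]:.3f}")
```

If the test curve is still rising at the last iteration, more iterations are probably worth trying; if it has flattened while the training curve keeps climbing, they are not.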

In addition, there is only a small difference in test results between a depth of 1 and a depth of 3. So the small benefit of a depth of 3 isn't worth the risk of ending up in a high-variance scenario. But maybe a max depth of 2 is better than 1...
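
One way to check this on your own data is to compare the train/test gap for each depth with cross-validation. Again, this is just a sketch assuming scikit-learn; the synthetic data, `n_estimators=100` and 5 folds are arbitrary choices of mine.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption); substitute your own X and y.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, random_state=0
)

results = {}
for depth in (1, 2, 3):
    # The base estimator is passed positionally because its keyword name
    # differs between scikit-learn versions.
    model = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=depth),
        n_estimators=100,
        random_state=0,
    )
    cv = cross_validate(
        model, X, y, cv=5, scoring="accuracy", return_train_score=True
    )
    train_acc = cv["train_score"].mean()
    test_acc = cv["test_score"].mean()
    # A large gap between train and test accuracy points to overfitting.
    results[depth] = (train_acc, test_acc, train_acc - test_acc)
    print(f"depth={depth}: train={train_acc:.3f} test={test_acc:.3f} "
          f"gap={train_acc - test_acc:.3f}")
```

A deeper tree is only worth it if its test accuracy is clearly higher and the gap does not grow much.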

Nicolas Martin