
Extreme Gradient Boosting stops growing a tree when $\gamma$ exceeds the impurity reduction of the best candidate split, i.e. when the loss reduction of eq. (7) (see below) is negative. What happens if even the tree's root has negative impurity reduction? I think there is no way for boosting to go on, because the next trees would depend on the tree that would have grown from this removed root.

$$\mathcal{L}_{\text{split}} = \frac{1}{2}\left[\frac{\left(\sum_{i\in I_L} g_i\right)^2}{\sum_{i\in I_L} h_i + \lambda} + \frac{\left(\sum_{i\in I_R} g_i\right)^2}{\sum_{i\in I_R} h_i + \lambda} - \frac{\left(\sum_{i\in I} g_i\right)^2}{\sum_{i\in I} h_i + \lambda}\right] - \gamma \tag{7}$$
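(A made-up numeric instance of the criterion: if the bracketed gain term evaluates to $0.8$ for the root's best split while $\gamma = 1$, then $\mathcal{L}_{\text{split}} = 0.8 - 1 = -0.2 < 0$, so even that split is rejected.)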

1 Answer


You'll be left with one-node trees. The loss reduction of a split is penalized by $\gamma$, but the root itself does not get pruned. This is fairly easy to test:

import xgboost as xgb
import numpy as np
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; requires an older version

X, y = load_boston(return_X_y=True)
model = xgb.XGBRegressor(gamma=1e12)  # outrageously large gamma

model.fit(X, y)

# model makes a single prediction for everything:
print(np.unique(model.predict(X)))
# out: [22.532211]
print(y.mean())
# out: 22.532806324110677

# Check out the trees more directly:
model.get_booster().trees_to_dataframe()
# out: a dataframe with one row per tree: just the root node, which is a leaf
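To make the "single leaf per tree" claim explicit, a small follow-up sketch (assuming the standard trees_to_dataframe output, where leaf rows have Feature == "Leaf"):

df = model.get_booster().trees_to_dataframe()
assert (df["Feature"] == "Leaf").all()  # every node is a leaf
assert len(df) == model.n_estimators    # exactly one node per boosting round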

(The slight discrepancy between y.mean() and the single prediction is from the single-leaf trees being shrunk by the learning rate; we are converging to the average.)
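As a back-of-the-envelope check on that convergence, here is a sketch of the shrinkage arithmetic, assuming this run used the older sklearn-wrapper defaults learning_rate=0.1, n_estimators=100, reg_lambda=1, and base_score=0.5 (none of which are shown above). For squared error, each single-leaf tree's optimal weight is the residual sum divided by $n + \lambda$, and the update is scaled by the learning rate:

# A hypothetical reconstruction of the shrinkage arithmetic, not a call into xgboost:
n, lam, eta, rounds = 506, 1.0, 0.1, 100  # Boston rows; assumed reg_lambda, learning_rate, n_estimators
pred = 0.5                                # assumed default base_score
target_mean = 22.532806                   # y.mean() from above
for _ in range(rounds):
    leaf = (target_mean - pred) * n / (n + lam)  # optimal single-leaf weight for squared error
    pred += eta * leaf                           # shrunk by the learning rate
print(pred)
# out: ~22.53221, matching the model's single prediction

Each round shrinks the remaining residual by a constant factor, so the prediction approaches the mean geometrically but never lands on it exactly.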

Ben Reiniger
  • You're right: the split criterion can only prevent the root from getting child nodes; since the root is the set of all data observations, there is no criterion that could remove the root itself. I'd gotten confused because the StatQuest lesson prunes the root. – Davi Américo Jun 26 '21 at 00:19