If I train my model using the following code:

import xgboost as xg

params = {'max_depth': 3,
          'min_child_weight': 10,
          'learning_rate': 0.3,
          'subsample': 0.5,
          'colsample_bytree': 0.6,
          'objective': 'reg:linear',
          'n_estimators': 1000,
          'eta': 0.3}

features = df[feature_columns]
target = df[target_columns]
dmatrix = xg.DMatrix(features.values,
                     target.values,
                     feature_names=features.columns.values)
clf = xg.train(params, dmatrix)

it finishes in about 1 minute.

If I train my model using the scikit-learn wrapper:

import xgboost as xg
max_depth = 3
min_child_weight = 10
subsample = 0.5
colsample_bytree = 0.6
objective = 'reg:linear'
num_estimators = 1000
learning_rate = 0.3

features = df[feature_columns]
target = df[target_columns]
clf = xg.XGBRegressor(max_depth=max_depth,
                      min_child_weight=min_child_weight,
                      subsample=subsample,
                      colsample_bytree=colsample_bytree,
                      objective=objective,
                      n_estimators=num_estimators,
                      learning_rate=learning_rate)
clf.fit(features, target)

it takes over 30 minutes.

I would have thought the underlying code is nearly identical (i.e. XGBRegressor calls xg.train under the hood) - what's going on here?

user1566200

1 Answer


xgboost.train ignores the parameter n_estimators, while xgboost.XGBRegressor accepts it. In xgboost.train, the number of boosting iterations (i.e. n_estimators) is controlled by num_boost_round, which defaults to 10.

In your case, the first snippet runs only 10 boosting iterations (the default), while the second runs 1000. The timing gap should largely disappear if you change clf = xg.train(params, dmatrix) to clf = xg.train(params, dmatrix, 1000).
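As a minimal sketch (reusing the params and dmatrix from the question), passing num_boost_round by keyword makes the intent explicit:

import xgboost as xg

# num_boost_round is the native-API counterpart of the sklearn
# wrapper's n_estimators; setting it to 1000 makes xg.train run
# the same number of rounds as XGBRegressor(n_estimators=1000).
clf = xg.train(params, dmatrix, num_boost_round=1000)

# Sanity check: for a single-output regression objective there is
# one tree per boosting round, so the Booster dump should contain
# about 1000 trees instead of the default 10.
print(len(clf.get_dump()))

With both configured for 1000 rounds, the two APIs should take roughly the same time, since the scikit-learn wrapper ultimately calls xgboost.train internally.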

References

http://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.train

http://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.XGBRegressor

Icyblade