
I am training an XGBoost model, xgbr, using xgb.XGBRegressor() with 13 features and one numeric target. The R2 on the test set is 0.935, which is good. I check the feature importance with:

for col,score in zip(X_train.columns,xgbr.feature_importances_):
    print(col,score)

When I check the importance type with xgbr.importance_type, the result is gain.

I have a feature, x1, whose importance is only 0.0068, which is quite low. x1 is a categorical feature with a cardinality of 5122, and I apply LabelEncoder to it before training the model.
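The encoding step, as a minimal sketch (the category values here are made up for illustration):

```python
from sklearn.preprocessing import LabelEncoder

# x1 is a high-cardinality categorical column (5122 distinct values in my
# data); LabelEncoder maps each distinct value to an integer code.
le = LabelEncoder()
codes = le.fit_transform(["cat_a", "cat_b", "cat_a", "cat_c"])
print(codes)             # one integer code per row
print(len(le.classes_))  # number of distinct categories seen
```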

I remove this feature from the training set and retrain the model with the same hyperparameters and the same train-test split. The R2 takes a big hit, falling to 0.885.

Why does a seemingly unimportant feature have such a big impact?

volkan g

1 Answer


Without more context, my main guess is that this is an effect not of the feature but of the metric you are using. Remember that $R^2$ is nondecreasing in the number of predictors, so it will be greater than or equal as you add more predictors.

I recommend using another metric, such as mean squared error, and repeating the model evaluation.
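As a small illustration (with made-up numbers, not your data), both metrics can be computed on the same test predictions:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical test targets and model predictions.
y_test = np.array([3.0, 5.0, 7.0, 9.0])
pred = np.array([2.8, 5.1, 6.7, 9.4])

# MSE measures the average squared error directly; R^2 rescales it
# against the variance of y_test.
print("MSE:", mean_squared_error(y_test, pred))
print("R2 :", r2_score(y_test, pred))
```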

For reference check: https://stats.stackexchange.com/questions/133089/why-does-adding-more-terms-into-a-linear-model-always-increase-the-r-squared-val

Multivac
    The question specifies "R2 on the test set", which doesn't need to be monotonic. And R2 is just a scaled MSE, so switching to MSE won't change the comparison. – Ben Reiniger Jan 15 '22 at 05:12