Questions tagged [catboost]

36 questions
22
votes
1 answer

LightGBM vs XGBoost vs CatBoost

I've seen that in Kaggle competitions people are using LightGBM where they used to use XGBoost. My question is: when would you rather use XGBoost instead of LightGBM? What about CatBoost?
David Masip
  • 5,981
  • 2
  • 23
  • 61
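All three libraries expose a near-identical scikit-learn-style interface, so the question above is mostly about data characteristics (categorical features, dataset size, training speed) rather than code. A minimal sketch on synthetic data, assuming all three packages are installed:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier
    from lightgbm import LGBMClassifier
    from catboost import CatBoostClassifier

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # The fit/score calls are interchangeable; only the defaults differ.
    for model in (XGBClassifier(), LGBMClassifier(), CatBoostClassifier(verbose=0)):
        model.fit(X_tr, y_tr)
        print(type(model).__name__, model.score(X_te, y_te))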
6
votes
1 answer

How to obtain SHAP values for a CatBoost model in R?

I'm asked to create a SHAP analysis in R but I cannot find how to obtain it for a CatBoost model. I can get the SHAP values of an XGBoost model with shap_values <- shap.values(xgb_model = model, X_train = train_X), but not for CatBoost. Here is…
user100740
  • 91
  • 2
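In the Python API, CatBoost computes SHAP values through get_feature_importance with type="ShapValues" (the R package is reported to expose an analogous catboost.get_feature_importance). A minimal Python sketch on synthetic data:

    import numpy as np
    from catboost import CatBoostClassifier, Pool

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    model = CatBoostClassifier(iterations=50, verbose=0).fit(X, y)

    # Per-sample SHAP values; the last column is the expected (base) value.
    shap = model.get_feature_importance(Pool(X, y), type="ShapValues")
    print(shap.shape)  # (n_samples, n_features + 1)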
4
votes
1 answer

Are linear models better when dealing with too many features? If so, why?

I had to build a classification model to predict what the user rating would be from his/her review. (I was dealing with this dataset: Trip Advisor Hotel Reviews.) After some preprocessing, I compared the results of a Logistic…
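A hedged sketch of the comparison being asked about, with the 20 newsgroups corpus standing in for the Trip Advisor reviews: on high-dimensional sparse TF-IDF features, a plain linear model is often competitive with boosting.

    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Stand-in text data; the original question used Trip Advisor reviews.
    data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])
    X = TfidfVectorizer(max_features=20000).fit_transform(data.data)  # sparse, wide
    print(cross_val_score(LogisticRegression(max_iter=1000), X, data.target).mean())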
3
votes
0 answers

What is the concept behind the categorical-encoding used in the CatBoost benchmark problems?

I'm working through CatBoost quality benchmark problems (here). I'm particularly intrigued by the methodology adopted to convert categorical features to numerical values as described in the comparison_description.pdf (here). What is the reasoning…
PPR
  • 171
  • 1
  • 5
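The methodology in question is ordered target statistics: each row's category is encoded using only the target values of rows that precede it in a random permutation, plus a prior, so a row's own label never leaks into its encoding. A simplified sketch (CatBoost itself averages over several permutations; this uses one, with a prior weight of 1):

    import numpy as np

    def ordered_target_stats(cats, target, prior=0.5, seed=0):
        # Walk the rows in a random "history" order; encode each category
        # from the target sums/counts accumulated before the current row.
        perm = np.random.default_rng(seed).permutation(len(cats))
        sums, counts = {}, {}
        encoded = np.empty(len(cats))
        for i in perm:
            c = cats[i]
            encoded[i] = (sums.get(c, 0.0) + prior) / (counts.get(c, 0) + 1)
            sums[c] = sums.get(c, 0.0) + target[i]
            counts[c] = counts.get(c, 0) + 1
        return encoded

    cats = np.array(["a", "b", "a", "a", "b"])
    y = np.array([1, 0, 1, 0, 1])
    print(ordered_target_stats(cats, y))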
2
votes
1 answer

CatBoost multiclass classification evaluation metric: Kappa & WKappa

I am working on an unbalanced classification problem and I want to use Kappa as my evaluation metric. Considering the classifier accepts weights (which I have given it), should I still be using weighted kappa, or just use the standard kappa? I am not…
Musa
  • 31
  • 2
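Whichever way that question is answered, both metrics are available in CatBoost as eval_metric strings, independently of any class weights. A toy sketch with hypothetical weights for three classes:

    import numpy as np
    from catboost import CatBoostClassifier

    X = np.random.default_rng(0).normal(size=(90, 4))
    y = np.repeat([0, 1, 2], 30)          # 3-class toy target

    model = CatBoostClassifier(
        iterations=100,
        eval_metric="WKappa",             # or "Kappa"
        class_weights=[1.0, 2.0, 5.0],    # hypothetical per-class weights
        verbose=0,
    ).fit(X, y)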
2
votes
0 answers

CatBoost: Categorical Feature Encoding

I would like to understand all the methods available in Catboost for encoding categorical features. Unfortunately, the published articles by Yandex ("CatBoost: gradient boosting with categorical features support" and "CatBoost: unbiased boosting…
calpyte
  • 121
  • 2
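Beyond the papers, the encodings are steered by training parameters: one_hot_max_size switches low-cardinality features to one-hot, while simple_ctr and combinations_ctr select the target-statistic ("CTR") types, with documented values such as Borders, Buckets, BinarizedTargetMeanValue, and Counter. A hedged sketch (toy data, illustrative settings):

    import pandas as pd
    from catboost import CatBoostClassifier

    df = pd.DataFrame({"city": list("abcabcab") * 5, "x": range(40)})
    y = [0, 1] * 20

    model = CatBoostClassifier(
        iterations=50,
        one_hot_max_size=2,                  # one-hot only if <= 2 distinct values
        simple_ctr=["Borders", "Counter"],   # CTR types for single features
        combinations_ctr=["Counter"],        # CTR types for feature combinations
        verbose=0,
    ).fit(df, y, cat_features=["city"])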
2
votes
1 answer

How do we target-encode categorical features in multi-class classification problems?

Say I have a multiclass problem with a dataset like this:

    user_id | price | target
    --------+-------+--------
          1 |    30 | apple
          1 |    20 | samsung
          2 |    32 | samsung
          2 |    40 | huawei
        ... |   ... | ...

where I have a lot of users, i.e. One Hot…
CutePoison
  • 450
  • 2
  • 8
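One common answer is per-class target encoding: with K classes, each categorical value gets K encoded columns holding the share of each class within that category. A naive in-sample sketch on the data above:

    import pandas as pd

    df = pd.DataFrame({
        "user_id": [1, 1, 2, 2],
        "price":   [30, 20, 32, 40],
        "target":  ["apple", "samsung", "samsung", "huawei"],
    })

    onehot = pd.get_dummies(df["target"])                  # one column per class
    enc = onehot.groupby(df["user_id"]).transform("mean")  # class share per user
    enc.columns = [f"user_id_enc_{c}" for c in enc.columns]
    print(df.join(enc))

In practice the shares would be computed out-of-fold (or with CatBoost-style ordered statistics) to avoid target leakage.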
2
votes
1 answer

How to tell CatBoost which feature is categorical?

I am excited to learn that CatBoost can handle categorical features by itself. One of my features, Department ID, is categorical. However, it looks numeric, since its values are like 1001, 1002, ..., 1218. Those numbers are just IDs of the…
Fred Chang
  • 85
  • 1
  • 6
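A minimal sketch: listing the column name (or index) in cat_features makes CatBoost treat the numeric-looking IDs as categories rather than magnitudes.

    import pandas as pd
    from catboost import CatBoostClassifier

    df = pd.DataFrame({
        "dept_id": [1001, 1002, 1218, 1001, 1002, 1218] * 5,  # IDs, not amounts
        "hours":   [40, 35, 20, 45, 38, 22] * 5,
    })
    y = [0, 1, 1, 0, 1, 1] * 5

    model = CatBoostClassifier(iterations=50, verbose=0)
    model.fit(df, y, cat_features=["dept_id"])  # dept_id handled as categorical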
1
vote
1 answer

Does Gradient Boosting perform n-ary splits where n > 2?

I wonder whether algorithms such as GBM, XGBoost, CatBoost, and LightGBM perform more than two splits at a node in the decision trees? Can a node be split into 3 or more branches instead of merely binary splits? Can more than one feature be used in…
1
vote
0 answers

Feature Selection before modeling with Boosting Trees

I have read in some papers that the subset of features chosen for a boosting tree algorithm makes a big difference to performance, so I've been trying RFE, Boruta, variable clustering, correlation, WOE & IV, and Chi-square. Let's say I have a…
Mamoud
  • 11
  • 2
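A sketch of one option from that list, scikit-learn's RFE wrapped around a CatBoost estimator (synthetic data, arbitrary step and target sizes):

    from sklearn.datasets import make_classification
    from sklearn.feature_selection import RFE
    from catboost import CatBoostClassifier

    X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                               random_state=0)

    # RFE drops the 5 least important features per round until 10 remain.
    selector = RFE(CatBoostClassifier(iterations=100, verbose=0),
                   n_features_to_select=10, step=5).fit(X, y)
    print(selector.support_)  # boolean mask of the retained features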
1
vote
2 answers

Does gradient boosting error always decrease faster and reach lower values on training data?

I am building another XGBoost model and I'm really trying not to overfit the data. I split my data into train and test sets and fit the model with early stopping based on the test-set error, which results in the following loss plot: I'd say this is…
Xaume
  • 182
  • 2
  • 11
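A sketch of the setup being described, assuming a recent xgboost version (argument placement has moved between the constructor and fit across releases): train with early stopping on a held-out set, then compare the train and test loss curves.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    X, y = make_classification(n_samples=2000, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    model = XGBClassifier(n_estimators=500, early_stopping_rounds=20,
                          eval_metric="logloss")
    model.fit(X_tr, y_tr, eval_set=[(X_tr, y_tr), (X_te, y_te)], verbose=False)

    history = model.evals_result()  # validation_0 = train, validation_1 = test
    print(model.best_iteration, history["validation_1"]["logloss"][-1])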
1
vote
2 answers

RandomizedSearchCV(n_iter=10) doesn't stop after training 10 models

I am using RandomizedSearchCV for hyperparameter optimization. When I run it, it shows the scores for each model training. The problem is that it trains far more than 10 models, when I expect it to train just 10 by specifying…
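The likely explanation: n_iter counts sampled parameter settings, not model fits. Each setting is fitted once per CV fold, plus a final refit, so n_iter=10 with cv=5 performs 51 fits. A sketch:

    from scipy.stats import randint
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV

    X, y = make_classification(random_state=0)

    # 10 settings x 5 folds + 1 refit = 51 fits, but only 10 distinct settings.
    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=0),
        {"max_depth": randint(2, 10)},
        n_iter=10, cv=5, random_state=0,
    ).fit(X, y)
    print(len(search.cv_results_["params"]))  # 10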
1
vote
0 answers

Optuna pruning during cross-validation, does it make sense?

I'm currently trying to build a model using CatBoost. For parameter tuning, I'm using Optuna with cross-validation, pruning trials based on the intermediate cross-validation scores. Here is a minimal example: def objective(trial): …
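A hedged sketch of the pattern in question: report the running mean CV score after each fold and let a pruner stop unpromising trials early (synthetic data, arbitrary search space):

    import numpy as np
    import optuna
    from catboost import CatBoostClassifier
    from sklearn.datasets import make_classification
    from sklearn.model_selection import StratifiedKFold

    X, y = make_classification(n_samples=500, random_state=0)

    def objective(trial):
        depth = trial.suggest_int("depth", 4, 10)
        scores = []
        for step, (tr, va) in enumerate(StratifiedKFold(5).split(X, y)):
            model = CatBoostClassifier(depth=depth, iterations=100, verbose=0)
            model.fit(X[tr], y[tr])
            scores.append(model.score(X[va], y[va]))
            trial.report(float(np.mean(scores)), step)  # intermediate value
            if trial.should_prune():                    # pruner checks per fold
                raise optuna.TrialPruned()
        return float(np.mean(scores))

    study = optuna.create_study(direction="maximize",
                                pruner=optuna.pruners.MedianPruner())
    study.optimize(objective, n_trials=20)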
1
vote
1 answer

If I use Weight of Evidence to transform categorical variables, do I still need to pass their indexes to CatBoost?

I'm using Weight of Evidence (WOE) to encode my categorical features. Do I still need to tell CatBoost that they are categorical by using the cat_features parameter?
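Presumably not: after WOE encoding, the columns are ordinary numeric scores, so they are not listed in cat_features. A toy sketch of unsmoothed WOE for a binary target:

    import numpy as np
    import pandas as pd
    from catboost import CatBoostClassifier

    rng = np.random.default_rng(0)
    df = pd.DataFrame({"city": rng.choice(list("abc"), size=200)})
    y = pd.Series(rng.integers(0, 2, size=200))

    # WOE per category: log( P(category | y=1) / P(category | y=0) )
    good = df["city"][y == 1].value_counts(normalize=True)
    bad = df["city"][y == 0].value_counts(normalize=True)
    df["city_woe"] = df["city"].map(np.log(good / bad))

    # city_woe is numeric now; no cat_features argument is needed.
    CatBoostClassifier(iterations=50, verbose=0).fit(df[["city_woe"]], y)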
1
vote
0 answers

Intuition behind CatBoost encoding techniques

Can anyone help me understand the effect of the various bucketing techniques used in the CatBoost algorithm for categorical features? There are Borders, Buckets, BinarizedTargetMeanValue, and Counter encoding techniques, and I am not able to get a proper…