Highest Voted 'target-encoding' Questions - Data Science Stack Exchange

5

votes

1 answer

Target encoding with KFold cross-validation - how to transform test set?

Let's say I have a categorical feature (cat): import random import pandas as pd from sklearn.model_selection import train_test_split, StratifiedKFold random.seed(1234) y = random.choices([1, 0], weights=[0.2, 0.8], k=100) cat = random.choices(["A",…

asked Sep 05 '20 at 07:16

Xaume

182
2
11

2

votes

1 answer

Categories with the same mean in target encoding

While doing target encoding it can happen that two categories have the same target mean. This is bad because there will be no difference in the new feature in it and we will lose some information. Also, this is potentially harmful to the model,…

machine-learning categorical-data categorical-encoding target-encoding

asked Apr 05 '20 at 17:57

Carlos Mougan

6,011
2
15
45

2

votes

1 answer

Does it make sense to impute the target variable when there are a few missing target values in my dataset?

I'm relatively new to machine learning and just trying to have a solid understanding of the basics. I scraped a real estate/house prices dataset off a website, which I am in the process of cleaning before modelling. I noticed that out of 1,998…

data-imputation target-encoding

asked Mar 22 '23 at 10:23

Baker Hans

21
2

2

votes

1 answer

How do we target-encode categorical features in multi class classification problems?

Say I have a multiclass problem with a dataset as this: user_id price target -------+--------+----- 1 30 apple 1 20 samsung 2 32 samsung 2 40 huawei . . where I have a lot of users i.e One Hot…

catboost target-encoding

asked Jul 25 '22 at 12:16

CutePoison

450
2
8

2

votes

1 answer

Predict apartment prices with two sources of prices

I am asking for help with the following problem. There are two subsamples in the dataset - one where the target is real(valid), and the other where it is approximate (I do not know how it differs yet, on one sample the real price of an apartment,…

regression target-encoding

asked Sep 23 '21 at 17:16

glycine-addict

21
2

1

vote

1 answer

Interpreting decision tree results after target encoding

I am not sure how to interpret the results of my decision tree after I had used target encoding, could someone clarify? The example below doesn't need target encoding just for explanation of my confusion here. For instance I am trying to classify if…

decision-trees target-encoding

asked Sep 12 '20 at 02:41

bob2

13
2

1

vote

0 answers

Does it make sense to use target encoding together with tree-based models?

I'm working on a regression problem with a few high-cardinality categorical features (Forecasting different items with a single model). Someone suggested to use target-encoding (mean/median of the target of each item) together with xgboost. While I…

random-forest xgboost categorical-encoding target-encoding

asked May 17 '22 at 13:28

KJA

11
1

1

vote

1 answer

One-Hot-Encoding Target variable

I have a dataset that consists of 4 values in a target variable. I have performed Ordinal Encoding over that which worked for me but my question here's that if I apply one-hot encoding can I solve this problem?. As it would be 4 new columns that are…

machine-learning one-hot-encoding target-encoding

asked Nov 15 '21 at 10:40

Adnan Khan

11
2

0

votes

1 answer

target encoding with multiple columns

I'm attempting to do target encoding with multiple columns on a dataframe and I'm getting an error message I don't understand. Here is a fragment of the code. X['District Code Encoded'] = encoder.fit_transform(X['District Code'], y) X['Property id…

target-encoding

asked Aug 12 '21 at 09:04

Rupert

101
2

0

votes

1 answer

why should i do target encoding within cv loop?

i wish to use target encoding, using the category encoders sklearn library. I don't really understand why it is necessary to include this as a step in a sklearn pipeline WITHIN the cross validation loop? e.g. this example here does so Target…

cross-validation overfitting categorical-encoding target-encoding

asked Dec 07 '20 at 18:01

Maths12

496
5
14

0

votes

1 answer

Logistic Regression Multi-level Independent variables

im trying to study logistic regression, when i did the target variable with all features, i had the summary showing the p-values as usual, but one for the features has 60 level, another feature has 13 level, so how can i proceed with this kind of…

predictive-modeling data-cleaning logistic-regression feature-engineering target-encoding

asked Jun 09 '20 at 21:22

PureAnalytics

1
1

0

votes

0 answers

Using transformer to predict permutation of tokens based on encoded input

i am trying to implement a transformer based agent for a tile placing board game, and i was thinking about this kind of architecture: Src input is embedded board, with an empty block (for example 10x10 with a 3x3 block missing, so 3*3=9 tiles to be…

transformer target-encoding spatial-transformer

asked Apr 13 '23 at 14:47

mt-clemente

1

0

votes

0 answers

Do you need to normalize labels for models other than neural nets?

As mentioned here, normalizing the target variable often helps a neural network converge faster. Does it help in convergence, or is there otherwise a reason to use it, for any type of model other than neural nets?

machine-learning neural-network normalization target-encoding

asked Jan 05 '23 at 21:38

bdavidson

1

0

votes

0 answers

Multi-class classification model unable to return desired outcome

I have a scenario of multi-class dataset with around 10 distinct classes of target. There are 3 categorical features each with multiple labels. If we check the data, each unique combination of feature values will always return single class of…

random-forest xgboost multiclass-classification categorical-encoding target-encoding

asked Jan 04 '23 at 15:29

soumalya saha

1
2

0

votes

1 answer

Dealing with observation with arbitrary number of categories with arbitary number of values

Suppose to have a set of elements $X = \{x_1, x_2, ..., x_n\}$. Each element is characterised by a set of features. The features characterising a particular element $x_i$ can belong to one of $q$ different categories. Each different category $f_q$…

machine-learning categorical-encoding target-encoding

asked May 26 '22 at 16:28

King Powa

111
2

Questions tagged [target-encoding]