Questions tagged [binary-classification]
126 questions
5
votes
1 answer
How to choose the right threshold for binary classification?
I am currently working on the Titanic dataset from Kaggle. The dataset is imbalanced, with almost 61.5% negative and 38.5% positive examples.
I divided my training dataset into 85% train and 15% validation set. I chose a support vector classifier as…
Joe
- 75
- 1
- 5
4
votes
3 answers
Timing of applying random oversampling on the dataset
I am trying to learn classification using machine learning algorithms. I went through the notebook Breast Cancer - EDA, Balancing and ML. In this notebook, random oversampling was implemented. However, when the author did the oversampling he did it…
Encipher
- 359
- 1
- 9
4
votes
2 answers
Meaningfully compare target vs observed TPR & FPR
Suppose I have a binary classifier $f$ which acts on an input $x$. Given a threshold $t$, the predicted binary output is defined as:
$$
\widehat{y} = \begin{cases}
1, & f(x) \geq t \\
0, & f(x) < t
\end{cases}
$$
I then compute the $TPR$…
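The thresholding rule above, together with the standard definitions TPR = TP / (TP + FN) and FPR = FP / (FP + TN), can be sketched in plain Python. This is only an illustration: the function names, scores and labels below are made up and do not come from the question.

```python
def predict_at_threshold(scores, t):
    """Apply the rule y_hat = 1 if f(x) >= t else 0 to a list of scores."""
    return [1 if s >= t else 0 for s in scores]


def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels encoded as 0/1."""
    tp = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 1 and yp == 1)
    fn = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 1 and yp == 0)
    fp = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 0 and yp == 1)
    tn = sum(1 for yt, yp in zip(y_true, y_pred) if yt == 0 and yp == 0)
    tpr = tp / (tp + fn)  # fraction of actual positives that were caught
    fpr = fp / (fp + tn)  # fraction of actual negatives that were flagged
    return tpr, fpr


# Hypothetical classifier scores f(x) and ground-truth labels.
scores = [0.1, 0.4, 0.35, 0.8]
y_true = [0, 0, 1, 1]

y_pred = predict_at_threshold(scores, t=0.5)
tpr, fpr = tpr_fpr(y_true, y_pred)
print(y_pred, tpr, fpr)
```

Lowering `t` can only turn predictions from 0 to 1, so both TPR and FPR are non-decreasing as the threshold drops.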
Alexandru Dinu
- 183
- 5
3
votes
1 answer
What do the precision-recall curve and ROC curve tell us about threshold invariance?
Consider a binary classification problem.
Intuitively, an area under the curve (for both curves) very close to 1 shows that the curve is almost L-shaped.
Thus, this means that the value on the y-axis stays rather consistent despite changes…
liakoyras
- 626
- 4
- 15
3
votes
1 answer
How to combine binary classification with patient stratification?
I am working on a binary classification model (healthy/diseased) based on gene expression data of different patients. As a second task, I would like to stratify these patients and find subgroups.
I expect that the summary pattern of different genes…
vhio
- 31
- 2
3
votes
3 answers
How are scores calculated for each class of binary classification
The formula for precision is TP / (TP + FP), but how do I apply it individually to each class of a binary classification problem?
For example, here the precision, recall and F1 scores are calculated for class 0 and class 1 individually. I am not able…
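One common way to read per-class precision is to treat each class in turn as the "positive" class and apply TP / (TP + FP) to it. A minimal sketch of that idea, with a hypothetical helper name and made-up labels:

```python
def precision_for_class(y_true, y_pred, cls):
    """Precision when `cls` is treated as the positive class."""
    # TP: predicted cls and actually cls.
    tp = sum(1 for yt, yp in zip(y_true, y_pred) if yp == cls and yt == cls)
    # FP: predicted cls but actually the other class.
    fp = sum(1 for yt, yp in zip(y_true, y_pred) if yp == cls and yt != cls)
    return tp / (tp + fp)


# Made-up labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

precision_1 = precision_for_class(y_true, y_pred, cls=1)
precision_0 = precision_for_class(y_true, y_pred, cls=0)
print(precision_0, precision_1)
```

Here class 1 is predicted twice and is correct once, while class 0 is predicted four times and is correct three times, so the two classes get different precision values from the same predictions.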
Jainam Shroff
- 45
- 4
3
votes
1 answer
How do you add negative class sample for binary classification?
How do you prepare the negative dataset for binary classification? Let us say that I am building a classifier that has to classify whether the input image is of a car or not. I already have a dataset that consists of thousands of cars. But what…
imtiaz ul Hassan
- 231
- 1
- 8
2
votes
1 answer
Changing model architecture doesn't impact results
I am currently learning binary classification.
The problem is classifying positive and negative movie reviews.
The dataset contains 25,000 reviews, with each review represented using 10,000 of the most used words. Each review is transformed into multi-hot…
Omer Mualem
- 23
- 2
2
votes
1 answer
Finding research papers for a dataset
I found a breast cancer dataset on Kaggle. Here is the link: https://www.kaggle.com/datasets/reihanenamdari/breast-cancer
I would like to know how I can find out which research papers use this dataset for binary classification.
So far I got only one…
Encipher
- 359
- 1
- 9
2
votes
2 answers
Binary Classification with Very Small Dataset (<40 samples)
I'm trying to perform binary classification on a very small dataset, consisting of 3 negative samples and 36 positive samples. I've been testing different models from scikit-learn (logistic regression, random forest, svc, mlp). Depending on…
apcuevw
- 23
- 3
2
votes
0 answers
Obtaining threshold based rules for classification problem
Suppose there are numerical variables X1...Xn predicting a target variable Y (0 or 1).
Objective: to obtain the best possible thresholds and combinations of X1...Xn that can predict Y
Example: (X1>60 and X3<20) predicts Y=1 with 90%…
Sunit Gautam
- 121
- 3
2
votes
2 answers
Is it vital to do label encoding on the target variable?
Should I always use label encoding while doing binary classification?
Rus Pylypyuk
- 21
- 1
2
votes
2 answers
Which machine learning algorithms are more suitable for binary classification?
We know that there are many different types of classification algorithms. But among the different categories of classification algorithms, which algorithms are suitable for binary classification and which are suitable for more classes, and why?
AMZ
- 143
- 3
2
votes
3 answers
What could go wrong if I sample before classification?
I have a million entries in a table that I can use to train a binary classifier. Only 30 thousand of them are positive. Is there anything fundamentally wrong with selecting around 30 thousand negative cases uniformly and then training a binary…
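The sampling scheme described here (keep every positive, draw an equal number of negatives uniformly at random) can be sketched as follows. This is a scaled-down illustration under assumed names: `undersample_negatives`, the `"y"` label key and the toy rows are hypothetical, not from the question.

```python
import random


def undersample_negatives(rows, label_key="y", seed=0):
    """Keep all positives and an equal-sized uniform sample of negatives."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    # Sample without replacement; never ask for more negatives than exist.
    k = min(len(positives), len(negatives))
    sampled = rng.sample(negatives, k)
    balanced = positives + sampled
    rng.shuffle(balanced)
    return balanced


# Toy stand-in for the table: 1,000 rows, 30 of them positive (3%).
rows = [{"id": i, "y": 1 if i < 30 else 0} for i in range(1000)]

balanced = undersample_negatives(rows)
n_pos = sum(r["y"] for r in balanced)
n_neg = len(balanced) - n_pos
print(n_pos, n_neg)
```

Note that the balanced sample has a 50% positive rate while the full table has 3%, so scores from a model trained on it should not be read as probabilities for the original population without some recalibration.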
Bruce
- 186
- 1
- 8
2
votes
1 answer
Top 2% of scores of a binary classifier are 100% class 1
I have a binary classification model (Xgboost) that is supposed to be predicting whether a customer will be purchasing a service.
Overall the metrics are satisfactory: ~0.67 AUC, ~30% precision and ~40% recall @ max F1. Performance holds well out of…
Mouad_S
- 121
- 4