Questions tagged [binary]
100 questions
59
votes
9 answers
How to deal with version control of large amounts of (binary) data
I am a PhD student of Geophysics and work with large amounts of image data (hundreds of GB, tens of thousands of files). I know svn and git fairly well and have come to value a project history, combined with the ability to easily work together and have…
Johann
- 701
- 1
- 5
- 5
38
votes
5 answers
Best practices to store Python machine learning models
What are the best practices to save, store, and share machine learning models?
In Python, we generally store the binary representation of the model, using pickle or joblib. Models, in my case, can be ~100 MB in size. Also, joblib can save one model to…
Antoine Dusséaux
- 481
- 1
- 4
- 7
30
votes
2 answers
How to interpret classification report of scikit-learn?
As you can see, it is about a binary classification with LinearSVC. Class 1 has a higher precision than class 0 (+7%), but class 0 has a higher recall than class 1 (+11%). How would you interpret this?
And two other questions: what does…
user77241
20
votes
5 answers
Choose binary classification algorithm
I have a binary classification problem:
Approximately 1000 samples in training set
10 attributes, including binary, numeric and categorical
Which algorithm is the best choice for this type of problem?
By default I'm going to start with SVM…
IgorS
- 5,444
- 11
- 31
- 43
10
votes
4 answers
Why might several types of models give almost identical results?
I've been analyzing a data set of ~400k records and 9 variables. The dependent variable is binary. I've fitted a logistic regression, a regression tree, a random forest, and a gradient boosted tree. All of them give virtually identical goodness of fit…
JenSCDC
- 317
- 1
- 10
9
votes
3 answers
Binary (Unary) Recommendation System with Biased Views
I would like to create a content recommendation system based on binary click data that also takes views into account.
What content a user has been exposed to, and therefore has the chance to click on, is currently biased by a rule-based system that…
elz
- 43
- 8
9
votes
1 answer
Using SVM as a binary classifier, is the label for a data point chosen by consensus?
I'm learning Support Vector Machines, and I'm unable to understand how a class label is chosen for a data point in a binary classifier. Is it chosen by consensus with respect to the classification in each dimension of the separating hyperplane?
gc5
- 879
- 2
- 9
- 17
8
votes
1 answer
Micro-F1 and Macro-F1 are equal in binary classification and I don't know why
I have a binary classification problem in which the test set contains the same number of samples for each class (the counts for class 0 and class 1 are equal). Since we know that the number of samples from every class is equal, I use median on the…
user137927
- 379
- 1
- 3
- 10
8
votes
2 answers
Why are precision and recall used in the F1 score, rather than precision and NPV?
In binary classification problems, it seems the F1 score is often used as a performance measure. As far as I've understood, the idea is to find the best tradeoff between precision and recall. The formula for the F1 score is symmetric in precision and…
egdvnyjklu
- 181
- 2
6
votes
3 answers
What is the best metric to evaluate highly imbalanced binary classification (such as credit card fraud detection)?
What is the best metric to evaluate highly imbalanced binary classification (such as credit card fraud detection)?
I have examined several metrics: precision, recall, F1, classification report (macro avg, weighted avg), ROC, AUC, etc., but I do not know…
user10296606
- 1,784
- 5
- 17
- 31
6
votes
2 answers
Is a correlation matrix meaningful for a binary classification task?
When examining my dataset with a binary target (y) variable I wonder if a correlation matrix is useful to determine predictive power of each variable.
My predictors (X) contain some numeric and some factor variables.
Georg Heiler
- 327
- 2
- 3
- 13
5
votes
1 answer
Reduce multiclass classification targets to binary classification targets in scikit-learn
I would like to reduce multiclass classification targets to binary classification targets. Ideally, this mapping would happen within scikit-learn so the same transformation applies during both training and prediction.
I looked at transforming the…
Brian Spiering
- 20,142
- 2
- 25
- 102
5
votes
1 answer
How can I generate a Bernoulli block mixture model in MATLAB?
I am trying to write the code of a Bernoulli block mixture model in MATLAB, but am facing an error every time I run the function. In particular, I'm having a problem with how to relate the distribution parameter $\alpha$ to the latent variables $Z$…
Ahmad Tay
- 51
- 2
4
votes
2 answers
If both classes follow a similar curve in the t-SNE diagram of a binary classification, what does the t-SNE diagram show?
If both classes follow a similar curve in the t-SNE diagram of a binary classification, what does the t-SNE diagram show? For instance: Figure1 or Figure2
user10296606
- 1,784
- 5
- 17
- 31
4
votes
3 answers
How to create an ensemble that gives precedence to a specific classifier
Suppose that in a binary classification task, I have separate classifiers A, B, and C. If I use A alone, I will get a high precision, but low recall. In other words, the number of true positives is very high, but it also incorrectly tags the rest…
Clement Attlee
- 141
- 1