Questions tagged [multilabel-classification]

Multilabel classification assigns each sample a set of target labels. This can be thought of as predicting properties of a data point that are not mutually exclusive, such as the topics relevant to a document. A text might be about any of religion, politics, finance, or education at the same time, or none of these.

314 questions
50
votes
3 answers

Understanding predict_proba from MultiOutputClassifier

I'm following this example on the scikit-learn website to perform a multioutput classification with a Random Forest model. from sklearn.datasets import make_classification from sklearn.multioutput import MultiOutputClassifier from sklearn.ensemble…
Harpal
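A minimal sketch of what `predict_proba` returns for a `MultiOutputClassifier`, assuming a random forest base estimator as in the question. The synthetic data, the three threshold-based labels, and the names `p_positive` and `proba` are illustrative assumptions and are not taken from the original post.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Synthetic data (assumption): 100 samples, 5 features, 3 binary labels
# derived from the sign of the first three features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = (X[:, :3] > 0).astype(int)

clf = MultiOutputClassifier(
    RandomForestClassifier(n_estimators=20, random_state=0)
).fit(X, Y)

# predict_proba returns a list with one array per label;
# each array has shape (n_samples, n_classes_for_that_label).
proba = clf.predict_proba(X[:5])

# For binary labels, column 1 holds P(label = 1).
p_positive = np.column_stack([p[:, 1] for p in proba])
print(p_positive.shape)
```

Stacking column 1 of each array gives one row per sample and one column per label, which is usually the shape people expect when they first call the method.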
21
votes
1 answer

What does it mean to "share parameters between features and classes"

When reading this paper there is a line which says "linear classifiers do not share parameters among features and classes." What is the meaning of this statement? Does it mean that linear classifiers such as logistic regression need features…
12
votes
2 answers

Deep Learning with Spectrograms for sound recognition

I was looking into the possibility of classifying sound (for example, sounds of animals) using spectrograms. The idea is to use a deep convolutional neural network to recognize segments in the spectrogram and output one (or many) class labels. This is…
11
votes
6 answers

How to use sklearn train_test_split to stratify data for multi-label classification?

I am attempting to mirror a machine learning program by Ahmed Besbes, but scaled up for multi-label classification. It seems that any attempt to stratify the data returns the following error: The least populated class in y has only 1 member, which…
9
votes
7 answers

Python library that can compute the confusion matrix for multi-label classification

I'm looking for a Python library that can compute the confusion matrix for multi-label classification. (FYI: scikit-learn doesn't support multi-label for confusion matrix) What is the difference between Multiclass and Multilabel Problem
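For what it's worth, more recent scikit-learn releases (0.21 and later, as far as I know) include `sklearn.metrics.multilabel_confusion_matrix`, which returns one 2x2 matrix per label. The sketch below uses made-up `y_true` and `y_pred` arrays purely for illustration.

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

# Toy multilabel indicator matrices (assumption): 4 samples, 3 labels.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 0, 0],
                   [0, 0, 1]])

# Result has shape (n_labels, 2, 2); each 2x2 block is laid out as
# [[true negatives, false positives],
#  [false negatives, true positives]] for that label.
mcm = multilabel_confusion_matrix(y_true, y_pred)
for label_index, matrix in enumerate(mcm):
    print(label_index, matrix.tolist())
```

Each label is treated as its own one-vs-rest binary problem, so this does not show which labels get confused with each other.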
8
votes
2 answers

Which classification algorithms are negatively affected by class imbalances?

I've seen a few posts and papers floating around the web (mostly those related to over/undersampling, SMOTE, and cost-sensitive training) that, when discussing class imbalance, specify that certain algorithms are negatively impacted by class…
8
votes
1 answer

Dealing with extreme values in softmax cross entropy?

I am dealing with numerical overflows and underflows with softmax and cross entropy function for multi-class classification using neural networks. Given logits, we can subtract the maximum logit for dealing with overflow but if the values of the…
RE60K
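One common way to sidestep both overflow and underflow is to compute the loss in log space (the log-sum-exp trick) rather than forming the softmax probabilities first. The sketch below is an illustration under that assumption; the function name `softmax_cross_entropy` and the example logits are hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, target_index):
    """Cross-entropy for one sample, computed from log-softmax."""
    # Subtracting the max makes the largest entry 0, so exp() cannot overflow.
    shifted = logits - np.max(logits)
    # The sum includes exp(0) = 1, so it is at least 1 and its log is finite.
    log_sum_exp = np.log(np.sum(np.exp(shifted)))
    # log-softmax; no probability is ever formed, so log(0) never occurs.
    log_probs = shifted - log_sum_exp
    return -log_probs[target_index]

# Extreme logits: the naive softmax would give probability 0 for index 2
# and an infinite loss, while the log-space version stays finite.
loss = softmax_cross_entropy(np.array([1000.0, 0.0, -1000.0]), 2)
print(loss)
```

The loss can still be very large for a badly wrong prediction, but it stays a finite number that gradients can be computed from.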
8
votes
3 answers

Where can I find freely available multi-label datasets online?

I'm trying to find multi-label classification datasets, which are available for free online. By "multi-label" I mean that each instance can be labeled with anywhere from a single to $k$ labels, where $k$ is the total number of different labels in…
Bobson Dugnutt
8
votes
1 answer

Naive Bayes for Multi label text classification

How to use Naive Bayes for multi-label text classification in R. I tried using naiveBayes() from e1071 library but it seems that while training, it doesn't accept multi-label class variable. I created TermDocumentMatrix using the text document…
7
votes
2 answers

AUC-ROC for Multi-Label Classification

Hey guys I'm currently reading about AUC-ROC and I have understood the binary case and I think that I understand the multi-classification case. Now I'm a bit confused on how to generalize it to the multi-label case, and I can't find any intuitive…
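One common generalization, sketched here as an illustration, is to compute a binary ROC AUC per label and then average across labels. scikit-learn's `roc_auc_score` accepts multilabel indicator targets and exposes this through its `average` parameter. The toy `y_true` and `y_score` arrays below are made up.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy data (assumption): 4 samples, 2 labels; scores are per-label confidences.
y_true = np.array([[1, 0],
                   [0, 1],
                   [1, 1],
                   [0, 0]])
y_score = np.array([[0.9, 0.2],
                    [0.1, 0.8],
                    [0.8, 0.3],
                    [0.3, 0.4]])

# average=None: one binary AUC per label (each label is its own ROC problem).
per_label = roc_auc_score(y_true, y_score, average=None)

# average="macro": unweighted mean of the per-label AUCs.
macro = roc_auc_score(y_true, y_score, average="macro")

print(per_label, macro)
```

Every label needs at least one positive and one negative example in `y_true`, otherwise the per-label AUC is undefined.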
6
votes
3 answers

How mean and deviation come out with MNIST dataset?

I am a novice at data science, and I notice some repositories state that the mean and standard deviation of the MNIST dataset are 0.1307 and 0.3081. I cannot see where these two numbers come from. Based on my understanding, the MNIST dataset has 60,000 pics…
rj487
5
votes
2 answers

SMOTE for multilabel classification

I have a dataset with 77 different labels. Each sample has one or more of these labels. I did some data analysis and found out that the dataset is highly imbalanced - there are a large number of examples that have a particular label, whereas the…
5
votes
3 answers

Multi-label classification based on single-label dataset

I'm looking for a solution to detect different moods/styles expressed by an image. Unfortunately, there is no multi-labeled dataset for this task. The scenario of defining a multi-label classification model based on single labeled data doesn't seem…
5
votes
2 answers

How to visualize results/errors of multilabel classifiers?

For multiclass classification you would normally choose a confusion matrix to plot the error of predicted classes against the target classes. What is the best way to visualize errors of multilabel classifiers? As multiple classes are predicted at…
5
votes
1 answer

What is the best way to deal with imbalanced data for XGBoost?

There are a lot of ways to deal with class-imbalanced data, like undersampling, oversampling, changing the cost function, etc. https://machinelearningmastery.com/tactics-to-combat-imbalanced-classes-in-your-machine-learning-dataset/ Here is the post…