Questions tagged [classification]

An instance of supervised learning that identifies the category or categories to which a new instance of a dataset belongs.

In machine learning and statistics, classification refers to the problem of predicting category memberships based on a set of pre-labeled examples. It is thus a type of supervised learning.

Some of the most important classification algorithms are support vector machines (svm), logistic regression, naive Bayes, random forests (random-forest), and artificial neural networks (neural-network).

When we wish to associate inputs with continuous values in a supervised framework, the problem is instead known as regression. The unsupervised counterpart to classification is known as clustering (or cluster analysis), and involves grouping data into categories based on some measure of inherent similarity.
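As a rough illustration of "predicting category membership from pre-labeled examples", here is a minimal sketch of a 1-nearest-neighbour classifier in plain Python. The data points, the labels, and the function name `predict_1nn` are invented for the example; it is not one of the algorithms listed above, just about the smallest classifier one can write down.

```python
import math

def predict_1nn(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    distances = [math.dist(point, query) for point in train_points]
    nearest = distances.index(min(distances))
    return train_labels[nearest]

# Hypothetical pre-labeled examples (the supervision)
train_points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.5, 4.5)]
train_labels = ["small", "small", "large", "large"]

# A new, unlabeled instance gets the category of its nearest neighbour
print(predict_1nn(train_points, train_labels, (4.8, 5.1)))
```

If the labels were continuous numbers instead of categories, the same setup would be a regression problem; with no labels at all, grouping the points by similarity would be clustering.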

3226 questions

256

votes

10 answers

How to set class weights for imbalanced classes in Keras?

I know that this is possible in Keras with the class_weight dictionary parameter at fitting time, but I couldn't find any example. Would somebody be so kind as to provide one? By the way, in this case the appropriate practice is simply to weight up the…

deep-learning classification keras weighted-data

asked Aug 17 '16 at 09:35

Hendrik
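For readers landing on this question, a minimal sketch of one common way to build such a dictionary: weight each class by the inverse of its frequency (the same "balanced" heuristic that scikit-learn uses). The helper name `balanced_class_weights` and the label data are assumptions made for the example; the Keras call is shown only as a comment because it needs a compiled model.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Map each class to n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced labels: 90 negatives, 10 positives
y_train = [0] * 90 + [1] * 10
class_weight = balanced_class_weights(y_train)
print(class_weight)

# With a compiled Keras model, the dictionary would be passed at fit time:
# model.fit(X_train, y_train, class_weight=class_weight)
```

The rare class ends up with the larger weight, so each of its samples contributes more to the loss. Whether inverse frequency is the right weighting depends on the problem; treat it as a starting point.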


votes

6 answers

Cosine similarity versus dot product as distance metrics

It looks like the cosine similarity of two features is just their dot product scaled by the product of their magnitudes. When does cosine similarity make a better distance metric than the dot product? I.e. do the dot product and cosine similarity…

classification

asked Jul 15 '14 at 21:30

ahoffer
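A small sketch of the difference the question is getting at: the dot product grows with vector length, while cosine similarity only looks at direction. The vectors below are invented for illustration.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product scaled by the product of the two magnitudes."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

a = [1.0, 2.0]
b = [10.0, 20.0]  # same direction as a, ten times longer
c = [2.0, 1.0]    # same length as a, different direction

print("a vs b:", dot(a, b), cosine_similarity(a, b))
print("a vs c:", dot(a, c), cosine_similarity(a, c))
```

Here `b` scores far higher than `c` on the dot product purely because it is longer, whereas cosine similarity treats `a` and `b` as pointing the same way. If all vectors are normalized to unit length first, the two measures coincide.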

votes

5 answers

How to get accuracy, F1, precision and recall, for a keras model?

I want to compute the precision, recall and F1-score for my binary KerasClassifier model, but I can't find a solution. Here's my actual code: # Split dataset in train and test data X_train, X_test, Y_train, Y_test = train_test_split(normalized_X,…

machine-learning neural-network deep-learning classification keras

asked Feb 06 '19 at 13:29

ZelelB
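One framework-independent way to get these numbers is to threshold the model's predicted probabilities and count the four confusion-matrix cells by hand. The sketch below assumes a binary problem with labels 0 and 1; the probabilities, labels, and the function name `binary_metrics` are made up for illustration and are not taken from the question's code.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical predicted probabilities, e.g. the output of model.predict(X_test)
probabilities = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]
y_test = [1, 0, 0, 1, 0, 1]
y_pred = [1 if p >= 0.5 else 0 for p in probabilities]

metrics = binary_metrics(y_test, y_pred)
print(metrics)
```

Computing the metrics once on the full test set, rather than averaging per-batch values during training, avoids the distortion that batch-wise averaging can introduce for precision and recall.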


votes

7 answers

Deep Learning vs gradient boosting: When to use what?

I have a big data problem with a large dataset (take for example 50 million rows and 200 columns). The dataset consists of about 100 numerical columns and 100 categorical columns and a response column that represents a binary class problem. The…

machine-learning classification deep-learning

asked Nov 20 '14 at 06:49

Nitesh


votes

4 answers

Early stopping on validation loss or on accuracy?

I am currently training a neural network and I cannot decide which to use for my early stopping criterion: validation loss or a metric such as accuracy/f1score/auc/whatever calculated on the validation set. In my research, I came upon articles…

machine-learning neural-network deep-learning classification

asked Aug 20 '18 at 12:22

qmeeus
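Whichever quantity is monitored, the stopping rule itself usually looks the same. Below is a minimal sketch of patience-based early stopping on a "lower is better" series such as validation loss; the function name, the `patience` and `min_delta` parameters, and the loss values are assumptions for illustration, not any particular library's API.

```python
def early_stopping_epoch(values, patience=2, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` epochs have passed without the monitored
    value improving on the best one by more than `min_delta`.
    """
    best_value = float("inf")
    best_epoch = -1
    for epoch, value in enumerate(values):
        if value < best_value - min_delta:
            best_value = value
            best_epoch = epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(values) - 1, best_epoch

# Hypothetical validation losses per epoch
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.61, 0.59]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch, best_epoch)
```

To monitor a "higher is better" metric such as accuracy or AUC with the same loop, pass the negated values. Note that with a short patience the run above stops before the later, lower loss is ever reached, which is one reason the choice of patience matters as much as the choice of monitored quantity.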


votes

6 answers

When would one use Manhattan distance as opposed to Euclidean distance?

I am trying to look for a good argument on why one would use the Manhattan distance over the Euclidean distance in machine learning. The closest thing I found to a good argument so far is on this MIT lecture. At 36:15 you can see on the slides the…

machine-learning classification distance

asked Jun 30 '17 at 06:28

Bitcoin Cash - ADA enthusiast
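A tiny numeric sketch may help frame the comparison: Manhattan distance sums absolute coordinate differences, while Euclidean distance squares them first, so a single large difference weighs more under Euclidean. The points below are invented for illustration.

```python
def manhattan(u, v):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))

def euclidean(u, v):
    """L2 distance: square root of the sum of squared differences."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

origin = [0.0, 0.0, 0.0, 0.0]
spread = [1.0, 1.0, 1.0, 1.0]  # a small difference in every coordinate
spike = [4.0, 0.0, 0.0, 0.0]   # one large difference in a single coordinate

print("spread:", manhattan(origin, spread), euclidean(origin, spread))
print("spike: ", manhattan(origin, spike), euclidean(origin, spike))
```

Both points are equally far from the origin under Manhattan distance, but the "spike" point is twice as far under Euclidean distance. That sensitivity to one outlying coordinate is one of the practical differences between the two.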

votes

6 answers

Unbalanced multiclass data with XGBoost

I have 3 classes with this distribution: Class 0: 0.1169, Class 1: 0.7668, Class 2: 0.1163. And I am using xgboost for classification. I know that there is a parameter called scale_pos_weight. But how is it handled for 'multiclass' case, and how can…

classification xgboost multiclass-classification class-imbalance

asked Jan 16 '17 at 12:53

shda
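Since scale_pos_weight is described in terms of a positive and a negative class, one commonly used workaround for more than two classes is to give every training row a weight based on its class. The sketch below uses the class shares from the question and an inverse-frequency weighting, which is an assumption rather than the only reasonable choice; the label list and helper name are hypothetical, and the XGBoost call appears only as a comment.

```python
# Class shares as stated in the question
class_share = {0: 0.1169, 1: 0.7668, 2: 0.1163}

# Inverse-frequency weight per class (an assumed heuristic)
n_classes = len(class_share)
class_weight = {cls: 1.0 / (n_classes * share) for cls, share in class_share.items()}

def sample_weights(labels, weights_by_class):
    """Expand per-class weights into one weight per training row."""
    return [weights_by_class[y] for y in labels]

# Hypothetical training labels
y_train = [1, 1, 0, 2, 1]
weights = sample_weights(y_train, class_weight)
print(class_weight)
print(weights)

# The per-row weights could then be passed when fitting, for example:
# xgboost.XGBClassifier().fit(X_train, y_train, sample_weight=weights)
```

Rows from the two minority classes receive weights several times larger than rows from the majority class, which pushes the booster to pay more attention to them.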

votes

5 answers

When to use Random Forest over SVM and vice versa?

When would one use Random Forest over SVM and vice versa? I understand that cross-validation and model comparison are important aspects of choosing a model, but here I would like to learn more about rules of thumb and heuristics of the two…

machine-learning classification random-forest svm

asked Aug 20 '15 at 04:16

Rohit

votes

5 answers

Are decision tree algorithms linear or nonlinear?

Recently a friend of mine was asked in an interview whether decision tree algorithms are linear or nonlinear. I tried to look for answers to this question but couldn't find any satisfactory explanation. Can anyone answer and explain the…

machine-learning classification decision-trees algorithms pac-learning

asked Aug 13 '15 at 13:59

user2966197
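One way to see the "nonlinear" side of this question is the XOR pattern: a two-level decision tree with axis-aligned splits classifies it perfectly, while no single linear threshold rule can. The hand-written tree and the coarse grid search below are illustrative assumptions; the search demonstrates the point on a finite grid and is not a proof.

```python
def tree_predict(x1, x2):
    """Hand-written depth-2 decision tree using axis-aligned thresholds."""
    if x1 < 0.5:
        return 1 if x2 >= 0.5 else 0
    return 0 if x2 >= 0.5 else 1

# The four XOR points with their labels
xor_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
tree_correct = all(tree_predict(*x) == y for x, y in xor_data)

def linear_separates(w1, w2, b):
    """True if the rule w1*x1 + w2*x2 + b > 0 labels every XOR point correctly."""
    return all((w1 * x1 + w2 * x2 + b > 0) == (y == 1) for (x1, x2), y in xor_data)

# Try every linear rule on a coarse grid of weights and biases
grid = [i / 4 for i in range(-8, 9)]
any_linear = any(
    linear_separates(w1, w2, b) for w1 in grid for w2 in grid for b in grid
)

print("tree fits XOR:", tree_correct)
print("some linear rule on the grid fits XOR:", any_linear)
```

Each individual split in a tree is a linear (threshold) test on one feature, but the combination of splits yields a piecewise-constant decision function, which is why trees are usually described as nonlinear models.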

votes

4 answers

Quick guide into training highly imbalanced data sets

I have a classification problem with approximately 1000 positive and 10000 negative samples in the training set. So this data set is quite unbalanced. Plain random forest is just trying to mark all test samples as a majority class. Some good answers…

machine-learning classification dataset class-imbalance

asked Sep 12 '14 at 15:20

IgorS
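Among the simplest remedies for a 1:10 imbalance like this is random undersampling of the majority class before training. The sketch below uses integer placeholders in place of real feature rows; the function name and the data are assumptions for illustration, and undersampling is only one option alongside class weighting or oversampling.

```python
import random

def undersample_majority(samples, labels, seed=0):
    """Randomly keep as many rows of every class as the rarest class has."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = min(len(rows) for rows in by_class.values())
    balanced = []
    for y, rows in by_class.items():
        balanced.extend((x, y) for x in rng.sample(rows, target))
    rng.shuffle(balanced)
    return balanced

# Placeholder data mirroring the question: 1000 positives, 10000 negatives
samples = list(range(11000))
labels = [1] * 1000 + [0] * 10000

balanced = undersample_majority(samples, labels)
n_pos = sum(1 for _, y in balanced if y == 1)
n_neg = sum(1 for _, y in balanced if y == 0)
print(len(balanced), n_pos, n_neg)
```

The obvious cost is that most of the negative samples are thrown away, so this is usually combined with evaluation on the original, untouched class distribution.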


votes

1 answer

What is the best Keras model for multi-class classification?

I am working on research where I need to classify one of three events, WINNER=(win, draw, lose). WINNER LEAGUE HOME AWAY MATCH_HOME MATCH_DRAW MATCH_AWAY MATCH_U2_50 MATCH_O2_50 3 13 550 571 1.86 3.34 …

python neural-network classification clustering keras

asked Feb 01 '16 at 15:18

SpanishBoy

votes

4 answers

What algorithms should I use to perform job classification based on resume data?

Note that I am doing everything in R. The problem goes as follows: Basically, I have a list of resumes (CVs). Some candidates will have prior work experience and some won't. The goal here is: based on the text on their CVs, I want to classify…

machine-learning classification nlp text-mining

asked Jul 03 '14 at 16:11

user1769197

votes

3 answers

What is the difference between text classification and topic models?

I know the difference between clustering and classification in machine learning, but I don't understand the difference between text classification and topic modeling for documents. Can I use topic modeling over documents to identify a topic? Can I…

classification text-mining topic-model

asked Aug 12 '14 at 03:50

Ali

votes

2 answers

How to interpret classification report of scikit-learn?

As you can see, it is about a binary classification with LinearSVC. Class 1 has a higher precision than class 0 (+7%), but class 0 has a higher recall than class 1 (+11%). How would you interpret this? And two other questions: what does…

classification metric binary

asked Dec 08 '19 at 23:17

user77241
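The pattern described in the question (one class with higher precision, the other with higher recall) can be reproduced from a confusion matrix. The sketch below uses an invented 2x2 matrix, with rows as true classes and columns as predicted classes; the numbers are not taken from the question's report.

```python
def per_class_report(confusion):
    """Precision, recall and support per class.

    `confusion[i][j]` counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    report = {}
    for k in range(n):
        true_k = sum(confusion[k])
        predicted_k = sum(confusion[i][k] for i in range(n))
        correct = confusion[k][k]
        report[k] = {
            "precision": correct / predicted_k if predicted_k else 0.0,
            "recall": correct / true_k if true_k else 0.0,
            "support": true_k,
        }
    return report

# Hypothetical confusion matrix: the model leans towards predicting class 0
confusion = [
    [90, 10],  # true class 0: 90 predicted as 0, 10 predicted as 1
    [25, 75],  # true class 1: 25 predicted as 0, 75 predicted as 1
]

report = per_class_report(confusion)
for cls, row in report.items():
    print(cls, row)
```

In this made-up case the classifier predicts class 0 more readily, so it catches most true class 0 samples (high recall for class 0) while its rarer class 1 predictions are more often right (high precision for class 1).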

votes

6 answers

What is the reason behind taking the log transformation of a few continuous variables?

I have been doing a classification problem and I have read many people's code and tutorials. One thing I've noticed is that many people take np.log or log of continuous variables like loan_amount or applicant_income etc. I just want to understand…

machine-learning python classification scikit-learn

asked Oct 23 '18 at 13:08

Sai Kumar
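A frequently cited motivation is that variables like income are strongly right-skewed, and a log transform compresses the long tail. The sketch below measures skewness before and after the transform on invented income values; the numbers and the `skewness` helper are assumptions for illustration, and `math.log1p` is used as a stand-in for `np.log1p`.

```python
import math
import statistics

def skewness(values):
    """Population skewness: mean cubed z-score."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical applicant incomes with a long right tail
applicant_income = [2_000, 2_500, 3_000, 3_200, 4_000,
                    4_500, 6_000, 9_000, 20_000, 80_000]

# log1p(x) = log(1 + x), which also stays finite for zero values
log_income = [math.log1p(v) for v in applicant_income]

print("skewness before:", skewness(applicant_income))
print("skewness after: ", skewness(log_income))
```

The transformed values are less dominated by the few very large incomes, which tends to help models that are sensitive to feature scale or outliers, such as linear models; tree-based models are generally less affected by monotonic transforms of a feature.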
