Questions tagged [confusion-matrix]


A confusion matrix is a special contingency table used to evaluate the predictive accuracy of a classifier. Predicted classes are listed in rows (or columns) and actual classes in columns (or rows), with the count of cases for each combination in the corresponding cell. Cases on the main diagonal are correctly classified, while the off-diagonal cells count misclassifications. Inspecting the confusion matrix shows which classes tend to be 'confused' for each other. The confusion matrix also allows the calculation of model performance metrics such as sensitivity and specificity, precision and recall, positive and negative predictive value, etc.

Here is an example confusion matrix from a model fitted to Fisher's iris data with good accuracy. The model occasionally confuses versicolor and virginica, but never misclassifies either as setosa.

                      Actual:
             setosa versicolor virginica 
Predicted:
  setosa         50          0         0  
  versicolor      0         47         3  
  virginica       0          4        46 
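
A matrix like the one above, together with the metrics mentioned in the tag description, can be produced in a few lines; here is a minimal sketch assuming scikit-learn (the exact counts depend on the model used):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report

iris = load_iris()
clf = LogisticRegression(max_iter=1000).fit(iris.data, iris.target)
y_pred = clf.predict(iris.data)

# Note: scikit-learn puts actual classes in rows and predicted classes in columns.
print(confusion_matrix(iris.target, y_pred))
# Per-class precision, recall (sensitivity) and F1, as mentioned in the tag description.
print(classification_report(iris.target, y_pred, target_names=iris.target_names))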
174 questions
21 votes · 3 answers

How to get predictions with predict_generator on streaming test data in Keras?

In the Keras blog on training convnets from scratch, the code shows only the network running on training and validation data. What about test data? Is the validation data the same as test data (I think not). If there was a separate test folder on…
pseudomonas
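
For reference, a rough sketch of one way this is often done with the Keras 2.x generator API; the paths, image size, and the trained model are placeholders, and in later Keras versions predict_generator was folded into model.predict:

from keras.preprocessing.image import ImageDataGenerator

test_datagen = ImageDataGenerator(rescale=1. / 255)
test_generator = test_datagen.flow_from_directory(
    'data/test',              # hypothetical path; flow_from_directory expects the images in a subfolder
    target_size=(150, 150),
    batch_size=32,
    class_mode=None,          # no labels: we only want predictions
    shuffle=False)            # keep order stable so predictions match test_generator.filenames

# 'model' is assumed to be the already-trained network from the blog post.
predictions = model.predict_generator(test_generator, steps=len(test_generator))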
16 votes · 1 answer

Train Accuracy vs Test Accuracy vs Confusion matrix

After I developed my predictive model using Random Forest I get the following metrics: Train Accuracy :: 0.9764634601043997 Test Accuracy :: 0.7933284397683713 Confusion matrix [[28292 1474] …
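
A gap this large between training and test accuracy usually points to overfitting; a quick way to see it is to score the fitted model on both splits. A hypothetical sketch with scikit-learn:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, random_state=0)   # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
# A large gap (e.g. 0.98 vs. 0.79) suggests the forest memorises the training set.
print("Train accuracy:", accuracy_score(y_train, clf.predict(X_train)))
print("Test accuracy: ", accuracy_score(y_test, clf.predict(X_test)))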
14 votes · 3 answers

How can I make big confusion matrices easier to read?

I have recently published a dataset (link) with 369 classes. I ran a couple of experiments on them to get a feeling for how difficult the classification task is. Usually, I like it if there are confusion matrices to see the type of error being made.…
Martin Thoma
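
One common remedy, sketched below with matplotlib (the labels are random stand-ins for the real ones), is to row-normalise the matrix and render it as a heatmap so that bright off-diagonal cells point to the most-confused class pairs:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

# Illustrative stand-ins for the real labels; 369 classes would work the same way.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 50, size=5000)
y_pred = np.where(rng.random(5000) < 0.7, y_true, rng.integers(0, 50, size=5000))

cm = confusion_matrix(y_true, y_pred).astype(float)
cm /= cm.sum(axis=1, keepdims=True).clip(min=1)   # normalise each actual-class row

plt.figure(figsize=(10, 10))
plt.imshow(cm, cmap='viridis')                    # bright off-diagonal cells = frequent confusions
plt.xlabel('Predicted class')
plt.ylabel('Actual class')
plt.colorbar()
plt.savefig('confusion_matrix.png', dpi=300)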
12 votes · 4 answers

Can the F1 score be equal to zero?

As it is mentioned in the F1 score Wikipedia, 'F1 score reaches its best value at 1 (perfect precision and recall) and worst at 0'. What is the worst condition that was mentioned? Even if we consider the case of: either precision or recall is…
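
Since F1 is the harmonic mean of precision and recall, F1 = 2PR / (P + R), it drops to 0 as soon as either precision or recall is 0, i.e. when the classifier produces no true positives at all. A tiny sketch:

from sklearn.metrics import f1_score

y_true = [1, 1, 0, 0]
y_pred = [0, 0, 0, 0]             # no true positives at all
print(f1_score(y_true, y_pred))   # 0.0 (scikit-learn also emits an UndefinedMetricWarning)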
11 votes · 3 answers

Inverse Relationship Between Precision and Recall

I made some search to learn precision and recall and I saw some graphs represents inverse relationship between precision and recall and I started to think about it to clarify subject. I wonder the inverse relationship always hold? Suppose I have a…
tkarahan
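
The trade-off usually comes from moving the decision threshold: lowering it catches more positives (higher recall) but lets in more false positives (lower precision). A sketch of how this can be inspected with scikit-learn on synthetic data:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, scores)
# Print every 20th threshold: precision and recall generally move in opposite directions.
for p, r, t in list(zip(precision, recall, thresholds))[::20]:
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")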
10 votes · 2 answers

How to get an aggregate confusion matrix from n different classifications

I want to test the accuracy of a methodology. I ran it ~400 times, and I got a different classification for each run. I also have the ground truth, i.e., the real classification to test against. For each classification I computed a confusion matrix.…
gc5
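
Because each cell is a plain count, confusion matrices computed against the same ground truth and the same label ordering can be aggregated by element-wise summation, and overall rates read off the summed matrix. A sketch with numpy (the per-run matrices are illustrative):

import numpy as np

matrices = [np.array([[48, 2], [5, 45]]),   # illustrative matrices from individual runs
            np.array([[47, 3], [4, 46]])]
aggregate = np.sum(matrices, axis=0)        # element-wise sum across all runs
print(aggregate)
print("overall accuracy:", np.trace(aggregate) / aggregate.sum())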
8 votes · 9 answers

Confusion Matrix - Get Items FP/FN/TP/TN - Python

After run my python code: print(confusion_matrix(x_test, x_pred)) I get this: [100 32 211 21] My question is how can I get the following list: True positive = 100 False positive = 32 False negative = 211 True negative = 21 Is this possible?
John_Rodgers
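
For the binary case, scikit-learn orders the 2x2 matrix as [[TN, FP], [FN, TP]] (rows are actual classes), so the four counts can be unpacked in one line; the label vectors below are illustrative:

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1, 1, 0, 1, 0]   # illustrative label vectors
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP:", tp, "FP:", fp, "FN:", fn, "TN:", tn)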
6 votes · 3 answers

Accuracy is lower than f1-score for imbalanced data

For a binary classification, I have a dataset with 55% negative label and 45% positive labels. The results of the classifier shows that the accuracy is lower than the f1-score. Does that mean that the model is learning the negative instances much…
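
Not necessarily: with 45% positives, a model that predicts "positive" for everything already scores F1 ≈ 0.62 against an accuracy of 0.45, so accuracy below F1 says little on its own. A tiny check:

from sklearn.metrics import accuracy_score, f1_score

# 45% positives, and a degenerate model that predicts "positive" for every case.
y_true = [1] * 45 + [0] * 55
y_pred = [1] * 100
print("accuracy:", accuracy_score(y_true, y_pred))   # 0.45
print("f1:", f1_score(y_true, y_pred))               # ~0.62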
6 votes · 4 answers

Confusion matrix logic

Can someone explain me the logic behind the confusion matrix? True Positive (TP): prediction is POSITIVE, actual outcome is POSITIVE, result is 'True Positive' - No questions. False Negative (FN): prediction is NEGATIVE, actual outcome is POSITIVE,…
Tauno
6 votes · 1 answer

Kappa From Combined Confusion Matrices

I am trying to evaluate and compare several different machine learning models built with different parameters (i.e. downsampling, outlier removal) and different classifiers (i.e. Bayes Net, SVM, Decision Tree). I am performing a type of cross…
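
Cohen's kappa can be computed directly from any (combined) confusion matrix as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (diagonal sum over the total) and p_e the agreement expected by chance from the row and column marginals. A small sketch:

import numpy as np

def kappa_from_matrix(cm):
    # cm: confusion matrix with rows = actual classes, columns = predicted classes.
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    p_observed = np.trace(cm) / n
    p_expected = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

print(kappa_from_matrix([[50, 0, 0], [0, 47, 3], [0, 4, 46]]))   # illustrative matrix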
5 votes · 1 answer

Suitable metric choice for imbalanced multi-class dataset (classes have equal importance)

What type of metrics I should use to evaluate my classification models, given that I have two imbalanced multi-class datasets (21 and 16 classes, respectively) where all classes have equal importance? I am somehow convinced with macro-averaged-based…
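
When all classes matter equally, macro averaging is the usual choice because it weights every class the same regardless of size; balanced accuracy (the macro-averaged recall) behaves similarly. A sketch with scikit-learn on illustrative labels:

from sklearn.metrics import f1_score, balanced_accuracy_score

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2]   # illustrative imbalanced labels
y_pred = [0, 0, 0, 0, 0, 1, 1, 2, 0]
# Macro averaging treats the rare classes 1 and 2 with the same weight as class 0.
print("macro F1:         ", f1_score(y_true, y_pred, average='macro'))
print("balanced accuracy:", balanced_accuracy_score(y_true, y_pred))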
5 votes · 1 answer

Confusion Matrix three classes python

I want to calculate: True_Positive, False_Positive, False_Negative, True_Negative for three categories. I used to have two classes Cat Dog and this is the way I used to calculate my confusion_matrix y_pred has either a cat or dog y_true has either…
FUN_
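
With more than two classes, the four counts are defined per class, one-vs-rest: TP is the diagonal cell, FP the rest of that column, FN the rest of that row, and TN everything else. A sketch (the bird/cat/dog labels are illustrative):

import numpy as np
from sklearn.metrics import confusion_matrix

y_true = ['cat', 'dog', 'bird', 'cat', 'dog', 'bird']   # illustrative labels
y_pred = ['cat', 'dog', 'cat', 'cat', 'bird', 'bird']
labels = ['bird', 'cat', 'dog']

cm = confusion_matrix(y_true, y_pred, labels=labels)    # rows = actual, columns = predicted
tp = np.diag(cm)
fp = cm.sum(axis=0) - tp    # predicted as the class but actually something else
fn = cm.sum(axis=1) - tp    # actually the class but predicted as something else
tn = cm.sum() - (tp + fp + fn)
for i, label in enumerate(labels):
    print(label, "TP:", tp[i], "FP:", fp[i], "FN:", fn[i], "TN:", tn[i])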
5 votes · 3 answers

Precision and Recall if not binary

I have to calculate precision and recall for a university project to measure the quality of the classification output (with sklearn). Say this would be my results: y_true = [0, 1, 2, 1, 1] y_pred = [0, 2, 1, 2, 1] confusion matrix: [1 0 0] [0 1…
solaire
5 votes · 1 answer

Dealing with unbalanced error rate in confusion matrix

Here is the confusion matrix I got when I was playing with Forest Type Cover Kaggle dataset : Link. In the matrix, light color and higher numbers represent higher error rates, so as you can see, lots of mis-classification happened between class 1…
4 votes · 2 answers

What is done first, cross validation or grid search?

When I have the data set to train a model with SVM, which procedure is performed first, cross validation or grid search? I have read this in a couple of books but I don't know in what order all this should be done. If cross-validation is first…
SRG
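
In practice the two are nested rather than ordered: the grid search tries every candidate parameter setting and scores each one with cross-validation on the training data. With scikit-learn this is a single object; a sketch:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)   # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

param_grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 0.01, 0.1]}
search = GridSearchCV(SVC(), param_grid, cv=5)   # 5-fold CV inside the grid search
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))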