Sklearn Confusion Matrix in Machine Learning With Examples

Sklearn (scikit-learn) is an open-source Python library used for machine learning and data analysis. It provides many useful algorithms and evaluation metrics for machine learning models, including the sklearn confusion matrix.

The sklearn confusion matrix is a matrix used to evaluate the performance of a classification model, that is, a model that predicts discrete output values. The matrix counts the actual values against the predicted values, which helps us understand how good the model is at predicting. In this article, we will learn how to use the sklearn confusion matrix to evaluate the performance of classification models. Moreover, we will also learn how to read the confusion matrix and how it actually works.


How to understand the Confusion matrix in Machine learning?

As we already discussed, a confusion matrix in machine learning is a tool used to summarize the performance of classification algorithms. It shows how good the model’s predictions were by comparing the actual values with the predicted values.

As you know, in supervised machine learning a model is trained on training data that contains the actual output values, and then it is tested using the testing data. The confusion matrix compares the model’s predicted values on the testing data to the actual output values of the testing data, which helps us understand how well our model is performing.

Another important feature of the sklearn confusion matrix is that it can be used to calculate performance metrics for classification algorithms. For example, we can use the confusion matrix to calculate accuracy, precision, recall, and f1-score, which we will discuss in detail in the upcoming sections. For now, let us understand the structure of the confusion matrix.

[Figure: structure of the confusion matrix]

You might be confused by looking at the structure of the confusion matrix. Don’t worry, we are going to explain everything now.

Basically, a confusion matrix is a table that counts the model’s correctly and incorrectly predicted values for each of the output classes. That means for a binary classification problem, the size of the confusion matrix will be 2×2 as shown above, and for a classification problem that has three output classes, the size will be 3×3, and so on.
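To make the structure concrete, here is a minimal sketch of how such a matrix could be built by hand in plain Python. The helper `build_confusion_matrix` and the toy labels are hypothetical and only for illustration; this is not how sklearn implements it. Rows are taken to be the actual classes and columns the predicted classes, which is the same orientation sklearn uses.

```python
def build_confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs; rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Two classes give a 2x2 matrix.
binary = build_confusion_matrix(
    actual=["cat", "dog", "cat", "dog"],
    predicted=["cat", "dog", "dog", "dog"],
    labels=["cat", "dog"],
)

# Three classes give a 3x3 matrix.
three_class = build_confusion_matrix(
    actual=["a", "b", "c", "a"],
    predicted=["a", "c", "c", "a"],
    labels=["a", "b", "c"],
)

print(binary)       # [[1, 1], [0, 2]]
print(three_class)  # [[2, 0, 0], [0, 0, 1], [0, 0, 1]]
```

Every correct prediction lands on the main diagonal, and every mistake lands somewhere off it.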

What is a True Positive in the Confusion Matrix?

A True Positive value, represented as TP in a confusion matrix, is the number of positive-class samples that the model correctly predicted as positive. In other words, TP represents the model’s correct classifications for the positive values.

Let us take an example to understand the concept of True Positive values in the confusion matrix. Let us assume that we have binary classification data about dogs and cats and we want our model to classify the images of dogs and cats. In this case, we will take the images of cats as the positive class and the images of dogs as the negative class. If we provide an image of a cat as testing data and the model predicts the output as a cat, then we count it as a TP value because our model has predicted the positive output class correctly.

[Figure: true positive example, actual cat predicted as cat]

As you can see in the above figure, the actual value was a cat and the model also predicted the value to be a cat.

What is the True Negative in the Confusion Matrix?

A True Negative value, represented as TN in a confusion matrix, is the number of negative-class samples that the model correctly predicted as negative. In other words, TN represents the model’s correct classifications for the negative values.

Let us take the same example to understand the concept of True Negative values in the confusion matrix. Again, we have binary classification data about dogs and cats, with the images of dogs as the negative class and the images of cats as the positive class. If we provide an image of a dog as testing data and the model also predicts the output as a dog, then we count it as a TN value because our model has predicted the negative output class correctly.

[Figure: true negative example, actual dog predicted as dog]

As you can see, the actual value and the predicted value are the same, and they both belong to the negative class.

What is a False Positive in the Confusion Matrix?

A False Positive value, represented as FP in a confusion matrix, is the number of negative-class samples that the model incorrectly predicted as positive. In other words, FP represents the model’s incorrect classifications where the actual value was negative but the prediction was positive. It might seem confusing, so let us take the example of dogs and cats again to understand the concept.

In our case, cats were the positive class and dogs were the negative class. Let us say we provide the image of a dog as testing data to the model, but it predicts it to be a cat; then it counts as a false positive value. In other words, the actual value is not positive but the model predicted it to be positive, which is why it is called a False Positive.

[Figure: false positive example, actual dog predicted as cat]

As you can see, the actual value/image was a dog and the model predicted it to be a cat.

What is a False Negative in the Confusion Matrix?

A False Negative value, represented as FN in a confusion matrix, is the number of positive-class samples that the model incorrectly predicted as negative. In other words, FN represents the model’s incorrect classifications where the actual value was positive but the prediction was negative.

Let us again take the same example of the dog and cat images to understand the False Negative values in the confusion matrix. In our case, the dogs were the negative class and the cats were the positive class. If we use a cat image as testing data and the model predicts it as a dog, then it is a False Negative, which means the data does not actually belong to the negative class but the model has predicted it to be negative.

[Figure: false negative example, actual cat predicted as dog]

As you can see, the actual value was a cat and the model predicted it to be a dog.
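The four counts can also be read straight out of sklearn. The sketch below uses a small set of made-up cat/dog labels (hypothetical data, not from a real model). It relies on the fact that sklearn puts actual classes on the rows and predicted classes on the columns, so listing the negative class first makes the flattened 2×2 matrix read TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical test labels: "cat" is the positive class, "dog" the negative class.
actual = ["cat", "cat", "cat", "dog", "dog", "dog", "dog", "cat"]
predicted = ["cat", "dog", "cat", "dog", "cat", "dog", "dog", "cat"]

# Negative class first, so the matrix is [[TN, FP], [FN, TP]].
cm = confusion_matrix(actual, predicted, labels=["dog", "cat"])
tn, fp, fn, tp = cm.ravel()

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1
```

If the `labels` argument is left out, sklearn sorts the labels, so which class ends up in the "positive" corner depends on the label values. Passing `labels` explicitly avoids that ambiguity.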

Calculations using Confusion Matrix in Machine learning

A confusion matrix can be used to calculate various classification metrics to evaluate the performance of the model. For example, apart from giving a summary of the predictions of the model, the confusion matrix can be used to calculate the accuracy, precision, recall, and f1-score for the model. To understand these evaluation metrics, let us take a simple confusion matrix for binary classification, and then we will calculate the values of the different metrics.

[Figure: sample confusion matrix for binary classification]

In the above matrix:

  • True Positive values = 35
  • True Negative values = 44
  • False Positive values = 5
  • False Negative values = 3

Accuracy and Precision Using Sklearn Confusion Matrix

Accuracy is the fraction of all predictions that the model got right. In classification problems, it tells us how well our model categorizes the samples across all classes taken together. We can find the accuracy of the model by using the following formula.

Accuracy = (TP + TN ) / ( TP + TN + FP + FN)

Let us also see the accuracy formula in visualized form to understand it better.

[Figure: accuracy formula visualized on the confusion matrix]

Let us use the formula to find the accuracy of the sample confusion matrix given above.

Accuracy = ( 35 + 44) / ( 35 + 44 + 5 + 3 )

Accuracy = ( 79 ) / (87)

Accuracy = 0.908
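The same arithmetic can be written as a few lines of Python, using the counts from the sample confusion matrix above. This is just the formula spelled out, with no library involved.

```python
# Counts taken from the sample confusion matrix above.
tp, tn, fp, fn = 35, 44, 5, 3

correct = tp + tn            # predictions on the main diagonal
total = tp + tn + fp + fn    # all predictions
accuracy = correct / total

print(f"{correct} / {total} = {accuracy:.3f}")  # 79 / 87 = 0.908
```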

Now, let us see how precision is different from accuracy, because many people confuse the two. Precision tells us how trustworthy the model’s positive predictions are. It is defined as the number of true positive predictions divided by the total number of positive predictions, that is, true positives plus false positives.

The following formula is used to find the precision of the model.

Precision = (TP ) / ( TP + FP )

Let us also visualize the precision formula so that you can easily differentiate between accuracy and precision.

[Figure: precision formula visualized on the confusion matrix]

Let us now calculate the precision using the above formula.

Precision = 35 / ( 35 + 5 )

Precision = 0.875
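To see why precision and accuracy should not be confused, consider a hypothetical, heavily imbalanced case. The counts below are invented for illustration and are not from the sample matrix above. Most samples are negative and the model gets them right, so accuracy looks excellent even though most of its positive predictions are wrong.

```python
# Hypothetical counts for an imbalanced problem (not the sample matrix from the article).
tp, fp, tn, fn = 1, 4, 94, 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)

print(f"accuracy={accuracy:.2f} precision={precision:.2f}")  # accuracy=0.95 precision=0.20
```

Accuracy counts every correct prediction, including the many easy negatives, while precision only looks at the samples the model labelled as positive.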

Recall and f1-score Using Sklearn Confusion Matrix

Now let us calculate the recall and f1-score from the confusion matrix as we did for the precision and accuracy score. The recall measures how well our model detects the actual positive values. It is calculated by dividing the number of true positives by the sum of true positives and false negatives, that is, by the total number of actual positives. It uses the following formula.

Recall = TP / ( TP + FN )

Let us also visualize the formula of recall.

[Figure: recall formula visualized on the confusion matrix]

Let us calculate the recall value as well.

Recall = 35 / ( 35 + 3 )

Recall = 0.921

The F1 Score, which takes both false positives and false negatives into account, is the harmonic mean of Precision and Recall. Although F1 is usually more useful than accuracy, especially if we have an unequal class distribution, it is not as intuitive to understand as accuracy. F1-score has the following formula.

F1-score = 2 *( ( precision * recall ) / ( precision + recall))

Let us calculate the f1-score using precision = 35 / 40 = 0.875 and recall = 35 / 38 = 0.921.

f1-score = 2 * (( 0.875 * 0.921 ) / ( 0.875 + 0.921 ))

f1-score = 2 * (( 0.8059 ) / ( 1.796 ))

f1-score = 0.897
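Hand calculations like these are easy to get wrong, so it can help to cross-check them with sklearn’s own metric functions. The sketch below rebuilds label lists that reproduce the sample counts (TP = 35, TN = 44, FP = 5, FN = 3). The lists are a construction for illustration, not real model output.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

tp, tn, fp, fn = 35, 44, 5, 3

# Rebuild labels that produce exactly these counts (1 = positive, 0 = negative).
y_true = [1] * tp + [1] * fn + [0] * fp + [0] * tn
y_pred = [1] * tp + [0] * fn + [1] * fp + [0] * tn

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

By default these functions treat label 1 as the positive class, which matches the way the lists were built.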

Sklearn Confusion Matrix Examples

Now we will compute the confusion matrix from the predicted values and the actual values. But before that, we have to train a classification model and predict the output values. In this case, we will use the KNN algorithm for classification. The KNN algorithm is one of the simplest and most popular classification models in machine learning and can be used for both binary and multi-class classification problems. We will use the KNN algorithm for binary and multi-class classification and then learn how we can use the sklearn confusion matrix to evaluate the performance of the model.

Before going into the implementation part, make sure that you have installed the sklearn module on your system (for example, with pip install scikit-learn).

Sklearn Confusion Matrix for Binary Classification

Binary classification is a classification problem where we have only two possible output classes. In this section, we will use the KNN algorithm on a binary dataset to train the model. We will use the dataset from the sklearn submodule ‘datasets’ which you can easily import.

We will not spend too much time on how the KNN algorithm works and how to train the model because we already covered it in the previous articles. In this article, our main focus is to understand how to apply sklearn confusion matrix to evaluate the model.

Let us import the required modules, train the model on the training dataset, and then use the testing data to make predictions.

# importing the required modules
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split


# loading the data
cancer = load_breast_cancer()

# split dataset into training and testing parts
X_train, X_test, Y_train, Y_test = train_test_split(cancer.data, cancer.target, test_size=0.25)

# initializing the knn model
knn = KNeighborsClassifier()

# training the knn model
knn.fit(X_train, Y_train)

# making predictions on the testing dataset
y_pred = knn.predict(X_test)

As you can see, we first load the dataset and then split the dataset into testing and training parts. Next, we train the model on the training dataset and then use the testing dataset to make predictions.

Let us now use the confusion matrix to visualize how good the predictions were in this case.

# importing matplotlib
import matplotlib.pyplot as plt

# importing confusion matrix from sklearn 
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# computing the confusion matrix
cm = confusion_matrix(Y_test, y_pred, labels=knn.classes_)

# plotting with labels
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=knn.classes_)
disp.plot()

# showing the matrix
plt.show()

Output:

[Figure: confusion matrix plot for the binary classification model]

The simplest way to read a confusion matrix is that every value on the main diagonal represents correct classifications, while all the other values represent incorrect classifications. In our case, it shows that the model misclassified 10 values while it correctly classified all the others. Because train_test_split shuffles the data randomly, the exact counts can differ from run to run.
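The diagonal reading can be checked numerically. The following is a minimal, self-contained sketch that repeats the training steps and then sums the diagonal and off-diagonal cells. The `random_state=0` argument is an assumption added here so the split is repeatable; it is not part of the code above, so the counts will not necessarily match the figure.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

cancer = load_breast_cancer()

# random_state=0 is an assumption made for repeatability.
X_train, X_test, Y_train, Y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.25, random_state=0
)

knn = KNeighborsClassifier().fit(X_train, Y_train)
y_pred = knn.predict(X_test)

cm = confusion_matrix(Y_test, y_pred, labels=knn.classes_)

# Rows are actual classes, columns are predicted classes.
# In this dataset 0 = malignant and 1 = benign, so class 1 sits in the "positive" corner.
tn, fp, fn, tp = cm.ravel()

correct = cm.trace()                # sum of the main diagonal
misclassified = cm.sum() - correct  # everything off the diagonal

print(cancer.target_names)
print(cm)
print("correct:", correct, "misclassified:", misclassified)
```

Note that benign (label 1) ends up as the positive class here simply because of how the labels are encoded. If malignant should count as positive for your analysis, read the cells accordingly.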

Let us also calculate the classification report using the sklearn module. The classification report contains the accuracy score, precision, recall, and f1-score.

# finding the whole report
from sklearn.metrics import classification_report
print(classification_report(Y_test, y_pred))

Output:

[Figure: classification report for the binary classification model]

As you can see, the classification report has calculated the values for all the metrics.
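If you need the numbers rather than the printed table, `classification_report` can also return a dictionary, and it accepts readable class names. The sketch below is self-contained and again assumes `random_state=0` for repeatability, which is not part of the code above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

cancer = load_breast_cancer()

# random_state=0 is an assumption made for repeatability.
X_train, X_test, Y_train, Y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.25, random_state=0
)
y_pred = KNeighborsClassifier().fit(X_train, Y_train).predict(X_test)

# target_names replaces the 0/1 labels with "malignant"/"benign";
# output_dict=True returns nested dictionaries instead of a formatted string.
report = classification_report(
    Y_test,
    y_pred,
    target_names=cancer.target_names.tolist(),
    output_dict=True,
)

print("accuracy:", round(report["accuracy"], 3))
for name in cancer.target_names.tolist():
    scores = report[name]
    print(name, {k: round(v, 3) for k, v in scores.items() if k != "support"})
```

The dictionary form is convenient when you want to log or compare metrics across several models.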

Sklearn Confusion Matrix for Multiclass Classification

Now, we will use the sklearn confusion matrix for multiclass classification. Multiclass classification is a classification problem where we have more than two possible output classes. In this case, we will again use the KNN algorithm.

Let us again load a multi-class dataset, train the model and then test the model using the testing dataset.

# importing the required modules
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# loading iris dataset
iris = datasets.load_iris()

# splitting dataset into testing and training parts
X_train, X_test, Y_train, Y_test = train_test_split(iris.data, iris.target, test_size=0.25)

# training the knn model on training parts
knn = KNeighborsClassifier()
knn.fit(X_train, Y_train)

# making predictions on the testing dataset
y_pred = knn.predict(X_test)

Now the next step is to visualize the confusion matrix in order to see the performance of the model.

# importing matplotlib
import matplotlib.pyplot as plt

# importing confusion matrix from sklearn 
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# computing the confusion matrix
cm = confusion_matrix(Y_test, y_pred, labels=knn.classes_)

# plotting with labels
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=knn.classes_)
disp.plot()

# showing the matrix
plt.show()

Output:

[Figure: confusion matrix plot for the multiclass model]

As you can see, only one value was incorrectly classified by the model while the rest were classified correctly.
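With more than two classes there is no single TP or FP, but the same ideas apply per class: the diagonal cell is that class’s true positives, the row sum is how many samples actually belong to the class, and the column sum is how many were predicted as that class. The sketch below computes per-class precision and recall from the matrix with NumPy and compares them with sklearn’s own functions. The `random_state=0` and `stratify` arguments are assumptions added for repeatability and are not part of the code above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# random_state and stratify are assumptions made for a repeatable, balanced split.
X_train, X_test, Y_train, Y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)
knn = KNeighborsClassifier().fit(X_train, Y_train)
y_pred = knn.predict(X_test)

cm = confusion_matrix(Y_test, y_pred, labels=knn.classes_)

true_positives = np.diag(cm)
recall_per_class = true_positives / cm.sum(axis=1)     # row sums = actual counts
precision_per_class = true_positives / cm.sum(axis=0)  # column sums = predicted counts

for name, p, r in zip(iris.target_names, precision_per_class, recall_per_class):
    print(f"{name:<10} precision={p:.2f} recall={r:.2f}")

# Cross-check against sklearn's per-class scores (average=None returns one value per class).
print(np.allclose(precision_per_class, precision_score(Y_test, y_pred, average=None)))
print(np.allclose(recall_per_class, recall_score(Y_test, y_pred, average=None)))
```

One caveat: if a class is never predicted, its column sum is zero and the manual precision division is undefined, so this shortcut assumes every class appears at least once among the predictions.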

Summary

A confusion matrix in machine learning is a tool to evaluate the performance of classification models. It arranges the actual values and the predicted values in the form of a matrix, which helps us understand how the model is performing for each of the classes. The simplest way to read a confusion matrix is that all the values on the main diagonal represent correctly classified values, while all the others represent incorrectly classified ones. In this article, we discussed the confusion matrix in detail by examining its structure. Moreover, we used the sklearn module to compute the confusion matrix for binary and multiclass classification.
