

The following function plots both the non-normalized and the normalized confusion matrix. It relies on the generate_confusion_matrix() plotting helper and on the class_names list loaded with the dataset.

    def plot_confusion_matrix(predicted_labels_list, y_test_list):
        cnf_matrix = confusion_matrix(y_test_list, predicted_labels_list)
        np.set_printoptions(precision=2)

        # Plot non-normalized confusion matrix
        plt.figure()
        generate_confusion_matrix(cnf_matrix, classes=class_names,
                                  title='Confusion matrix, without normalization')
        plt.show()

        # Plot normalized confusion matrix
        plt.figure()
        generate_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                                  title='Normalized confusion matrix')
        plt.show()
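The normalize=True option is handled inside generate_confusion_matrix(), which is not shown in this section. A common convention, assumed here rather than confirmed by the post, is to divide each row of the count matrix by its row total, so that every row shows the fraction of each actual class assigned to each predicted class. The sketch below illustrates that step only; normalize_rows and the example counts are hypothetical names and values chosen for illustration.

```python
import numpy as np


def normalize_rows(cnf_matrix):
    """Divide each row by its total so rows hold per-class proportions."""
    cnf_matrix = np.asarray(cnf_matrix, dtype=float)
    row_totals = cnf_matrix.sum(axis=1, keepdims=True)
    # Guard against classes with no true samples (row total of zero).
    safe_totals = np.where(row_totals == 0, 1.0, row_totals)
    return cnf_matrix / safe_totals


# Hypothetical 3-class count matrix (rows: actual class, columns: predicted class).
example_counts = np.array([[50, 0, 0],
                           [0, 47, 3],
                           [0, 2, 48]])

example_normalized = normalize_rows(example_counts)
print(np.round(example_normalized, 2))
```

Row normalization makes classes of different sizes comparable, since each diagonal entry then reads as the recall of that class.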

Here we fit a Support Vector Machine to our dataset. The trick: in each fold, we take the actual test labels (test_y) and the predicted labels (predicted_labels) and append them to the actual_targets and predicted_targets arrays, respectively. We will then feed these two arrays to the plot_confusion_matrix() function.

    def evaluate_model(data_x, data_y):
        k_fold = KFold(10, shuffle=True, random_state=1)

        # Start from empty arrays and grow them fold by fold
        predicted_targets = np.array([])
        actual_targets = np.array([])

        for train_ix, test_ix in k_fold.split(data_x):
            # Select the training and test rows of this fold
            train_x, train_y = data_x[train_ix], data_y[train_ix]
            test_x, test_y = data_x[test_ix], data_y[test_ix]

            # Fit the classifier
            classifier = svm.SVC().fit(train_x, train_y)

            # Predict the labels of the test set samples
            predicted_labels = classifier.predict(test_x)

            predicted_targets = np.append(predicted_targets, predicted_labels)
            actual_targets = np.append(actual_targets, test_y)

        return predicted_targets, actual_targets
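One way to check the concatenation trick is to compare it with summing the per-fold confusion matrices: both should give the same counts, because every sample lands in exactly one test fold. The sketch below is a self-contained check under that reasoning, not part of the original post; summed_matrix, concatenated_matrix and all_labels are names introduced here for illustration.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold

iris = datasets.load_iris()
data_x, data_y = iris.data, iris.target
all_labels = np.unique(data_y)

k_fold = KFold(10, shuffle=True, random_state=1)
summed_matrix = np.zeros((len(all_labels), len(all_labels)), dtype=np.int64)
predicted_targets, actual_targets = [], []

for train_ix, test_ix in k_fold.split(data_x):
    classifier = svm.SVC().fit(data_x[train_ix], data_y[train_ix])
    predicted_labels = classifier.predict(data_x[test_ix])

    # labels= keeps every per-fold matrix 3x3 even if a class is missing from a fold.
    summed_matrix += confusion_matrix(data_y[test_ix], predicted_labels, labels=all_labels)

    # Collect the labels of this fold for the concatenated matrix.
    predicted_targets.extend(predicted_labels)
    actual_targets.extend(data_y[test_ix])

concatenated_matrix = confusion_matrix(actual_targets, predicted_targets, labels=all_labels)
print(concatenated_matrix)
print(np.array_equal(concatenated_matrix, summed_matrix))
```

Concatenating labels is the simpler of the two routes, since it needs no bookkeeping about which classes appear in which fold.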

This blog mainly aims to generate a concatenated confusion matrix while using cross-validation. We will use the IRIS dataset for our implementation. This is a simple dataset covering three species of iris flowers: Iris Setosa, Iris Versicolour and Iris Virginica. We will first load the dataset and investigate its properties.

    import itertools
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn import svm, datasets
    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import KFold

    # import IRIS dataset to play with
    iris = datasets.load_iris()
    data = iris.data
    target = iris.target
    class_names = iris.target_names

We can retrieve the statistical information of the dataset, namely the distinct class labels and the number of samples per class, with the following snippet.

    labels, counts = np.unique(target, return_counts=True)

We will evaluate our model by K-fold cross-validation with 10 folds. The evaluation code splits our dataset into training and test folds and evaluates the model 10 times, once per fold.
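To make the dataset statistics and the 10-fold split concrete, the sketch below prints the number of samples per class and the size of each training and test fold. It is an illustrative addition that uses only np.unique and KFold.split; fold_sizes is a name introduced here.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import KFold

iris = datasets.load_iris()

# Distinct class labels and the number of samples in each class.
labels, counts = np.unique(iris.target, return_counts=True)
for label, count in zip(labels, counts):
    print(f"{iris.target_names[label]}: {count} samples")

# Same splitter settings as in the post: 10 shuffled folds with a fixed seed.
k_fold = KFold(10, shuffle=True, random_state=1)
fold_sizes = [(len(train_ix), len(test_ix))
              for train_ix, test_ix in k_fold.split(iris.data)]
print(fold_sizes)
```

With 150 samples and 10 folds, each iteration trains on 135 samples and tests on the remaining 15.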

After we pre-process the data and develop our model, we need to evaluate its effectiveness. The confusion matrix is an evaluation metric with which we can assess the performance of a machine learning model. However, when we run cross-validation (e.g. leave-one-out or k-fold cross-validation) to estimate how well the model generalizes to independent data, we may need to generate an average accuracy or a single confusion matrix across all folds. As an aside for R users: you can use the rect function in R to lay out the confusion matrix, wrapped in a function that takes the cm object created by the caret package and produces the visual.
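For the Python workflow used in the rest of this post, scikit-learn also offers built-in helpers for both quantities mentioned above. The sketch below is an alternative to the manual fold loop, not the method the post itself follows: cross_val_score returns one accuracy per fold, which can be averaged, and cross_val_predict returns out-of-fold predictions that can be passed to confusion_matrix in one call. The classifier and splitter settings are assumed to match those used elsewhere in the post.

```python
from sklearn import datasets, svm
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import KFold, cross_val_predict, cross_val_score

iris = datasets.load_iris()
k_fold = KFold(10, shuffle=True, random_state=1)

# One accuracy value per fold; the mean summarizes them in a single number.
fold_accuracies = cross_val_score(svm.SVC(), iris.data, iris.target, cv=k_fold)
print("Mean accuracy:", fold_accuracies.mean())

# Each sample is predicted by the model trained on the folds that exclude it.
out_of_fold_predictions = cross_val_predict(svm.SVC(), iris.data, iris.target, cv=k_fold)
pooled_matrix = confusion_matrix(iris.target, out_of_fold_predictions)
print(pooled_matrix)
```

The manual loop remains useful when you need access to each fitted classifier or want to log per-fold details.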
