Classification and Regression Evaluation Metrics 

Recently, I wrote articles on classification and regression evaluation metrics. You can refer to the links below for my original articles:

Classification and Regression Evaluation Metrics — Part 1

Classification and Regression Evaluation Metrics — Part 2

I have combined the Part 1 and Part 2 articles and present them here.

Part 1 talks about classification evaluation metrics.

We need to evaluate our machine learning algorithms with the help of various metrics. There are some commonly used metrics for regression and classification problems. We will cover some of these evaluation metrics.

The best way to understand any key concept or problem in machine learning is to implement it in code and analyse the results. I have written the classification example below in Python. We will analyse the results and, along the way, go through the key concepts.

"""
Classification Metrics
Author: Balamurali M
"""

import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn import svm

#Generating a matrix of random 0/1 values: 19 explanatory columns
#plus one response column
matr = np.random.randint(2, size=(100, 20))
print(matr.shape)

#First 80 rows for training, last 20 rows for testing
train_exp = matr[:80, :19]
train_res = matr[:80, 19]
test_exp = matr[80:, :19]
test_act = matr[80:, 19]

class SVM1:
    def __init__(self, train_x, train_y, test_x, test_y):
        self.train_x = train_x
        self.train_y = train_y
        self.test_x = test_x
        self.test_y = test_y

    def SVM_fit(self):
        #Fit a support vector classifier on the training data
        clf = svm.SVC()
        return clf.fit(self.train_x, self.train_y)

matr_exp = SVM1(train_exp, train_res, test_exp, test_act)

fit1 = matr_exp.SVM_fit()
predicted1 = fit1.predict(test_exp)
print('Actual class')
print(test_act)

print('Predicted class')
print(predicted1)

conf_1 = confusion_matrix(test_act, predicted1)  #confusion matrix
print(conf_1)

#true negative, false positive, false negative, true positive
tneg, fpos, fneg, tpos = confusion_matrix(test_act, predicted1).ravel()
print(tneg, fpos, fneg, tpos)

acc_1 = accuracy_score(test_act, predicted1)  #accuracy score
print(acc_1)


To summarize this code:

  1. Generate a random matrix with 100 rows and 20 columns, with values of either 0 or 1. The first 19 columns are the explanatory variables and the 20th column is the response variable
  2. Split the matrix into training and testing data sets. The first 80 rows are for training and the last 20 rows for testing
  3. Perform the classification with a support vector machine
  4. Use the confusion matrix and accuracy score to evaluate the results. (I have used the metrics module from scikit-learn.)

Since the matrix we use is randomly generated, the matrix values and the results will change on every program run. The data we analyse here is from one specific run.
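If you want a run to be reproducible, you can seed NumPy's random number generator before building the matrix. This is optional and not part of the original code; the seed value 42 below is an arbitrary choice.

#Optional: fix the seed so every run produces the same random matrix
np.random.seed(42)
matr = np.random.randint(2, size=(100, 20))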

I ran the code and got the below results.

  1. Actual Class values: [1 0 0 0 1 1 1 1 1 0 0 0 1 1 0 1 1 0 1 0]
  2. Predicted Class values: [1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 1 1 0 1 1]
  3. Confusion Matrix:

[[6 3]
[5 6]]

  4. True Negative, False Positive, False Negative, True Positive: 6, 3, 5, 6 respectively

  5. Accuracy Score: 0.6

We will now try to understand some key concepts and interpret the above results.

a) True Negatives are negative instances correctly classified as negative.

In our example, the second, third, fourth, tenth, eleventh, twelfth, fifteenth, eighteenth and twentieth elements are actually zero. Out of these, the second, third, fourth, eleventh, fifteenth and eighteenth are correctly predicted as zeros. There are 6 true negatives.

b) False Positives are negative instances incorrectly classified as positive.

In our example, the tenth, twelfth and twentieth elements are actually zero but predicted as ones. There are 3 false positives. A False Positive is a Type I error.

c) False Negatives are positive instances incorrectly classified as negative.

In our example, the fifth, seventh, eighth, ninth and fourteenth elements are actually one but predicted as zero. There are 5 false negatives. A False Negative is a Type II error.

d) True Positives are positive instances correctly classified as positive.

In our example, the first, sixth, thirteenth, sixteenth, seventeenth and nineteenth elements are actually one and predicted as one. There are 6 True Positives.
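These four counts are easy to verify with NumPy boolean masks, using the actual and predicted arrays from this run:

import numpy as np

actual = np.array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
pred = np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1])

tn = np.sum((actual == 0) & (pred == 0))  #true negatives
fp = np.sum((actual == 0) & (pred == 1))  #false positives
fn = np.sum((actual == 1) & (pred == 0))  #false negatives
tp = np.sum((actual == 1) & (pred == 1))  #true positives
print(tn, fp, fn, tp)  #prints: 6 3 5 6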

The Confusion Matrix is a matrix in which each row represents the instances of an actual class and each column represents the instances of a predicted class (this is scikit-learn's convention; some references swap the roles of rows and columns).

The result we got earlier was:

[[6 3]
[5 6]]

Placing these values in the Confusion Matrix gives:

                Predicted 0    Predicted 1
Actual 0        6 (TN)         3 (FP)
Actual 1        5 (FN)         6 (TP)

Accuracy is calculated as (TP + TN)/(TP + TN + FP + FN)

In our example, this will be (6+6)/(6+6+5+3) = 0.6

This is exactly the result we got earlier.
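As a quick sanity check, here is the same accuracy computed by hand from the four counts of this run:

#Counts from this run: true negatives, false positives,
#false negatives, true positives
tneg, fpos, fneg, tpos = 6, 3, 5, 6

acc_manual = (tpos + tneg) / (tpos + tneg + fpos + fneg)
print(acc_manual)  #prints: 0.6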

You will also hear the terms sensitivity and specificity very frequently.

Sensitivity or True Positive Rate: TP/(TP + FN). In our example, 6/(6+5) = 0.55. This is the proportion of actual positives that are correctly identified as such.

Specificity or True Negative Rate: TN/(TN + FP). In our example, 6/(6+3) = 0.67. This is the proportion of actual negatives that are correctly identified as such.
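Both rates can also be obtained from scikit-learn: recall_score gives sensitivity directly, and passing pos_label=0 treats the negative class as the class of interest, which yields specificity. The sketch below uses the actual and predicted arrays from this run:

from sklearn.metrics import recall_score
import numpy as np

actual = np.array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
pred = np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1])

sensitivity = recall_score(actual, pred)  #TP/(TP + FN) = 6/11, about 0.55
specificity = recall_score(actual, pred, pos_label=0)  #TN/(TN + FP) = 6/9, about 0.67
print(sensitivity, specificity)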

Part 2 talks about regression evaluation metrics.

We will take a look at two regression evaluation metrics: MAE (Mean Absolute Error) and MSE (Mean Squared Error). I have written the regression example below in Python.

"""
Regression Metrics
Author: Balamurali M
"""
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error

#Generating a matrix of random integers below 10: 4 explanatory
#columns plus one response column
matr = np.random.randint(10, size=(10, 5))

#First 8 rows for training, last 2 rows for testing
train_exp = matr[:8, :4]
train_res = matr[:8, 4]
test_exp = matr[8:, :4]
test_act = matr[8:, 4]

class MLR:
    def __init__(self, train_x, train_y, test_x, test_y):
        self.train_x = train_x
        self.train_y = train_y
        self.test_x = test_x
        self.test_y = test_y

    def fit_pred(self):
        #Fit a multiple linear regression and predict on the test features
        LR = LinearRegression()
        LR.fit(self.train_x, self.train_y)
        return LR.predict(self.test_x)

matr_exp = MLR(train_exp, train_res, test_exp, test_act)
predicted = matr_exp.fit_pred()
print('Actual Value')
print(test_act)

print('Predicted Value')
print(predicted)

mae = mean_absolute_error(test_act, predicted)  #Mean Absolute Error
print(mae)

mse = mean_squared_error(test_act, predicted)  #Mean Squared Error
print(mse)


The code has the following parts:

  1. Generate a random matrix with 10 rows and 5 columns. The first 4 columns are the independent variables x1, x2, x3, x4 and the 5th column is the response variable y.
  2. Split the matrix into training and testing data sets. The first 8 rows are for the training set and the last 2 rows for the testing set.
  3. Perform multiple linear regression and predict the results using the test features (test_exp).
  4. Use the Mean Absolute Error and Mean Squared Error to evaluate the results.

As in the Part 1 classification example, since the matrix is randomly generated, the matrix values and the results will change on every program run. The data we analyse here is from one specific run.

After executing the code, I got the results below:

Actual values (the response y values in the test set, referred to as test_act in the program): 1, 9

Predicted values (from the multiple linear regression, referred to as predicted in the program): 0.399, 6.84

Mean Absolute Error (MAE): 1.377

Mean Squared Error (MSE): 2.499

1. Mean Absolute Error:

The MAE is the average of the absolute errors, where an error is the difference between an actual and a predicted value. We have 2 pairs of actual and predicted values. Calculation: 1/2 x (|1 - 0.399| + |9 - 6.84|) = 1.38

2. Mean Squared Error:

The MSE is the average of the squared errors.

Calculation: 1/2 x ((1 - 0.399)² + (9 - 6.84)²) = 2.5
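Both numbers are easy to verify with a few lines of NumPy, using the actual and predicted values from this run (the small differences from the reported 1.377 and 2.499 are presumably because the predictions shown above are rounded):

import numpy as np

actual = np.array([1, 9])
predicted = np.array([0.399, 6.84])

errors = actual - predicted
mae = np.mean(np.abs(errors))  #(0.601 + 2.16) / 2, about 1.38
mse = np.mean(errors ** 2)  #(0.601**2 + 2.16**2) / 2, about 2.51
print(mae, mse)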

Hope this article was helpful to you. Please post your comments and connect with me on Twitter and LinkedIn. Thank you.

 
