Recently, I wrote two articles on classification and regression evaluation metrics. You can refer to the links below for the original articles:

Classification and Regression Evaluation Metrics — Part 1

Classification and Regression Evaluation Metrics — Part 2

I have combined the Part 1 and Part 2 articles and present them here.

**Part 1 talks about classification evaluation metrics.**

We need to evaluate our machine learning algorithms with the help of various metrics. There are some commonly used metrics for regression and classification problems. We will cover some of these evaluation metrics.

The best way to understand any key concept or problem in machine learning is to implement it in code and analyse the results. I have written the classification example below in Python. We will analyse the results and go through the key concepts along the way.

```python
"""
Classification Metrics
Author: Balamurali M
"""
import numpy as np
from sklearn import svm
from sklearn.metrics import confusion_matrix, accuracy_score

# Generate a matrix of random 0/1 explanatory and response variables
matr = np.random.randint(2, size=(100, 20))
print(matr.shape)

# First 80 rows for training, last 20 rows for testing.
# Columns 0-18 are the explanatory variables, column 19 is the response
# (taken as a 1-D array, which is the shape scikit-learn expects for labels).
train_exp = matr[:80, :19]
train_res = matr[:80, 19]
test_exp = matr[80:, :19]
test_act = matr[80:, 19]


class SVM1:
    def __init__(self, w1, x1, y1, z1):
        self.w1 = w1  # training features
        self.x1 = x1  # training labels
        self.y1 = y1  # test features
        self.z1 = z1  # test labels

    def SVM_fit(self):
        a1 = svm.SVC()
        return a1.fit(self.w1, self.x1)


matr_exp = SVM1(train_exp, train_res, test_exp, test_act)
fit1 = matr_exp.SVM_fit()
predicted1 = fit1.predict(test_exp)

print('Actual class')
print(test_act)
print('Predicted class')
print(predicted1)

# Confusion matrix
conf_1 = confusion_matrix(test_act, predicted1)
print(conf_1)

# True negative, false positive, false negative, true positive
tneg, fpos, fneg, tpos = conf_1.ravel()
print(tneg, fpos, fneg, tpos)

# Accuracy score
acc_1 = accuracy_score(test_act, predicted1)
print(acc_1)
```

Link to the above code.

To summarize this code:

- Generate a random matrix with 100 rows and 20 columns, with values of either 0 or 1. The first 19 columns are the explanatory variables and the 20th column is the response variable.
- Split the matrix into training and testing data sets. The first 80 rows are for training and the last 20 rows are for testing.
- Perform the classification with a support vector machine.
- Use the confusion matrix and accuracy score to evaluate the results. (I have used the metrics module from scikit-learn.)

Since the **matrix** we use is a **randomly generated** one, the matrix values and the results will change on every program run. The data we analyse here is from a **specific run**.
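If you want the same matrix on every run, one option is to seed the random number generator before generating the data. The sketch below is not part of the original program: it uses NumPy's `default_rng` rather than `np.random.randint`, and the seed value 42 is an arbitrary assumption. A seeded run will not reproduce the specific numbers analysed in this article.

```python
import numpy as np

# Seed value is arbitrary (an assumption for illustration only)
rng = np.random.default_rng(42)
matr_a = rng.integers(0, 2, size=(100, 20))  # values are 0 or 1

# Re-creating the generator with the same seed gives the same matrix
rng = np.random.default_rng(42)
matr_b = rng.integers(0, 2, size=(100, 20))

print(np.array_equal(matr_a, matr_b))
```

With a fixed seed, the train/test split and therefore the metrics stay the same between runs, which makes results easier to compare.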

I ran the code and got the **results** below.

- Actual Class values: [1 0 0 0 1 1 1 1 1 0 0 0 1 1 0 1 1 0 1 0]
- Predicted Class values: [1 0 0 0 0 1 0 0 0 1 0 1 1 0 0 1 1 0 1 1]
- Confusion Matrix: [[6 3], [5 6]]
- True Negative, False Positive, False Negative, True Positive: 6, 3, 5, 6 respectively
- Accuracy Score: 0.6

We will now try to understand some key concepts and interpret the above results.

a) **True Negatives** are the actual negatives that are correctly classified as negative.

In our example, the second, third, fourth, tenth, eleventh, twelfth, fifteenth, eighteenth and twentieth elements are actually zero. Out of these the second, third, fourth, eleventh, fifteenth, eighteenth are correctly predicted as zeros. There are 6 true negatives.

b) **False Positives** are the actual negatives that are incorrectly classified as positive.

In our example, the tenth, twelfth and twentieth elements are actually zero but predicted as ones. There are 3 false positives. A False Positive is a **Type I** error.

c) **False Negatives** are the actual positives that are incorrectly classified as negative.

In our example, the fifth, seventh, eighth, ninth and fourteenth elements are actually one but predicted as zeros. There are 5 false negatives. A False Negative is a **Type II** error.

d) **True Positives** are the actual positives that are correctly classified as positive.

In our example, the first, sixth, thirteenth, sixteenth, seventeenth and nineteenth elements are actually one and predicted as one. There are 6 True Positives.
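The four counts above can be checked directly from the actual and predicted values of this run. This is a minimal sketch using NumPy boolean masks; the variable names are my own and the arrays are copied from the results listed earlier.

```python
import numpy as np

# Actual and predicted classes from the specific run analysed in this article
actual = np.array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
predicted = np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1])

tn = int(np.sum((actual == 0) & (predicted == 0)))  # true negatives
fp = int(np.sum((actual == 0) & (predicted == 1)))  # false positives (Type I)
fn = int(np.sum((actual == 1) & (predicted == 0)))  # false negatives (Type II)
tp = int(np.sum((actual == 1) & (predicted == 1)))  # true positives

print(tn, fp, fn, tp)  # 6 3 5 6
```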

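The **Confusion Matrix** is a matrix where each row represents the actual class instances while each column represents the predicted class instances. This is the layout that scikit-learn's confusion_matrix returns; some references use the transposed layout, with predicted classes in the rows.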
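To make the row and column convention concrete, here is a small sketch that builds the 2 x 2 matrix by hand, using the actual class as the row index and the predicted class as the column index. It is an illustration of the layout, not how scikit-learn implements it internally.

```python
import numpy as np

actual = np.array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
predicted = np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1])

# Rows: actual class (0, 1). Columns: predicted class (0, 1).
conf = np.zeros((2, 2), dtype=int)
for a, p in zip(actual, predicted):
    conf[a, p] += 1

print(conf)
# [[6 3]
#  [5 6]]
```

Reading the matrix row by row gives TN, FP, FN, TP, which is the order that `ravel()` returns in the program above.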

The result we got earlier was:

[[6 3]

[5 6]]

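We will put these values in the **Confusion Matrix** as shown below.

| | Predicted 0 | Predicted 1 |
|---|---|---|
| **Actual 0** | TN = 6 | FP = 3 |
| **Actual 1** | FN = 5 | TP = 6 |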

**Accuracy** is calculated as (TP + TN)/(TP + TN + FP + FN)

In our example, this will be (6+6)/(6+6+5+3) = 0.6

This is exactly the result we got earlier.
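The same number can be reproduced in two ways: from the four counts, and as the fraction of positions where the actual and predicted classes agree. A minimal sketch, with variable names of my own choosing:

```python
import numpy as np

actual = np.array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0])
predicted = np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1])

# From the confusion matrix counts of this run
tn, fp, fn, tp = 6, 3, 5, 6
acc_from_counts = (tp + tn) / (tp + tn + fp + fn)

# As the share of matching predictions
acc_from_arrays = float(np.mean(actual == predicted))

print(acc_from_counts, acc_from_arrays)
```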

You will also hear the terms **sensitivity** and **specificity** very frequently.

**Sensitivity** or **True Positive Rate**: TP/(TP + FN). In our example 6/(6+5) = 0.55. This is the proportion of the actual positives that are correctly identified as such.

**Specificity** or **True Negative Rate**: TN/(TN + FP). In our example 6/(6+3) = 0.67. This is the proportion of the actual negatives that are correctly identified as such.
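Both rates follow directly from the confusion matrix counts. The helper functions below are hypothetical names introduced only for this sketch.

```python
def sensitivity(tp, fn):
    """True positive rate: share of actual positives predicted as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: share of actual negatives predicted as negative."""
    return tn / (tn + fp)


# Counts from the specific run analysed in this article
tn, fp, fn, tp = 6, 3, 5, 6

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(round(sens, 2), round(spec, 2))  # 0.55 0.67
```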

**Part 2 talks about regression evaluation metrics.**

We will take a look at two regression evaluation metrics: MAE (Mean Absolute Error) and MSE (Mean Squared Error). I have coded the regression example below in Python.

```python
"""
Regression Metrics
Author: Balamurali M
"""
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error

# Generate a matrix with explanatory and response variables
matr = np.random.randint(10, size=(10, 5))

# First 8 rows for training, last 2 rows for testing.
# Columns 0-3 are the explanatory variables, column 4 is the response.
train_exp = matr[:8, :4]
train_res = matr[:8, 4:]
test_exp = matr[8:, :4]
test_act = matr[8:, 4:]


class MLR:
    def __init__(self, w1, x1, y1, z1):
        self.w1 = w1  # training features
        self.x1 = x1  # training response
        self.y1 = y1  # test features
        self.z1 = z1  # test response

    def fit_pred(self):
        LR = LinearRegression()
        LR.fit(self.w1, self.x1)
        return LR.predict(self.y1)


matr_exp = MLR(train_exp, train_res, test_exp, test_act)
predicted = matr_exp.fit_pred()

print('Actual Value')
print(test_act)
print('Predicted Value')
print(predicted)

# Mean Absolute Error
mae = mean_absolute_error(test_act, predicted)
print(mae)

# Mean Squared Error
mse = mean_squared_error(test_act, predicted)
print(mse)
```

Link to the above code

The code has the following parts:

- Generate a random matrix with 10 rows and 5 columns. The first 4 columns are the independent variables x1, x2, x3, x4 and the 5th column is the response variable y.
- Split the matrix into training and testing data sets. The first 8 rows are for the training set and the last 2 rows are for the testing set.
- Perform multiple linear regression and predict the results using the test features (test_exp).
- Use the Mean Absolute Error and Mean Squared Error to evaluate the results.

As in the Part 1 classification example, the matrix we use is randomly generated, so the matrix values and the results will change on every program run. The data we analyse here is from a specific run.

After **executing** the code, I got the results below:

Actual values (**response y** values in the **test** data set, referred to as test_act in the program): 1, 9

Predicted values (**predicted** values from multiple linear regression, referred to as predicted in the program): 0.399, 6.84

Mean Absolute Error (MAE): 1.377

Mean Squared Error (MSE): 2.499

**Mean Absolute Error**:

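The MAE is the average of the absolute errors, where an error is the difference between an actual value and the corresponding predicted value. We have 2 pairs of actual and predicted values. Calculation: 1/2 x (|1 - 0.399| + |9 - 6.84|) = 1.38. The small difference from the reported 1.377 arises because this calculation uses the rounded predicted values shown above.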
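The hand calculation can be sketched in a few lines of NumPy. The predicted values here are the rounded ones quoted in this article, so the result matches the hand calculation rather than the program's exact output.

```python
import numpy as np

# Values from the specific run; predictions are the rounded values quoted above
actual = np.array([1.0, 9.0])
predicted = np.array([0.399, 6.84])

errors = actual - predicted
mae = float(np.mean(np.abs(errors)))  # average of the absolute errors

print(round(mae, 2))  # 1.38
```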

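**Mean Squared Error**:

The MSE is the average of the squared errors.

Calculation: 1/2 x ((1 - 0.399)² + (9 - 6.84)²) ≈ 2.51. This is close to the reported 2.499; the gap arises because this calculation uses the rounded predicted values shown above.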
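The same calculation in NumPy, again using the rounded predicted values quoted in this article. With those rounded inputs the result is about 2.51, which differs slightly from the program's reported 2.499 because squaring amplifies the rounding in the predictions.

```python
import numpy as np

# Values from the specific run; predictions are the rounded values quoted above
actual = np.array([1.0, 9.0])
predicted = np.array([0.399, 6.84])

errors = actual - predicted
mse = float(np.mean(errors ** 2))  # average of the squared errors

print(round(mse, 2))  # 2.51
```

Because the errors are squared, the larger error (9 vs 6.84) dominates the MSE far more than it dominates the MAE.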

Hope this article was helpful to you. Please post your comments and connect with me on Twitter and LinkedIn. Thank you.