Evaluation metrics in R
Step 1: Loading the necessary package
Several R packages can compute classification evaluation metrics. One commonly used package is “caret” (short for Classification And REgression Training), a comprehensive framework that provides a unified interface for many machine learning tasks, including data preprocessing, model training, and evaluation.
Install the “caret” package, then load it into the R session so that its functions can be used.
R
# Install the package caret
install.packages("caret")

# Load the package into R program
library(caret)
Step 2: Loading Dataset
For this example, we use the “iris” dataset, a well-known dataset that ships with R. It contains four measurements of iris flowers (sepal length, sepal width, petal length, petal width) along with the species of each flower (setosa, versicolor, virginica).
Load the dataset and print a summary. For each numeric feature, the summary shows the minimum, first quartile, median, mean, third quartile, and maximum; for the species column, it shows the number of flowers in each class.
R
# Load the dataset iris
data(iris)

# Print the summary of the dataset with summary function
summary(iris)
Output:
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
This code block creates a scatter plot matrix of the “iris” dataset. It shows pairwise scatter plots of each feature against other features, allowing us to visualize the relationships and distributions between variables.
R
# Plot Pairs Scatter plot
plot(iris)
Output:
[Figure: scatter plot matrix of the “iris” dataset]
Step 3: Splitting the dataset
Use the createDataPartition function from the “caret” package to split the data into a training set and a test set. It randomly selects 80% of the rows for training, sampling within each species so that the class proportions are preserved, and returns an index (trainIndex) that identifies the selected rows.
iris$Species: This parameter specifies the target (outcome) variable. In this case, it is the species of the iris flowers.
p: This parameter indicates the proportion of data to be allocated to the training set. In this example, it is set to 0.8, which means 80% of the data will be used for training.
list: This parameter determines the format of the output. If list is set to FALSE, the function returns the indices of the selected rows as a one-column matrix. If list is set to TRUE, it returns them as a list.
Create two new datasets: trainData and testData. trainData contains the 80% of the “iris” rows selected by trainIndex, while testData contains the remaining 20%.
R
# Find the index of data with 80% split in matrix format
trainIndex <- createDataPartition(iris$Species, p = 0.8, list = FALSE)

# Store the 80% in trainData
trainData <- iris[trainIndex, ]

# Remaining 20% in testData
testData <- iris[-trainIndex, ]
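To illustrate what a stratified 80/20 split does, here is a minimal base R sketch of the same idea. It is not the code that createDataPartition runs internally; the names rows_by_species, train_rows, train_sketch, and test_sketch are made up for this illustration, and the seed value is arbitrary, chosen only so that the split is repeatable.

```r
# Base R sketch of a stratified 80/20 split (an illustration, not caret's code)
set.seed(42)  # arbitrary seed, used only to make the random split repeatable

# Group the row numbers of iris by species
rows_by_species <- split(seq_len(nrow(iris)), iris$Species)

# Sample 80% of the row numbers within each species
train_rows <- unlist(lapply(rows_by_species, function(rows) {
  sample(rows, size = round(0.8 * length(rows)))
}))

# Selected rows form the training set; all other rows form the test set
train_sketch <- iris[train_rows, ]
test_sketch  <- iris[-train_rows, ]

# Count the rows per species in each set
print(table(train_sketch$Species))
print(table(test_sketch$Species))
```

Because the sampling happens within each species, both sets keep the equal class balance of the full dataset. Calling set.seed before createDataPartition would similarly make the caret split repeatable from run to run.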
Step 4: Model training
Install the “randomForest” package, which provides the functionality to build random forest models. Random forests are an ensemble learning method that combines multiple decision trees to make predictions.
Use the train function from the “caret” package to train a classification model using the random forest method (method = "rf"). The trained model is stored in the model variable. The arguments are:
Species ~ .: This formula specifies the relationship between the target variable (Species) and the predictor variables. The ~ symbol separates the target variable from the predictors, and the . indicates that all other columns in the dataset (trainData) are used as predictors.
data: This parameter specifies the training dataset on which the model will be trained. In this case, it is trainData.
method: This parameter indicates the machine learning algorithm to be used for training the model. In this example, “rf” refers to the random forest algorithm.
R
# Install random forest library
install.packages("randomForest")

# load library
library(randomForest)

# Train a classification model
model <- train(Species ~ ., data = trainData, method = "rf")
Step 5: Evaluating Metrics
To calculate the evaluation metrics, first run the model on the test data and save the result in the predictions variable. Then use the confusionMatrix function from the “caret” package to calculate the confusion matrix and related metrics for the model’s predictions. The resulting confusion matrix is stored in the cm variable.
predictions: This argument holds the predicted values obtained from the trained model. It should be a factor containing the predicted class labels.
testData$Species: This argument provides the actual class labels from the test dataset. It is the ground truth against which the predictions are compared.
R
# Make predictions on the test set
predictions <- predict(model, newdata = testData)

# Use the confusionMatrix function and pass in the predicted and real test values
cm <- confusionMatrix(predictions, testData$Species)
cm
Output:
Confusion Matrix and Statistics

            Reference
Prediction   setosa versicolor virginica
  setosa         10          0         0
  versicolor      0         10         1
  virginica       0          0         9

Overall Statistics

               Accuracy : 0.9667
                 95% CI : (0.8278, 0.9992)
    No Information Rate : 0.3333
    P-Value [Acc > NIR] : 2.963e-13

                  Kappa : 0.95

 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           0.9000
Specificity                 1.0000            0.9500           1.0000
Pos Pred Value              1.0000            0.9091           1.0000
Neg Pred Value              1.0000            1.0000           0.9524
Prevalence                  0.3333            0.3333           0.3333
Detection Rate              0.3333            0.3333           0.3000
Detection Prevalence        0.3333            0.3667           0.3000
Balanced Accuracy           1.0000            0.9750           0.9500
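To make the layout of the confusion matrix concrete, here is a minimal base R sketch that builds the same table with table(). The vectors actual_labels and predicted_labels are hypothetical: they are constructed so that their cross-tabulation matches the counts printed above, and they are not the real test rows.

```r
# Hypothetical label vectors that reproduce the counts in the output above
classes <- c("setosa", "versicolor", "virginica")

# Ground truth: 10 test flowers of each species
actual_labels <- factor(rep(classes, each = 10), levels = classes)

# Predictions: identical to the truth except for one virginica flower
# (position 21) that is predicted as versicolor
predicted_labels <- actual_labels
predicted_labels[21] <- "versicolor"

# Cross-tabulate: rows are predictions, columns are the reference labels
conf_table <- table(Prediction = predicted_labels, Reference = actual_labels)
print(conf_table)

# Accuracy is the share of predictions that match the reference
accuracy_sketch <- mean(predicted_labels == actual_labels)
print(accuracy_sketch)
```

The diagonal cells count correct predictions, and the single off-diagonal count (row versicolor, column virginica) is the one misclassified flower, so 29 of 30 predictions are correct.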
The results of the confusionMatrix function are now stored in the variable cm. Its overall element holds metrics such as accuracy and kappa, which are computed over the whole test set rather than separately for each class. Its byClass element holds the metrics computed for each class.
To see the overall evaluation metrics, use cm$overall; to see the class-wise evaluation metrics, use cm$byClass.
R
cm$byClass
Output:
                  Sensitivity Specificity Pos Pred Value Neg Pred Value Precision
Class: setosa             1.0        1.00      1.0000000       1.000000 1.0000000
Class: versicolor         1.0        0.95      0.9090909       1.000000 0.9090909
Class: virginica          0.9        1.00      1.0000000       0.952381 1.0000000
                  Recall        F1 Prevalence Detection Rate Detection Prevalence
Class: setosa        1.0 1.0000000  0.3333333      0.3333333            0.3333333
Class: versicolor    1.0 0.9523810  0.3333333      0.3333333            0.3666667
Class: virginica     0.9 0.9473684  0.3333333      0.3000000            0.3000000
                  Balanced Accuracy
Class: setosa                 1.000
Class: versicolor             0.975
Class: virginica              0.950
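As you can see above, we get all the class-wise evaluation scores, including sensitivity, specificity, precision, recall, and F1 score. Note that recall is the same quantity as sensitivity, and precision is the same quantity as the positive predictive value, which is why those columns hold identical values.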
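The class-wise scores can be checked by hand from the confusion matrix counts. The following is a minimal base R sketch that treats each class in turn as the positive class (one versus the rest); the matrix cm_counts is typed in from the output shown earlier, and all variable names are chosen for this illustration rather than taken from caret.

```r
# Confusion matrix counts copied from the output above
# (rows = predictions, columns = reference labels)
classes <- c("setosa", "versicolor", "virginica")
cm_counts <- matrix(
  c(10,  0, 0,
     0, 10, 1,
     0,  0, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(Prediction = classes, Reference = classes)
)

# One-versus-rest counts for each class
tp <- diag(cm_counts)             # predicted as the class and truly the class
fp <- rowSums(cm_counts) - tp     # predicted as the class but truly another
fn <- colSums(cm_counts) - tp     # truly the class but predicted as another
tn <- sum(cm_counts) - tp - fp - fn

# Class-wise metrics
sensitivity <- tp / (tp + fn)     # also called recall
specificity <- tn / (tn + fp)
precision   <- tp / (tp + fp)     # also called positive predictive value
f1          <- 2 * precision * sensitivity / (precision + sensitivity)
balanced_accuracy <- (sensitivity + specificity) / 2

print(round(cbind(sensitivity, specificity, precision, f1, balanced_accuracy), 4))
```

For versicolor, for example, precision is 10 / 11 (about 0.9091) because one of the eleven flowers predicted as versicolor is actually virginica, which agrees with the cm$byClass output.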
R
cm$overall
Output:
      Accuracy          Kappa  AccuracyLower  AccuracyUpper   AccuracyNull
  9.666667e-01   9.500000e-01   8.278305e-01   9.991564e-01   3.333333e-01
AccuracyPValue  McnemarPValue
  2.962731e-13            NaN
The cm$byClass line retrieves the performance metrics (such as sensitivity, specificity, precision, recall, and F1 score) for each class separately. The cm$overall line provides overall performance metrics across all classes, such as accuracy and kappa.
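The overall metrics can also be recomputed from the confusion matrix counts. The following base R sketch is an illustration under two assumptions: the counts are typed in from the output shown earlier, and the accuracy interval and p-value are taken from an exact binomial test (binom.test), which reproduces the printed values here. The variable names are hypothetical.

```r
# Confusion matrix counts copied from the output above
# (rows = predictions, columns = reference labels)
classes <- c("setosa", "versicolor", "virginica")
cm_counts <- matrix(
  c(10,  0, 0,
     0, 10, 1,
     0,  0, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(Prediction = classes, Reference = classes)
)

n       <- sum(cm_counts)         # number of test observations
correct <- sum(diag(cm_counts))   # number of correct predictions

# Accuracy: share of correct predictions
accuracy <- correct / n

# No information rate: accuracy of always predicting the largest class
nir <- max(colSums(cm_counts)) / n

# Cohen's kappa: agreement beyond what the row and column totals
# would produce by chance
expected  <- sum(rowSums(cm_counts) * colSums(cm_counts)) / n^2
kappa_hat <- (accuracy - expected) / (1 - expected)

# Exact binomial 95% interval for accuracy, and a one-sided test of
# whether accuracy exceeds the no information rate
ci      <- binom.test(correct, n)$conf.int
p_value <- binom.test(correct, n, p = nir, alternative = "greater")$p.value

print(c(Accuracy = accuracy, Kappa = kappa_hat, AccuracyLower = ci[1],
        AccuracyUpper = ci[2], AccuracyNull = nir, AccuracyPValue = p_value))
```

With 29 of 30 predictions correct and an expected chance agreement of 1/3, kappa works out to (29/30 - 1/3) / (1 - 1/3) = 0.95, which agrees with the cm$overall output.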
Computing Classification Evaluation Metrics in R
Classification evaluation metrics are quantitative measures used to assess the performance and accuracy of a classification model. These metrics provide insights into how well the model can classify instances into predefined classes or categories.
The commonly used classification evaluation metrics are: