Evaluate the Model (Optional)

Optionally, you can evaluate the model with metrics such as accuracy, precision, recall, or F1-score. The code below uses confusionMatrix() from the caret package, which reports these metrics in a single call; within the tidymodels framework, the yardstick package provides equivalent functions for model evaluation.

R

# Model performance metrics
library(caret)  # confusionMatrix() comes from caret, not tidymodels

# Predict with the fitted model, not the bare model specification.
# `knn_fit` is an assumed name for the fitted model from the training step.
predictions <- predict(knn_fit, new_data = iris)

confusion_matrix <- confusionMatrix(predictions$.pred_class, iris$Species)
accuracy <- confusion_matrix$overall["Accuracy"]
# With three classes, byClass is a matrix (one row per class), so select a column
recall <- confusion_matrix$byClass[, "Recall"]
precision <- confusion_matrix$byClass[, "Precision"]

print(confusion_matrix)
print(accuracy)
print(recall)
print(precision)


Output:

Confusion Matrix and Statistics
            Reference
Prediction   setosa versicolor virginica
  setosa         50          0         0
  versicolor      0         50         2
  virginica       0          0        48
Overall Statistics
                                          
               Accuracy : 0.9867          
                 95% CI : (0.9527, 0.9984)
    No Information Rate : 0.3333          
    P-Value [Acc > NIR] : < 2.2e-16       
                                          
                  Kappa : 0.98            
                                          
 Mcnemar's Test P-Value : NA              
Statistics by Class:
                     Class: setosa Class: versicolor Class: virginica
Sensitivity                 1.0000            1.0000           0.9600
Specificity                 1.0000            0.9800           1.0000
Pos Pred Value              1.0000            0.9615           1.0000
Neg Pred Value              1.0000            1.0000           0.9804
Prevalence                  0.3333            0.3333           0.3333
Detection Rate              0.3333            0.3333           0.3200
Detection Prevalence        0.3333            0.3467           0.3200
Balanced Accuracy           1.0000            0.9900           0.9800
 Accuracy 
0.9866667 
    Class: setosa Class: versicolor  Class: virginica 
             1.00              1.00              0.96 
    Class: setosa Class: versicolor  Class: virginica 
        1.0000000         0.9615385         1.0000000 

  • The confusion matrix shows the model’s predictions compared to the actual class labels for three classes: setosa, versicolor, and virginica.
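  • The overall accuracy of the model is 0.9867, indicating that it correctly classified 98.67% of the instances (148 of 150).
  • Recall, reported as Sensitivity, measures how many instances of each true class the model identifies. For example, the sensitivity of 1.0000 for “setosa” means all 50 “setosa” instances were identified, while the 0.9600 for “virginica” reflects the two “virginica” instances predicted as “versicolor.”
  • Precision, reported as Pos Pred Value, measures how many predictions of a class are correct. For “versicolor” it is 0.9615, because 50 of the 52 “versicolor” predictions were right.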
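
To make the link between the confusion matrix and these metrics concrete, here is a minimal base R sketch that recomputes accuracy, recall, and precision by hand from the counts printed above. The variable names (`cm`, `accuracy_manual`, and so on) are illustrative and do not come from the tutorial's code.

```r
# Confusion matrix copied from the output above:
# rows are predicted classes, columns are the true (reference) classes.
classes <- c("setosa", "versicolor", "virginica")
cm <- matrix(
  c(50,  0,  0,
     0, 50,  2,
     0,  0, 48),
  nrow = 3, byrow = TRUE,
  dimnames = list(Prediction = classes, Reference = classes)
)

correct <- diag(cm)  # correctly classified count for each class

accuracy_manual  <- sum(correct) / sum(cm)  # all correct / all instances
recall_manual    <- correct / colSums(cm)   # correct / true instances of the class
precision_manual <- correct / rowSums(cm)   # correct / predictions of the class

print(round(accuracy_manual, 4))
print(round(recall_manual, 4))
print(round(precision_manual, 4))
```

The results should agree with the Accuracy, Sensitivity, and Pos Pred Value entries in the caret output.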

In summary, the output reports overall accuracy together with per-class recall (Sensitivity), precision (Pos Pred Value), and related statistics. The model performs very well on this data: its only errors are two “virginica” instances predicted as “versicolor.” Because the predictions are scored against all 150 rows of iris, the figures are likely optimistic if the model was trained on those same rows.
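
The confusion matrix above covers all 150 rows of iris, so it may partly reflect data the model has already seen. One way to get a less optimistic estimate is to hold out a test set and score only those rows. The sketch below does this with tidymodels and yardstick. The 80/20 split, the seed, `neighbors = 5`, and every object name are assumptions made for illustration, not values taken from the tutorial, and the kknn package must be installed for the engine to work.

```r
library(tidymodels)  # attaches parsnip, rsample, yardstick, and dplyr

set.seed(123)  # arbitrary seed, used only to make the split reproducible

# Hold out 20% of the rows, keeping the species balanced in both parts
iris_split <- initial_split(iris, prop = 0.8, strata = Species)
iris_train <- training(iris_split)
iris_test  <- testing(iris_split)

# k-NN specification; neighbors = 5 is an illustrative choice
knn_spec <- nearest_neighbor(neighbors = 5) %>%
  set_engine("kknn") %>%
  set_mode("classification")

# Fit on the training rows only
knn_fit <- fit(knn_spec, Species ~ ., data = iris_train)

# Predict the held-out rows and attach the true labels
test_results <- predict(knn_fit, new_data = iris_test) %>%
  bind_cols(iris_test %>% select(Species))

# Confusion matrix on the test set
yardstick::conf_mat(test_results, truth = Species, estimate = .pred_class)

# Accuracy plus macro-averaged recall and precision (yardstick's multiclass default)
class_metrics <- yardstick::metric_set(
  yardstick::accuracy, yardstick::recall, yardstick::precision
)
test_metrics <- class_metrics(test_results, truth = Species, estimate = .pred_class)
print(test_metrics)
```

The yardstick functions are called with an explicit `yardstick::` prefix because caret also exports `recall()` and `precision()`, and the two packages can mask each other when both are loaded.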
