7.5 Principles of Supervised Learning: Performance Assessment Metrics

Supervised learning is one of the most common approaches in the field of Machine Learning (ML), where a model is trained on a dataset that contains both the inputs and their desired outputs. The goal is for the model to learn to map inputs to the correct outputs. To evaluate the effectiveness of a supervised learning model, several performance metrics are used. These metrics provide insight into how well the model is performing its task and are key to guiding the optimization and validation process.

Accuracy

Accuracy is one of the most intuitive and common metrics. It is defined as the proportion of correct predictions relative to the total number of predictions made by the model. Although it is easy to understand and apply, accuracy can be misleading on imbalanced datasets, where one class is much more frequent than the others.
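As a minimal sketch, accuracy can be computed directly from paired labels in plain Python (scikit-learn's `accuracy_score` does the same thing). The labels below are illustrative only:

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that exactly match the true labels.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Illustrative labels: 6 of the 8 predictions are correct.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(y_true, y_pred))  # 0.75
```

Note that on a dataset where 95% of examples belong to one class, a model that always predicts that class already reaches 0.95 accuracy, which is why accuracy alone can mislead.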

Precision and Recall

Precision is the proportion of correct positive predictions relative to the total number of positive predictions made by the model. Recall, also known as sensitivity or the true positive rate, is the proportion of actual positives that were correctly identified by the model. These two metrics are particularly useful when the costs of false positives and false negatives are very different.
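Both metrics follow directly from the confusion-matrix counts. A minimal plain-Python sketch (scikit-learn's `precision_score` and `recall_score` are the library equivalents), using illustrative labels:

```python
def precision_recall(y_true, y_pred):
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)  # correct positives among predicted positives
    recall = tp / (tp + fn)     # correct positives among actual positives
    return precision, recall

# Illustrative labels: TP = 2, FP = 1, FN = 2.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
prec, rec = precision_recall(y_true, y_pred)
print(round(prec, 3), rec)  # 0.667 0.5
```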

F1 Score

The F1 score is the harmonic mean between precision and recall. It is useful when you need a balance between precision and recall and there is an uneven distribution of classes. The F1 score is especially important in situations where false negatives and false positives have severely different consequences.
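The harmonic mean can be sketched in one line; with an illustrative precision of 2/3 and recall of 0.5 (these numbers are assumptions for the example, not from any real model):

```python
def f1_score(precision, recall):
    # Harmonic mean: a large gap between the two metrics pulls F1
    # toward the smaller one, unlike an arithmetic mean.
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(2 / 3, 0.5), 3))  # 0.571
```

By contrast, the arithmetic mean of 2/3 and 0.5 would be about 0.583, so the harmonic mean is slightly stricter whenever the two values differ.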

Area under the ROC Curve (AUC-ROC)

The ROC (Receiver Operating Characteristic) curve is a graph that shows the performance of a classification model across all classification thresholds. AUC (Area Under the Curve) represents the probability that the model will rank a randomly chosen positive example higher than a randomly chosen negative example. AUC-ROC is a robust metric because it is relatively insensitive to class distribution.
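That probabilistic interpretation gives a direct (if inefficient) way to compute AUC by counting ranked pairs, sketched here in plain Python with illustrative scores (scikit-learn's `roc_auc_score` computes the same quantity from the curve):

```python
def roc_auc(y_true, y_score):
    # Probability that a random positive example is scored above a
    # random negative example; ties count as half a win.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative scores: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```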

Gini index

The Gini Index is another metric derived from the ROC curve. It is calculated as twice the area between the ROC curve and the diagonal line (which represents a random classifier), which is equivalent to Gini = 2 × AUC − 1. The Gini Index measures the model's ability to discriminate between the positive and negative classes.
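Since the Gini Index is a linear transformation of the AUC, it can be sketched on top of the same pairwise AUC computation, again with illustrative scores:

```python
def roc_auc(y_true, y_score):
    # Pairwise ranking probability (ties count as half a win).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 0.75
gini = 2 * auc - 1  # rescales AUC so a random classifier scores 0
print(gini)  # 0.5
```

A perfect classifier (AUC = 1.0) has a Gini of 1.0, and a random one (AUC = 0.5) has a Gini of 0.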

Log Loss

Log Loss, or logarithmic loss, measures the performance of a classification model whose predicted output is a probability between 0 and 1. The penalty for an incorrect prediction grows rapidly as the predicted probability diverges from the true class label, approaching infinity for a confident wrong answer. Log Loss is an important metric when you need a performance measure that accounts for prediction uncertainty.
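For binary labels, Log Loss is the mean negative log-likelihood of the true labels. A minimal sketch with illustrative probabilities (scikit-learn's `log_loss` is the library equivalent); the clipping constant is a common practical safeguard, not part of the formula itself:

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    # Mean negative log-likelihood of the true labels; probabilities
    # are clipped so log(0) is never evaluated.
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Confident, correct predictions give a small loss ...
print(round(log_loss([1, 0, 1], [0.9, 0.1, 0.8]), 3))  # 0.145
# ... while a single confident wrong prediction is punished heavily.
print(round(log_loss([1], [0.01]), 3))  # 4.605
```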

Mean Squared Error (MSE) and Root Mean Squared Error (RMSE)

For regression problems, MSE is the mean of the squared differences between predicted and actual values. RMSE is simply the square root of the MSE and has the advantage of being in the same unit as the response variable. Both are crucial metrics for evaluating the performance of regression models.
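Both follow directly from the definitions; a plain-Python sketch with illustrative values (scikit-learn's `mean_squared_error` is the library equivalent):

```python
import math

def mse(y_true, y_pred):
    # Mean of the squared prediction errors.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Square root puts the error back in the units of the target.
    return math.sqrt(mse(y_true, y_pred))

# Illustrative values: errors are 0.5, 0.0, -1.5 and -1.0.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(mse(y_true, y_pred))             # 0.875
print(round(rmse(y_true, y_pred), 3))  # 0.935
```

Because the errors are squared before averaging, a single large error dominates the MSE, which is why the metric is sensitive to outliers.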

Mean Absolute Error (MAE)

MAE measures the average of the absolute errors. Unlike MSE or RMSE, MAE does not penalize large errors as heavily, which may be desirable in contexts where outliers should not have a large impact on the performance metric.
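A matching plain-Python sketch with illustrative values (scikit-learn's `mean_absolute_error` is the library equivalent):

```python
def mae(y_true, y_pred):
    # Mean of the absolute prediction errors; errors are not squared,
    # so outliers weigh less than in MSE.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative values: absolute errors are 0.5, 0.0, 1.5 and 1.0.
print(mae([3.0, 5.0, 2.5, 7.0], [2.5, 5.0, 4.0, 8.0]))  # 0.75
```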

Final Considerations

When choosing the performance evaluation metric, it is important to consider the context of the problem and what is most important for the application in question. For example, in a fraud detection system, high recall may be more desirable than high precision because it is preferable to flag legitimate transactions as fraudulent (false positives) rather than miss fraudulent transactions (false negatives).

Additionally, it is common to use a set of metrics rather than relying on a single metric to get a more holistic view of model performance. Continuous evaluation and understanding of metrics are essential for developing and improving Machine Learning and Deep Learning models.

Finally, it is important to point out that while these metrics can be calculated easily using ML libraries such as scikit-learn in Python, interpreting them and deciding what actions to take based on that information requires a deep understanding of both the model and the application domain.

