When we evaluate the performance of a classification model, we compare, for each row of our data set, the predicted value against the actual value of the label feature.
We can then build a table called a confusion matrix to count the good and bad predictions, and summarize the result with performance metrics such as Precision, Recall, and F1 Score.
We can use these results to compare different classification models.
Confusion Matrix
It shows the actual vs. predicted classifications:
| | Predicted Positive (Mail is Spam) | Predicted Negative (Mail is not Spam) |
|---|---|---|
| Actual Positive (Mail is Spam) | True Positive | False Negative (Type 2 Error) |
| Actual Negative (Mail is not Spam) | False Positive (Type 1 Error) | True Negative |
Impact of errors:
- A Type 1 Error (False Positive) is a “false alarm”
- A Type 2 Error (False Negative) is a “miss”
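As an illustration, here is a minimal sketch in plain Python (the label lists are made-up example data, not from any real data set) that counts the four cells of the matrix by comparing each predicted value against the actual one:

```python
# Made-up spam example: actual labels vs. the model's predictions
actual    = ["spam", "spam", "not spam", "not spam", "spam", "not spam"]
predicted = ["spam", "not spam", "not spam", "spam", "spam", "not spam"]

tp = fp = fn = tn = 0
for a, p in zip(actual, predicted):
    if a == "spam" and p == "spam":
        tp += 1   # True Positive: spam correctly flagged
    elif a == "not spam" and p == "spam":
        fp += 1   # False Positive (Type 1 Error): a false alarm
    elif a == "spam" and p == "not spam":
        fn += 1   # False Negative (Type 2 Error): a miss
    else:
        tn += 1   # True Negative: legitimate mail correctly passed through

print(f"TP={tp}  FP={fp}  FN={fn}  TN={tn}")  # TP=2  FP=1  FN=1  TN=2
```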
Four performance metrics:
1) Precision = True Positive Predictions / All Positive Predictions
- “Of all the positive predictions the model made, how many were actually correct?”
- All positive predictions are the True Positives plus the False Positives
- High precision means that the classifier is good at avoiding false positives
2) Recall = True Positive Predictions / All Actual Positives
- “Of all the actual positive instances, how many did the model correctly identify?”
- All actual positives are the True Positives plus the False Negatives
3) Accuracy = Correct Predictions / All Predictions
- Accuracy measures overall model performance: the share of all predictions, positive and negative, that are correct
4) F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
- Harmonic mean of precision and recall
- A high F1 score means that both precision and recall are high, so the model is performing well
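Putting these formulas together, the sketch below computes the four metrics from the confusion-matrix counts of the earlier example (the counts and results are illustrative only):

```python
# Metrics computed from confusion-matrix counts (tp, fp, fn, tn).
# Simple guards avoid division by zero when a denominator is empty.
def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0      # correct positive predictions

def recall(tp, fn):
    return tp / (tp + fn) if (tp + fn) else 0.0      # actual positives found

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)           # all correct predictions

def f1_score(p, r):
    return 2 * p * r / (p + r) if (p + r) else 0.0   # harmonic mean of p and r

p = precision(tp=2, fp=1)       # 0.667
r = recall(tp=2, fn=1)          # 0.667
print(accuracy(2, 2, 1, 1))     # 0.667
print(f1_score(p, r))           # 0.667
```

The same counts can, of course, come from any classifier; computing the metrics for several models on the same test set is what makes them directly comparable.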