Model Evaluation
Accuracy isn't everything. Learn about Precision, Recall, F1-Score, and how to detect Overfitting. This hands-on tutorial focuses on practical implementation of model evaluation concepts.
You trained a model. It has 99% accuracy. Is it good? Maybe not.
If you are detecting a rare disease (1% of population), a model that always guesses "Healthy" will be 99% accurate but 100% useless.
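To make the arithmetic concrete, here is a minimal sketch using a made-up population of 1,000 people, 10 of whom are sick. The numbers and the always-"Healthy" predictor are illustrative assumptions, not real data.

```python
# Hypothetical screening data: 1,000 people, 10 of whom (1%) are sick.
y_true = ["Sick"] * 10 + ["Healthy"] * 990

# A "model" that ignores its input and always predicts "Healthy".
y_pred = ["Healthy"] * len(y_true)

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# How many sick people did the model actually catch?
sick_found = sum(t == "Sick" and p == "Sick" for t, p in zip(y_true, y_pred))

print(f"Accuracy: {accuracy:.0%}")  # Accuracy: 99%
print(f"Sick people detected: {sick_found} of {y_true.count('Sick')}")  # 0 of 10
```

The score looks excellent, yet not a single sick person is found.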
1. Confusion Matrix
For classification problems, we use a Confusion Matrix to break down errors.
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
- TP: Sick person correctly identified as Sick.
- TN: Healthy person correctly identified as Healthy.
- FP (Type I Error): Healthy person told they are Sick (False Alarm).
- FN (Type II Error): Sick person told they are Healthy (Missed Detection).
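The four cells can be counted directly from paired labels. The sketch below assumes string labels with "Sick" as the positive class; the helper name `confusion_counts` and the eight sample patients are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="Sick"):
    """Count TP, TN, FP and FN for a binary problem (illustrative helper)."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # sick, and flagged as sick
            else:
                fp += 1  # healthy, but flagged as sick (false alarm)
        else:
            if actual == positive:
                fn += 1  # sick, but told healthy (missed detection)
            else:
                tn += 1  # healthy, and told healthy
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}

# Made-up labels for eight patients.
y_true = ["Sick", "Sick", "Sick", "Healthy", "Healthy", "Healthy", "Healthy", "Healthy"]
y_pred = ["Sick", "Sick", "Healthy", "Sick", "Healthy", "Healthy", "Healthy", "Healthy"]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 2, 'TN': 4, 'FP': 1, 'FN': 1}
```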
2. Metrics Beyond Accuracy
Precision
"Of all the people we said were sick, how many actually were?"
$$Precision = \frac{TP}{TP + FP}$$
Recall (Sensitivity)
"Of all the people who actually were sick, how many did we find?"
$$Recall = \frac{TP}{TP + FN}$$
F1-Score
The harmonic mean of Precision and Recall. It is high only when both are high, which makes it more informative than accuracy on imbalanced datasets.
$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}$$
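The three formulas translate almost line for line into code. This is a minimal sketch: the counts are invented for illustration, and returning 0.0 when a denominator is zero is a convention assumed here, not something the formulas themselves define.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts.

    Assumption: a metric whose denominator is zero is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical counts for 1,000 patients, 20 of whom are sick.
tp, fp, fn, tn = 8, 2, 12, 978

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
# accuracy=0.986 precision=0.800 recall=0.400 f1=0.533
```

Accuracy is close to 99%, but Recall shows that the model misses 12 of the 20 sick patients, and the F1-Score reflects that weakness.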
3. Overfitting vs. Underfitting
- Underfitting: The model is too simple (e.g., fitting a line to a curve). High bias.
- Overfitting: The model memorized the training data (including noise). High variance. It fails on new data.
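Both failure modes show up when training error is compared with error on held-out data. The sketch below uses synthetic data drawn from y = x² plus noise (an assumption chosen for illustration). It fits a straight line as the too-simple model and uses a nearest-neighbour lookup that memorizes the training set as the too-flexible one. All function names are hypothetical.

```python
import random

random.seed(0)

def make_data(n):
    """Sample n points from y = x^2 plus Gaussian noise (synthetic data)."""
    xs = [random.uniform(-1, 1) for _ in range(n)]
    return [(x, x * x + random.gauss(0, 0.3)) for x in xs]

train, test = make_data(50), make_data(50)

def mse(predict, data):
    """Mean squared error of a prediction function on (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

# Underfit: a least-squares straight line fitted to a curved relationship.
mean_x = sum(x for x, _ in train) / len(train)
mean_y = sum(y for _, y in train) / len(train)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
         / sum((x - mean_x) ** 2 for x, _ in train))
intercept = mean_y - slope * mean_x

def line_model(x):
    return slope * x + intercept

# Overfit: return the target of the nearest training point, noise included.
def memorizer(x):
    return min(train, key=lambda point: abs(point[0] - x))[1]

for name, model in [("line (underfit)", line_model), ("memorizer (overfit)", memorizer)]:
    print(f"{name}: train MSE = {mse(model, train):.3f}, test MSE = {mse(model, test):.3f}")
```

The memorizer's training error is zero by construction, so only the test error reveals that it has learned the noise. The line cannot follow the curve, so its error stays high on the training data as well.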
How to detect it: Cross-Validation, which measures performance on data the model was not trained on.
4. Cross-Validation
Instead of just one Train/Test split, we split the data into $K$ folds (e.g., 5). We train $K$ times, each time holding out a different fold as the test set and training on the remaining folds, then average the $K$ scores.
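The splitting procedure can be sketched in a few lines of plain Python. The helper `k_fold_splits`, the toy labels, and the majority-class "model" are all hypothetical stand-ins. In practice a library routine would usually handle the splitting.

```python
import random

def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once so folds are not ordered
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Toy labels: 15 negatives and 5 positives.
y = [0] * 15 + [1] * 5

scores = []
for train_idx, test_idx in k_fold_splits(len(y), k=5):
    # "Train": pick the most common label in the training folds.
    train_labels = [y[i] for i in train_idx]
    majority = max(set(train_labels), key=train_labels.count)
    # "Test": accuracy of always predicting that label on the held-out fold.
    fold_accuracy = sum(y[i] == majority for i in test_idx) / len(test_idx)
    scores.append(fold_accuracy)

mean_score = sum(scores) / len(scores)
print(f"Per-fold accuracy: {scores}")
print(f"Mean accuracy: {mean_score:.2f}")  # Mean accuracy: 0.75
```

Every sample is used for testing exactly once, so the averaged score depends less on one lucky or unlucky split.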
Key Takeaways
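- Accuracy can be misleading, especially on imbalanced data.
- Precision = quality of positive predictions (how many predicted positives are correct).
- Recall = coverage of actual positives (how many real positives were found).
- Cross-Validation gives a more reliable performance estimate and helps detect overfitting.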
What's Next?
You have mastered traditional Machine Learning. Now it's time to enter the world of Deep Learning.
Next Module: Module 4 - Deep Learning & Neural Networks.