Logistic Regression: Scoring Our Model
In the last chapter, we used Mean Squared Error (MSE) to score our straight lines. But MSE doesn't work very well when we are predicting probabilities. Instead, for classification problems, we use a special scoring system called Log-Loss (sometimes called Binary Cross-Entropy).
The goal of Log-Loss is simple: reward the model for being right, and heavily punish the model for being confidently wrong.
Prompt: A minimal artistic drawing showing a small "Penalty" gauge that is calmly in the green "0" zone when correct, but broken and exploding into the red "infinity" zone when the model is confidently wrong.
The Intuition
Imagine the true answer is "Yes" (y = 1).
- If our model predicts a probability of 0.99 (it's 99% sure it's a Yes), the model did a great job! The penalty (loss) is almost 0.
- But what if the model predicts 0.01 (it's 99% sure it's a No)? It was horribly wrong while being totally confident. The Log-Loss formula gives it a massive penalty, sending the error score skyrocketing toward infinity!
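The intuition above can be sketched in a few lines of Python. (The `penalty` helper is hypothetical, written here just for illustration, and it skips the clipping a real library would apply to avoid `log(0)`.)

```python
import math

def penalty(y_true, p):
    # Log-Loss penalty for a single prediction:
    # -[y*log(p) + (1-y)*log(1-p)]
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# True answer is "Yes" (y = 1):
print(penalty(1, 0.99))  # confident and right -> tiny penalty (~0.01)
print(penalty(1, 0.01))  # confident and wrong -> huge penalty (~4.6)
```

Try pushing the wrong prediction even closer to 0 (say 0.0001) and watch the penalty climb without bound.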
Overall, our goal during training is simple: make the total Log-Loss score as small as possible.
The Mathematics
For those who love formulas!
Here is the exact formula the computer uses to calculate the penalty for our entire dataset:

$$\text{Log-Loss} = -\frac{1}{N}\sum_{i=1}^{N}\Big[\,y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\,\Big]$$

- $y_i$: the true answer (either exactly 0 or exactly 1).
- $\hat{y}_i$: our model's predicted probability (between 0 and 1).
Because the true answer is always either 0 or 1, one half of that big formula will always multiply to zero and disappear, leaving only the penalty for the actual outcome!
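As a minimal sketch, the formula can be computed over a tiny made-up dataset (the numbers below are invented purely for illustration). Because each `y` is exactly 0 or 1, only one of the two terms inside the brackets survives for each example:

```python
import math

y_true = [1, 0, 1, 1]          # actual answers
y_pred = [0.9, 0.2, 0.8, 0.3]  # model's predicted probabilities

# Average Log-Loss over the whole dataset.
# When y = 1 the (1 - y) term vanishes; when y = 0 the y term vanishes.
loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
            for y, p in zip(y_true, y_pred)) / len(y_true)
print(round(loss, 4))
```

Note that the last example (true answer 1, prediction 0.3) contributes far more to the average than the three good predictions combined.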
Look at the interactive chart below. You can see the "penalty curve" dynamically change depending on what the True Value is. Watch how the penalty shoots up toward infinity when the model makes a confidently bad guess!