Testing Your AI: K-Fold Cross-Validation
Instead of randomly hoping we get a fair split, what if we just tested the AI on every single piece of data? This is the brilliant concept behind K-Fold Cross-Validation.
Here is exactly how this clever strategy works:
- Slice it up: We take our entire available dataset and chop it into equal-sized chunks, called folds. The mathematical letter we use to represent the number of chunks is K.
- Rotate and Test: We pick one single fold to act as our validation exam, and use all the remaining folds as the training textbook. We train the AI, grade it on the validation exam, and record its score.
- Repeat: We throw away the AI's memory and start all over! We pick a different chunk to be the validation exam, and train on the rest. We repeat this process K times, until every single chunk has been used exactly once as a validation exam!
- Average the Grade: Finally, we take all of those scores and average them together.
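The four steps above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in for illustration: the "model" is just the mean of the training targets, and the "grade" is mean absolute error on the held-out fold.

```python
def k_fold_scores(data, k):
    """Split `data` into k folds, rotate the validation fold, return all scores."""
    fold_size = len(data) // k
    folds = [data[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    scores = []
    for i in range(k):
        validation = folds[i]                        # this fold is the "exam"
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        model_mean = sum(training) / len(training)   # "train" a fresh model from scratch
        error = sum(abs(x - model_mean) for x in validation) / len(validation)
        scores.append(error)                         # record this round's grade
    return scores

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scores = k_fold_scores(data, k=4)
average = sum(scores) / len(scores)  # the final averaged grade
```

Notice that a brand-new model is fitted inside the loop on every rotation, so no fold ever grades a model that has already seen it.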
Below we visualize this process using K folds. Watch how the training (blue) and validation (red) sets shift through a complete cycle, while the test set (yellow) safely remains completely untouched throughout the entire process!
A Fair Assessment
This simple rotating extension gives us a massive boost in reliability. Because every single data point gets to act as both a training example and a validation question at some point, the final averaged score gives us an incredibly confident, low-variance estimate of our model's true capabilities.
There is, however, one major catch to evaluating an AI this thoroughly: to pull it off, you literally have to wipe the AI's brain and retrain an entirely new model from scratch on every rotation.
Because K-Fold requires re-fitting your model K times (once for each fold), it naturally demands significantly more computational power and time than checking a single simple split!
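A toy illustration of that cost, using a hypothetical `fit_count` counter to stand in for the expensive training step: a single train/validation split calls it once, while K-fold calls it K times.

```python
fit_count = 0

def train_model(training_data):
    """Toy stand-in for an expensive training run: just averages the data."""
    global fit_count
    fit_count += 1  # each call represents one full retraining from scratch
    return sum(training_data) / len(training_data)

data = list(range(20))
k = 5
fold_size = len(data) // k

for i in range(k):
    # Everything outside the current fold becomes the training set.
    training = data[:i * fold_size] + data[(i + 1) * fold_size:]
    train_model(training)  # a fresh model per fold

print(fit_count)  # K-fold performed k = 5 fits; a single split needs only 1
```

In practice you rarely write this loop yourself; libraries such as scikit-learn provide ready-made cross-validation helpers, but the K-times retraining cost is the same either way.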