Train, Validation, and Test sets in Machine Learning

Explained with an everyday example

James Thorn
3 min read · Jun 21, 2022

Hello dear reader.

In this article, I will explain what the train, validation, and test sets are in Machine Learning and use a very clear, everyday example to make sure you grasp their differences. Ready? Let us get to it then!

Introduction

Machine learning models learn from experience. They need data to train on.

They also need data to be tested on. Otherwise, how would we check that our model has learned well from the training data and that it can apply what it learned to new data?

This means that when using a data set to train a machine learning model, data scientists break it down into 3 pieces (sometimes 2 are enough, for example when we run cross-validation on the training set instead of holding out a separate validation set).

These 3 pieces are:

  • Training set: the data that the machine learning model will be trained on. This is the data that our model will…
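To make the idea of breaking a data set into pieces concrete, here is a minimal sketch in plain Python. It assumes the data set is just a list of samples, and the 70/15/15 proportions are only an illustrative choice, not a rule. The function name `train_val_test_split` and its parameters are hypothetical helpers written for this example, not part of any library.

```python
import random


def train_val_test_split(samples, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle a copy of `samples` and cut it into three disjoint pieces.

    Hypothetical helper for illustration; the default fractions are assumptions.
    """
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("val_fraction + test_fraction must be in [0, 1)")

    # Shuffle a copy so the caller's list is left untouched.
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)

    # Work out how many samples go into each held-out piece.
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)

    # Slice the shuffled list: every sample lands in exactly one piece.
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


data = list(range(100))  # stand-in for a real data set
train, val, test = train_val_test_split(data)
print(len(train), len(val), len(test))  # 70 15 15
```

Shuffling before slicing matters whenever the original data is ordered in some way (by date or by label, for instance), because otherwise the three pieces could end up looking very different from each other. Fixing the seed simply makes the split reproducible from one run to the next.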
