[MUSIC] Okay, so we can't compute generalization error, but we want some better measure of our predictive performance than training error gives us. And this takes us to something called test error. What test error is going to allow us to do is approximate generalization error, and the way we're going to do this is by looking at the error on houses that aren't in our training set. To do that, we have to hold out some houses. So instead of including all of these colored houses, our entire recorded data set, in the training set, we're going to shade some of them out, and these shaded gray houses become what's called a test set.

Okay. So now we have houses that are not included in our training set; the training set is the remaining colored houses here. When we go to fit our models, we fit them only on the training data set. But when we go to assess the performance of a model, we look at these test houses, and they're hopefully going to serve as a proxy for everything out there in the world. So hopefully our test data set is a good stand-in for other houses we might see, or at least gives us a sense of how well a given model is performing.

Okay, so test error is going to be our average loss computed over the houses in our test data set. Formally, we write it as one over N test, where N test is the number of houses in our test data set, times the sum of the loss over those test set houses. But I want to emphasize, and this is really, really important, that the estimated parameters W hat were fit on the training data set. So even though this function looks very, very much like training error, the sum is over the test houses, but the function we're evaluating was fit on training data. These parameters in this fitted function never saw the test data.

Just to illustrate this, like in our previous example, we might think of fitting a quadratic function through this data, where we minimize the residual sum of squares on the training points, those blue circles, to get our estimated parameters W hat. Then when we go to compute our test error, where again we're using squared error as an example, we compute this error over the test points, all these gray circles here. So test error is one over N test times the sum, over all houses in the test data set, of the squared difference between the true house sales price and our predicted price. And this is where the difference arises: the function was fit with the blue circles, but when we assess its performance, we look at these gray circles. We'll see a small code sketch of exactly this computation at the end of this section.

Okay, so let's summarize our measures of error as a function of model complexity. What we saw was that our training error decreased with increasing model complexity; so here, this is our training error. In contrast, our generalization error went down for some period of time, but then we started getting to overly complex models that didn't generalize well, and the generalization error started increasing. So here we have generalization error, or true error. And what is our test error? Well, our test error is a noisy approximation of generalization error, because if our test data set included everything we might ever see in the world, in proportion to how likely it was to be seen, then it would be exactly our generalization error.
But of course, our test data set is just some finite data set, and we're using it to approximate generalization error, so it's going to be some noisy version of that curve. So this is our test error. Okay, so test error is the thing we can actually compute, and generalization error is the thing we really want. [MUSIC]
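To make this concrete, here is a minimal sketch in Python with NumPy. It uses made-up house data rather than the course's actual data set or tools, so the numbers, the 80/20 split, and the polynomial degrees are all just illustrative assumptions. It holds out some houses as a test set, fits polynomial models of increasing degree by minimizing the residual sum of squares on the training houses only, and then computes training error and test error as the average squared loss over each set.

import numpy as np

# Made-up data: square footage in thousands (x) and sale price in thousands of dollars (y).
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 4.0, size=100)
y = 50 + 120 * x + rng.normal(0, 40, size=100)

# Hold out 20 houses as a test set; the other 80 houses form the training set.
idx = rng.permutation(len(x))
train, test = idx[:80], idx[80:]

for degree in [1, 2, 5, 10]:
    # Estimate W hat by minimizing residual sum of squares on the TRAINING houses only.
    w_hat = np.polyfit(x[train], y[train], deg=degree)
    predict = np.poly1d(w_hat)
    # Training error and test error: average squared loss over each set,
    # always using the parameters that were fit on the training data.
    train_err = np.mean((y[train] - predict(x[train])) ** 2)
    test_err = np.mean((y[test] - predict(x[test])) ** 2)
    print(f"degree {degree:2d}: training error {train_err:8.1f}, test error {test_err:8.1f}")

On a run like this you would expect the training error to keep shrinking as the degree grows, while the test error, our noisy but computable stand-in for generalization error, eventually starts climbing for the overly complex fits.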