[MUSIC] We've talked at great length about models where you specify some set of features. And what that results in is a model with fixed flexibility. There's a finite capacity to how flexible the model can be based on the specified features. But now we're gonna take a totally different approach. There are approaches that are called nonparametric approaches. And the ones that we're gonna look at in this module are called k-nearest neighbors and kernel regression. And what these methods allow you to do is be extremely flexible. The implementations are very, very simple, and they allow the complexity of the models you infer to increase as you get more data. This really simple approach is surprisingly hard to beat. But as we're gonna see, all of this relies on us having enough data to use these types of approaches. To start with, let's talk about this idea of fitting a function globally versus fitting it locally. So let's imagine that we have the following data that represent observations of houses with a given square footage and their associated sales prices. And maybe, just for the sake of argument in this module, let's imagine that for small houses there tends to be a linear relationship between square feet and house value. But then that relationship tends to taper off, and there's not much change in house price as square feet increases. But then you get into this regime of really large houses where, as you increase the square feet, the prices can just go way up, because maybe they become very luxurious houses. In the types of models that we've described so far, we've talked about fitting some function across the whole input space of square feet. So for example, if we assume a really simple model, like just a constant fit, we might get the following fit to our data. Or if we assume that we're gonna fit some line to the data, we might get the following, or a quadratic function. And we might say, well, this data kinda looks like there's some cubic fit to it. 
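The global fits described above (constant, line, quadratic, cubic) can be sketched in a few lines. This is a minimal illustration using NumPy's `polyfit` on made-up square feet / price data, not the course's actual dataset; the noise model and coefficients here are assumptions chosen only to mimic the shape described in the lecture.

```python
import numpy as np

# Synthetic square feet / price data (illustrative only, not the course dataset)
rng = np.random.default_rng(0)
sqft = np.linspace(500, 5000, 60)
price = 50 * sqft + 0.00005 * (sqft - 2500) ** 3 + rng.normal(0, 20000, sqft.size)

rss_by_degree = []
for degree in [0, 1, 2, 3]:                  # constant, line, quadratic, cubic
    coefs = np.polyfit(sqft, price, degree)  # global least-squares polynomial fit
    pred = np.polyval(coefs, sqft)
    rss_by_degree.append(float(np.sum((price - pred) ** 2)))

# Each higher degree nests the lower ones, so training error can only go down
print([round(r) for r in rss_by_degree])
```

Because each higher-degree model contains the lower-degree ones as special cases, the training residual sum of squares never increases with degree, which is exactly why the cubic "looks" best globally even where it is too flexible locally.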
And so the cubic fit looks pretty reasonable, but the truth is that this cubic fit is a bit too flexible, a bit too complicated, for the regions where we have smaller houses. It looks like it fits very well for our large houses, but it's a little bit too complex for lower values of square feet. Because really, for this data, you could describe it as having a linear relationship, like I talked about, for low square feet values. And then just a constant relationship between square feet and value for some other region. And then having this maybe quadratic relationship between square feet and house value when you get to these really, really large houses. So this motivates this idea of wanting to fit our function locally to different regions of the input space. Or to have the flexibility of a more local description of what's going on than our models which did these global fits allowed. So what are we gonna do? We want to flexibly define our f(x), the relationship in this case between square feet and house value, to have this type of local structure. But let's say we don't want to assume that there are what are called structural breaks, that is, certain change points where the structure of our regression is gonna change. In that case, you'd have to infer where those break points are. Instead, let's consider a really simple approach that works well when you have lots of data. [MUSIC]
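The simple local approach the lecture is leading up to, k-nearest neighbors regression, can be sketched as follows. This is a bare-bones 1-D version on hypothetical house data (the specific square footages and prices are made up); practical implementations use spatial data structures such as KD-trees rather than a full sort.

```python
import numpy as np

def knn_regress(x_train, y_train, x_query, k=3):
    """Predict the target at x_query by averaging the targets of the
    k nearest training points (1-D, absolute-distance sketch)."""
    dists = np.abs(x_train - x_query)   # distance in square feet
    nearest = np.argsort(dists)[:k]     # indices of the k closest houses
    return float(y_train[nearest].mean())

# Hypothetical houses: square feet and sale prices
x = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
y = np.array([300e3, 350e3, 360e3, 365e3, 500e3])

prediction = knn_regress(x, y, 1600.0, k=3)  # averages the 3 nearest prices
```

Notice there is no global functional form at all: the prediction at each query point is built only from nearby data, so the fit adapts locally, provided there is enough data near every query.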