1
00:00:00,000 --> 00:00:04,100
[MUSIC]

2
00:00:04,100 --> 00:00:08,710
We've talked at great length about models where you specify some set of features.

3
00:00:08,710 --> 00:00:11,560
And what that results in is a model with fixed flexibility.

4
00:00:11,560 --> 00:00:15,230
There's a finite capacity to how flexible the model can

5
00:00:15,230 --> 00:00:17,360
be based on the specified features.

6
00:00:17,360 --> 00:00:19,910
But now we're gonna take a totally different approach.

7
00:00:19,910 --> 00:00:22,500
There are approaches that are called nonparametric approaches.

8
00:00:22,500 --> 00:00:25,500
And the ones that we're gonna look at in this module

9
00:00:25,500 --> 00:00:28,010
are called k-nearest neighbors and kernel regression.

10
00:00:28,010 --> 00:00:32,754
And what these methods allow you to do is be extremely flexible.

11
00:00:32,754 --> 00:00:36,299
The implementations are very, very simple and they allow

12
00:00:36,299 --> 00:00:41,210
the complexity of the models you infer to increase as you get more data.

13
00:00:41,210 --> 00:00:44,370
This really simple approach is surprisingly hard to beat.

14
00:00:45,440 --> 00:00:46,880
But as we're gonna see,

15
00:00:46,880 --> 00:00:50,560
all of this relies on us having enough data to use these types of approaches.

16
00:00:52,060 --> 00:00:52,820
To start with,

17
00:00:52,820 --> 00:00:56,920
let's talk about this idea of fitting a function globally versus locally.

18
00:00:58,560 --> 00:01:02,510
So let's imagine that we have the following data that represent

19
00:01:02,510 --> 00:01:07,800
observations of houses with a given square feet and their associated sales price.

20
00:01:07,800 --> 00:01:11,777
And maybe just for the sake of argument in this module,

21
00:01:11,777 --> 00:01:15,494
let's imagine that for small houses there tends to

22
00:01:15,494 --> 00:01:20,087
be a linear relationship between square feet and house value.

23
00:01:20,087 --> 00:01:22,800
But then that relationship tends to taper off and

24
00:01:22,800 --> 00:01:25,990
there's not much change in house price as square feet increase.

25
00:01:25,990 --> 00:01:30,514
But then you get into this regime of these really large houses where as

26
00:01:30,514 --> 00:01:32,543
you increase the square feet,

27
00:01:32,543 --> 00:01:37,712
the prices can just go way up because maybe they become very luxurious houses.

28
00:01:37,712 --> 00:01:40,946
In the types of models that we've described so far, we've talked

29
00:01:40,946 --> 00:01:44,970
about fitting some function across the whole input space of square feet.

30
00:01:44,970 --> 00:01:47,060
So for example, if we assume a really simple model,

31
00:01:47,060 --> 00:01:51,350
like just a constant fit, we might get the following fit to our data.

32
00:01:52,600 --> 00:01:55,930
Or if we assume that we're gonna fit some line to the data,

33
00:01:55,930 --> 00:01:59,200
we might get the following, or a quadratic function.

34
00:01:59,200 --> 00:02:03,890
And we might say, well, this data kinda looks like there's some cubic fit to it.

35
00:02:03,890 --> 00:02:08,230
And so the cubic fit looks pretty reasonable, but

36
00:02:08,230 --> 00:02:13,540
the truth is that this cubic fit is kind of a bit too flexible,

37
00:02:13,540 --> 00:02:17,440
a bit too complicated for the regions where we have smaller houses.

38
00:02:17,440 --> 00:02:21,188
It looks like it fits very well for our large houses but

39
00:02:21,188 --> 00:02:25,620
it's a little bit too complex for lower values of square feet.

40
00:02:25,620 --> 00:02:28,822
Because really, for this data, you could describe it

41
00:02:28,822 --> 00:02:33,790
as having a linear relationship, like I talked about, for low square feet values.

42
00:02:33,790 --> 00:02:38,770
And then, just a constant relationship between square feet and value for

43
00:02:38,770 --> 00:02:40,930
some other region.

44
00:02:40,930 --> 00:02:45,950
And then, having this maybe quadratic relationship between square feet and

45
00:02:45,950 --> 00:02:49,750
house value when you get to these really, really large houses.

46
00:02:49,750 --> 00:02:54,179
So this motivates this idea of wanting to fit our function locally to

47
00:02:54,179 --> 00:02:56,755
different regions of the input space.

48
00:02:56,755 --> 00:03:01,990
Or have the flexibility to have a more local description of what's

49
00:03:01,990 --> 00:03:07,480
going on than our models which did these global fits allowed.

50
00:03:07,480 --> 00:03:08,870
So what are we gonna do?

51
00:03:08,870 --> 00:03:14,060
So we want to flexibly define our f(x), the relationship in this

52
00:03:14,060 --> 00:03:18,890
case between square feet and house value, to have this type of local structure.

53
00:03:18,890 --> 00:03:23,556
But let's say we don't want to assume that there are what are called structural

54
00:03:23,556 --> 00:03:27,728
breaks, that there are certain change points where the structure of our

55
00:03:27,728 --> 00:03:29,510
regression is gonna change.

56
00:03:29,510 --> 00:03:33,530
In that case, you'd have to infer where those break points are.

57
00:03:33,530 --> 00:03:37,628
Instead let's consider a really simple approach that works well when you have

58
00:03:37,628 --> 00:03:38,379
lots of data.

59
00:03:38,379 --> 00:03:42,519
[MUSIC]
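
The global fits described in this segment (constant, line, quadratic, cubic) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, and the regime boundaries (2,000 and 3,500 square feet), the slopes, and the noise level are all made up here rather than taken from the lecture. The helper names `make_house_data`, `global_poly_fit`, and `rmse` are hypothetical.

```python
import numpy as np


def make_house_data(n=150, seed=0):
    """Synthetic (square feet, price) pairs; every number here is made up."""
    rng = np.random.default_rng(seed)
    sqft = np.sort(rng.uniform(500, 5000, n))
    price = np.where(
        sqft < 2000,
        100.0 * sqft,                                # small houses: linear
        np.where(
            sqft < 3500,
            200_000.0,                               # mid-size houses: flat
            200_000.0 + 0.2 * (sqft - 3500.0) ** 2,  # large houses: quadratic
        ),
    )
    # Add Gaussian noise so the points scatter around the assumed curve.
    return sqft, price + rng.normal(0.0, 10_000.0, n)


def global_poly_fit(sqft, price, degree):
    """Least-squares polynomial fit over the whole input range."""
    x = sqft / 1000.0  # rescale so higher powers stay well conditioned
    coeffs = np.polyfit(x, price, degree)
    return coeffs, np.polyval(coeffs, x)


def rmse(actual, predicted):
    """Root mean squared error between two arrays."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


sqft, price = make_house_data()
small = sqft < 2000  # the regime assumed to be linear in this sketch
for degree, name in enumerate(["constant", "line", "quadratic", "cubic"]):
    _, fitted = global_poly_fit(sqft, price, degree)
    print(f"{name:9s}  overall RMSE: {rmse(price, fitted):10.0f}  "
          f"small-house RMSE: {rmse(price[small], fitted[small]):10.0f}")
```

Each fit uses a single set of coefficients for the entire range of square feet, which is what "global" means here. Comparing the overall error with the error restricted to small houses gives a rough numeric view of how well one curve serves one region of the input space.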