1 00:00:00,000 --> 00:00:04,511 [MUSIC] 2 00:00:04,511 --> 00:00:09,270 Okay, so we've talked about this simple linear regression model, just a line. 3 00:00:09,270 --> 00:00:12,690 And then we talked about how our 4 00:00:12,690 --> 00:00:17,100 goal in fitting this model is gonna be to search over all possible lines, and 5 00:00:17,100 --> 00:00:20,330 find the line that minimizes the residual sum of squares. 6 00:00:20,330 --> 00:00:23,660 We haven't talked about how we're gonna do that, that's gonna come next. 7 00:00:23,660 --> 00:00:25,570 We're gonna talk about specific algorithms for 8 00:00:25,570 --> 00:00:29,450 searching over all possible lines to minimize residual sum of squares. 9 00:00:29,450 --> 00:00:33,310 But right now, let's just assume that we have some fitted line and 10 00:00:33,310 --> 00:00:37,500 let's talk about how we're gonna use it and how we can interpret the parameters. 11 00:00:39,200 --> 00:00:42,871 Let's start by contrasting the model with the fitted line. 12 00:00:42,871 --> 00:00:48,420 So, our model, which I've written again here, 13 00:00:48,420 --> 00:00:51,465 is in terms of parameters. 14 00:00:51,465 --> 00:00:56,808 These are parameters, 15 00:00:56,808 --> 00:01:00,816 they're unknown 16 00:01:00,816 --> 00:01:06,432 variables of our model. 17 00:01:06,432 --> 00:01:11,823 And our goal is to estimate these parameters, fix values of these 18 00:01:11,823 --> 00:01:17,421 variables based on our data, and so that's what w hat represents. 19 00:01:17,421 --> 00:01:21,534 These are estimated parameters, 20 00:01:21,534 --> 00:01:25,084 so these take actual values. 21 00:01:31,941 --> 00:01:35,893 So for example, maybe our 22 00:01:35,893 --> 00:01:40,925 intercept is -44,850 and 23 00:01:40,925 --> 00:01:45,245 our slope is 280.76. 24 00:01:45,245 --> 00:01:48,429 And these aren't random numbers that I generated here, 25 00:01:48,429 --> 00:01:50,360 this comes from the first course. 
26 00:01:51,880 --> 00:01:59,750 If you look at the notebook for doing the regression fit of sales price on 27 00:01:59,750 --> 00:02:04,290 square feet, these were the numbers that came out for these two parameters. 28 00:02:05,380 --> 00:02:11,630 Okay, so these estimated parameters define a specific line. 29 00:02:11,630 --> 00:02:12,740 Okay, that's what this is. 30 00:02:12,740 --> 00:02:20,063 This line represents -44,850 31 00:02:20,063 --> 00:02:25,110 + 280.76 times x. 32 00:02:25,110 --> 00:02:28,250 So, a model is in terms of some parameters and 33 00:02:28,250 --> 00:02:33,020 a fitted line is a specific example within that model class. 34 00:02:33,020 --> 00:02:35,330 So, it's a specific line. 35 00:02:35,330 --> 00:02:39,090 Now let's talk about how we can think about using our fitted line. 36 00:02:40,800 --> 00:02:44,190 So one thing that we can ask is 37 00:02:44,190 --> 00:02:48,760 to predict the value of this house that we'd like to list for sale. 38 00:02:48,760 --> 00:02:51,990 So this house has some number of square feet, and 39 00:02:51,990 --> 00:02:55,960 we'd like to guess the price, and how are we gonna guess the price? 40 00:02:55,960 --> 00:02:57,091 Well, it's very straightforward. 41 00:02:57,091 --> 00:03:01,031 We're just gonna plug into our fitted line. 42 00:03:01,031 --> 00:03:07,781 And so we're gonna take the square feet of this house, which is represented here. 43 00:03:07,781 --> 00:03:11,970 Plug that into the equation of this line, and 44 00:03:11,970 --> 00:03:18,270 that's gonna produce our estimated value of the house right here. 45 00:03:18,270 --> 00:03:21,888 I want to mention one more thing here. 46 00:03:21,888 --> 00:03:27,425 So I wanna mention that y hat, that's our predicted value of the house. 47 00:03:27,425 --> 00:03:33,620 Y hat is exactly equal to f hat(x), okay? 48 00:03:33,620 --> 00:03:37,400 Our model, remember our model said that 49 00:03:39,870 --> 00:03:43,600 yi is approximately equal to f(xi). 
50 00:03:46,650 --> 00:03:50,850 But when we go to do a prediction, y hat. 51 00:03:50,850 --> 00:03:54,700 That's exactly equal to f hat(x). 52 00:03:54,700 --> 00:03:55,756 And why is that? 53 00:03:55,756 --> 00:04:03,980 That's because the error is equally likely to be above or below the line. 54 00:04:03,980 --> 00:04:09,841 And f hat represents our best guess of the line given our data. 55 00:04:09,841 --> 00:04:14,353 And so if we wanna guess the value of the house, 56 00:04:14,353 --> 00:04:20,488 we're unsure if that error for that specific house was above or 57 00:04:20,488 --> 00:04:26,871 below the line, so our best guess is to put it exactly on the line. 58 00:04:26,871 --> 00:04:29,883 Well, I can use this fitted line also if I'm a buyer, 59 00:04:29,883 --> 00:04:33,330 not just if I'm a seller just in listing my house for sale. 60 00:04:34,330 --> 00:04:37,780 So for example, what I mean here is that I might have 61 00:04:37,780 --> 00:04:40,680 some amount of money that I can spend on a house. 62 00:04:40,680 --> 00:04:44,710 And I want to know how big of a house can I expect to purchase 63 00:04:44,710 --> 00:04:46,600 with that amount of money. 64 00:04:46,600 --> 00:04:52,370 Well instead of estimating or predicting 65 00:04:52,370 --> 00:04:57,790 the value of the house, I can think of predicting the square feet. 66 00:04:57,790 --> 00:05:01,120 So just using this equation in reverse. 67 00:05:01,120 --> 00:05:03,270 So I have some amount of money in the bank. 68 00:05:04,630 --> 00:05:11,281 And that's gonna be $ in bank is this. 69 00:05:11,281 --> 00:05:15,352 And then I'm gonna look at how many square feet, 70 00:05:20,041 --> 00:05:24,540 I believe that I can purchase with that amount of money. 71 00:05:24,540 --> 00:05:27,960 Okay, so let's just go through a concrete example. 72 00:05:27,960 --> 00:05:31,501 Here's the fitted regression line I talked about before. 73 00:05:31,501 --> 00:05:36,582 So it's -44,850 + 280.76 times x. 
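Written out, the contrast between the model and the fitted line just stated might look like the following. This is only a sketch: the symbols w_0, w_1, and \epsilon_i are assumed notation for the intercept, slope, and per-house error, since the lecture itself only says "w hat" and that yi is approximately equal to f(xi).

```latex
\begin{align*}
\text{model:} \quad & y_i = w_0 + w_1 x_i + \epsilon_i \\
\text{fitted line:} \quad & \hat{y} = \hat{f}(x) = \hat{w}_0 + \hat{w}_1 x = -44{,}850 + 280.76\,x
\end{align*}
```

The first line has unknown parameters and an error term; the second has specific estimated values and no error term, which is why y hat sits exactly on the line.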
74 00:05:36,582 --> 00:05:41,500 And what I'd like to do is predict the value of a house that 75 00:05:41,500 --> 00:05:44,080 has 2,640 square feet. 76 00:05:44,080 --> 00:05:50,120 So what I do is just plug in 2,640 into this equation. 77 00:05:50,120 --> 00:05:52,840 That's my y hat. 78 00:05:56,051 --> 00:06:03,331 And that comes out to be, if you do that calculation, $696,356. 79 00:06:03,331 --> 00:06:06,217 Likewise, I can say, well, 80 00:06:06,217 --> 00:06:11,874 I wanna predict how many square feet of house I can purchase 81 00:06:11,874 --> 00:06:16,740 if I have $859,000 to spend on the house. 82 00:06:16,740 --> 00:06:21,954 So if you do this calculation, you get that you can afford a house or 83 00:06:21,954 --> 00:06:27,185 you expect to buy a house that's roughly 3,219 square feet. 84 00:06:27,185 --> 00:06:31,369 [MUSIC]
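The two calculations in this example can be reproduced with a short Python sketch. The intercept and slope are the values quoted in the lecture; the names W0_HAT, W1_HAT, predict_price, and predict_sqft are made up here for illustration and do not come from the course notebook.

```python
# Estimated parameters quoted in the lecture (sale price regressed on square feet).
W0_HAT = -44850.0  # intercept, in dollars
W1_HAT = 280.76    # slope, in dollars per square foot


def predict_price(sqft: float) -> float:
    """Plug square feet into the fitted line: y_hat = w0_hat + w1_hat * x."""
    return W0_HAT + W1_HAT * sqft


def predict_sqft(dollars: float) -> float:
    """Use the same line in reverse: x = (y - w0_hat) / w1_hat."""
    return (dollars - W0_HAT) / W1_HAT


if __name__ == "__main__":
    # Seller's question: what price do we predict for a 2,640 sq ft house?
    print(f"${predict_price(2640):,.0f}")        # $696,356
    # Buyer's question: how many square feet do we expect for $859,000?
    print(f"{predict_sqft(859000):,.0f} sq ft")  # 3,219 sq ft
```

Both directions use the same two estimated parameters; the reverse direction simply solves the line's equation for x instead of y.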