1 00:00:00,000 --> 00:00:03,499 [MUSIC] 2 00:00:03,499 --> 00:00:08,013 Okay, so far we've assumed that the only feature relevant to the value of my house 3 00:00:08,013 --> 00:00:10,620 is the square feet of the house. 4 00:00:10,620 --> 00:00:13,400 But I dig into the data set a little bit more, and 5 00:00:13,400 --> 00:00:17,670 I look at a house that's supposed to be very similar to mine. 6 00:00:17,670 --> 00:00:20,720 It's a house that has very similar square footage. 7 00:00:20,720 --> 00:00:25,890 And so, this house is definitely having an influence on what the predictions are for 8 00:00:25,890 --> 00:00:27,010 my house. 9 00:00:27,010 --> 00:00:31,050 But I look at the specific listing, and 10 00:00:31,050 --> 00:00:33,650 it shows that that house only had one bathroom. 11 00:00:33,650 --> 00:00:36,650 It's actually quite a big house, only one bathroom. 12 00:00:36,650 --> 00:00:39,990 And I say that's really, really not comparable to my house, 13 00:00:39,990 --> 00:00:41,699 which has three bathrooms. 14 00:00:43,160 --> 00:00:46,810 So the value of that house, that house's sales price, 15 00:00:46,810 --> 00:00:50,210 really shouldn't be indicative of what my house's sales price should be. 16 00:00:52,140 --> 00:00:55,330 So instead what we can think about doing is adding more features. 17 00:00:55,330 --> 00:00:59,050 So instead of just looking at the relationship between square feet and 18 00:00:59,050 --> 00:01:03,120 price, we can add number of bathrooms. 19 00:01:03,120 --> 00:01:03,630 And now for 20 00:01:03,630 --> 00:01:07,100 each one of the listings that I looked at before, I'm gonna have to go through and 21 00:01:07,100 --> 00:01:11,770 record how many square feet that house had, and the number of bathrooms. 22 00:01:11,770 --> 00:01:15,470 And I'm gonna plot each of these points in a 3D space. 23 00:01:15,470 --> 00:01:18,390 Okay? 
So it's this cube of square 24 00:01:18,390 --> 00:01:21,710 feet versus bathrooms versus price. 25 00:01:21,710 --> 00:01:26,130 And now instead of fitting a line to the data, if I'm thinking about just a very 26 00:01:26,130 --> 00:01:30,360 simple model, I can think about fitting a hyperplane. 27 00:01:30,360 --> 00:01:33,370 Okay, so it's just a slice through the space. 28 00:01:33,370 --> 00:01:34,010 We're here. 29 00:01:34,010 --> 00:01:38,690 This is the equation of the hyperplane, and this is the equation of this plane. 30 00:01:38,690 --> 00:01:43,120 So we have w0, which is our intercept, just where this plane lives up and 31 00:01:43,120 --> 00:01:45,110 down on the y-axis. 32 00:01:45,110 --> 00:01:47,280 And we have w1 times the number of square feet, and 33 00:01:47,280 --> 00:01:50,260 w2 times the number of bathrooms. 34 00:01:50,260 --> 00:01:51,730 But a question is where do we stop? 35 00:01:51,730 --> 00:01:56,340 Do we just want to include the number of bathrooms as our additional feature? 36 00:01:56,340 --> 00:01:58,546 There are lots of things we could think about including. 37 00:01:58,546 --> 00:02:02,244 We could think about, in addition to our square feet and number of bathrooms, 38 00:02:02,244 --> 00:02:05,882 the number of bedrooms, the lot size, how old the house is, and 39 00:02:05,882 --> 00:02:07,074 the list goes on and on 40 00:02:07,074 --> 00:02:10,893 in terms of different properties of the house that could be influential in 41 00:02:10,893 --> 00:02:12,193 assessing its value. 42 00:02:12,193 --> 00:02:17,660 But we're gonna actually hold off on this question of looking at which features 43 00:02:17,660 --> 00:02:23,220 are important for this regression task, until we get to the regression course. 44 00:02:23,220 --> 00:02:25,820 So go to the regression course to learn more about this topic. 45 00:02:25,820 --> 00:02:29,979 [MUSIC]
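
To make the plane described above concrete, here is a minimal sketch of fitting price = w0 + w1 * (square feet) + w2 * (bathrooms) by least squares. The listings, the weights in `true_w`, and the helper names `fit_plane` and `predict` are all made up for illustration; they are not from the lecture's data set, and the lecture does not say which fitting method or library is used.

```python
import numpy as np

# Hypothetical listings (made-up numbers, not the lecture's data).
sqft = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
bathrooms = np.array([1.0, 2.0, 1.0, 3.0, 2.0])

# Assumed weights (w0, w1, w2) used only to generate noise-free prices,
# so the fit below can be checked against a known answer.
true_w = np.array([50000.0, 200.0, 15000.0])
price = true_w[0] + true_w[1] * sqft + true_w[2] * bathrooms


def fit_plane(sqft, bathrooms, price):
    """Least-squares fit of price = w0 + w1*sqft + w2*bathrooms."""
    # The column of ones lets w0 act as the intercept.
    X = np.column_stack([np.ones_like(sqft), sqft, bathrooms])
    w, *_ = np.linalg.lstsq(X, price, rcond=None)
    return w


def predict(w, sqft, bathrooms):
    """Evaluate the fitted plane at one house's features."""
    return w[0] + w[1] * sqft + w[2] * bathrooms


w = fit_plane(sqft, bathrooms, price)
# Predicted price for a hypothetical 2000 sq ft house with three bathrooms.
my_house_price = predict(w, 2000.0, 3.0)
```

With more features (bedrooms, lot size, age of the house), the same sketch would just add more columns to `X` and more entries to `w`.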