1 00:00:00,025 --> 00:00:04,556 [MUSIC] 2 00:00:04,556 --> 00:00:08,219 We've learned now, together, we've learned two regression models from our data, 3 00:00:08,219 --> 00:00:09,924 one just based on square foot of living, 4 00:00:09,924 --> 00:00:12,510 the other one based on these more advanced features. 5 00:00:12,510 --> 00:00:14,460 So let's apply these models. 6 00:00:14,460 --> 00:00:18,400 Let's see what it looks like when we use them in practice to predict house prices. 7 00:00:18,400 --> 00:00:23,655 So, what we're going to do is apply this learned 8 00:00:23,655 --> 00:00:28,526 models to predict prices of three houses from 9 00:00:28,526 --> 00:00:33,155 the data set, so three different houses. 10 00:00:33,155 --> 00:00:38,535 So the first house is, and this is three that I picked 11 00:00:38,535 --> 00:00:43,919 that had different sizes, different properties, 12 00:00:43,919 --> 00:00:49,930 they look a bit different, in different neighborhoods. 13 00:00:49,930 --> 00:00:51,180 So let's just look at them. 14 00:00:51,180 --> 00:00:56,190 So the first house is from our data and 15 00:01:00,170 --> 00:01:06,660 so what we're gonna do with this house is, so this is for not data but sales data. 16 00:01:06,660 --> 00:01:11,792 So out of my sales data, I'm going to 17 00:01:11,792 --> 00:01:19,251 select the house whose id is equal to a particular id. 18 00:01:19,251 --> 00:01:24,119 And so the county record gives us string ids for every house, it gives a number. 19 00:01:24,119 --> 00:01:30,160 So this one is 5309, I'm copying from a piece of paper here, 101200. 20 00:01:30,160 --> 00:01:35,540 So, a particular house for a particular id, and so I'm selecting it 21 00:01:36,990 --> 00:01:42,440 and if I type house1, you'll see what it is. 22 00:01:42,440 --> 00:01:45,165 So it's the house that has this particular ID. 23 00:01:45,165 --> 00:01:48,430 It got sold on this date for $620,000. 24 00:01:48,430 --> 00:01:52,250 It had four bedrooms, two and 25 00:01:52,250 --> 00:01:57,150 a quarter bathrooms, 2,400 square feet of living space, and so on. 26 00:01:58,610 --> 00:02:04,060 Now what you can do is in iPython notebook, 27 00:02:04,060 --> 00:02:11,050 you can embed not only mark ups and texts in the Python code. 28 00:02:11,050 --> 00:02:13,340 But you can also embed HTML and images. 29 00:02:13,340 --> 00:02:16,800 So let me just do that for you right now, just as an example. 30 00:02:16,800 --> 00:02:23,970 So I'm typing xm and then I am going to add an image to my notebook and this image 31 00:02:23,970 --> 00:02:28,008 is an image that I downloaded from the county records for this particular house. 32 00:02:28,008 --> 00:02:34,330 So, the image is on my directory, it's called house. 33 00:02:37,862 --> 00:02:41,761 So, house, 53, what was the number? 34 00:02:41,761 --> 00:02:47,740 09101200, so this is gonna be pretty cool. 35 00:02:47,740 --> 00:02:52,569 When I see a picture of the house we're trying to make a prediction for, oops, 36 00:02:52,569 --> 00:02:54,998 I must have typed that wrong oh, yeah. 37 00:02:54,998 --> 00:02:56,321 I forgot it's .jpg. 38 00:02:58,862 --> 00:03:02,880 And that's pretty cool, here's a picture of the house we're pricing. 39 00:03:02,880 --> 00:03:07,090 So this house was sold in 2014 for $620,000. 40 00:03:07,090 --> 00:03:09,830 It has four bedrooms, two and a quarter bathrooms. 41 00:03:09,830 --> 00:03:13,370 It was built, for example, in 1929. 42 00:03:13,370 --> 00:03:20,473 Now, let's see what our model predicts. 43 00:03:20,473 --> 00:03:24,300 So, We had two models. 44 00:03:24,300 --> 00:03:26,780 So, we had first. 45 00:03:26,780 --> 00:03:28,100 So the true price of the house. 46 00:03:28,100 --> 00:03:33,411 And just a print out to remind ourselves, so this is house1 of price. 47 00:03:34,560 --> 00:03:41,318 This is 200 so if I just type print here, it will look a little nicer. 48 00:03:41,318 --> 00:03:46,370 $620,000, this is the house price. 49 00:03:46,370 --> 00:03:51,670 Then let's see what my first model predicts. 50 00:03:51,670 --> 00:03:55,745 So that was the simple square foot model that we built. 51 00:03:55,745 --> 00:04:00,446 When you they it, so whenI typed .predict, 52 00:04:00,446 --> 00:04:05,640 the price of house1, and it says it predicts it as 53 00:04:05,640 --> 00:04:10,973 $628,000, so pretty close, actually. 54 00:04:10,973 --> 00:04:11,720 That's really good. 55 00:04:13,430 --> 00:04:17,000 And let's see what adding more features did for us. 56 00:04:18,380 --> 00:04:22,721 my_features_model.predict:house1, so 57 00:04:22,721 --> 00:04:27,163 you remember on average, the my features model, 58 00:04:27,163 --> 00:04:32,150 adding more features, gives you better performance. 59 00:04:34,300 --> 00:04:39,769 Now, if you look at the prediction here, it's $720,000, so the square 60 00:04:39,769 --> 00:04:43,760 foot model had a better prediction than the one with more features. 61 00:04:43,760 --> 00:04:46,465 So what we learned here is, even though on average, 62 00:04:46,465 --> 00:04:48,980 adding more features give a better prediction. 63 00:04:48,980 --> 00:04:50,670 For this one particular house, 64 00:04:50,670 --> 00:04:54,520 the simple model did a better job than the more advanced model. 65 00:04:54,520 --> 00:04:58,070 But this is to be expected, and not always, but 66 00:04:58,070 --> 00:05:01,740 sometimes gonna do better, probably works, when average is going to do better. 67 00:05:03,675 --> 00:05:07,939 [MUSIC]