1 00:00:00,000 --> 00:00:04,298 [MUSIC] 2 00:00:04,298 --> 00:00:07,320 And so let's actually use the model. 3 00:00:08,430 --> 00:00:15,960 So, now that we've trained a linear regression model, let's evaluate it. 4 00:00:15,960 --> 00:00:20,834 So, what we're gonna do next is 5 00:00:20,834 --> 00:00:25,540 #Evaluate the simple model. 6 00:00:25,540 --> 00:00:26,680 So how do we evaluate it? 7 00:00:26,680 --> 00:00:28,900 We're gonna look at the test data. 8 00:00:28,900 --> 00:00:32,411 So, remember we had split off the test data. 9 00:00:32,411 --> 00:00:34,980 Let's understand the test data a little bit. 10 00:00:34,980 --> 00:00:37,380 So, for example, let's print. 11 00:00:38,940 --> 00:00:43,090 For the test data, for 12 00:00:43,090 --> 00:00:47,210 the price column, what's the average price? 13 00:00:47,210 --> 00:00:48,330 What's the mean price? 14 00:00:50,050 --> 00:00:53,230 So this just computes the average price, and the average price for 15 00:00:53,230 --> 00:00:56,790 the test data, for this data from Seattle, is $543,000. 16 00:00:56,790 --> 00:01:03,230 That's how much the average house costs; it's pretty expensive, actually. 17 00:01:03,230 --> 00:01:06,760 Now, we've built a square foot model, and so 18 00:01:06,760 --> 00:01:11,080 what we want to do is evaluate it on this test data. 19 00:01:11,080 --> 00:01:16,080 So we're going to take the sqft_model 20 00:01:16,080 --> 00:01:21,150 that we built and we're gonna call what's called the evaluate function, 21 00:01:21,150 --> 00:01:25,910 which can take a test data set and print out or 22 00:01:25,910 --> 00:01:30,400 return some statistics of how well that fit is doing. 23 00:01:32,370 --> 00:01:34,310 So let's do that. 24 00:01:36,140 --> 00:01:40,240 So, I'm actually going to type print at the beginning, 25 00:01:40,240 --> 00:01:42,860 because it formats it a little bit nicer. 26 00:01:42,860 --> 00:01:50,850 And you'll see that the maximum error over all test houses was $4.1 million. 
27 00:01:50,850 --> 00:01:55,040 So there was one house that was an outlier; it was really badly predicted. 28 00:01:55,040 --> 00:01:59,856 And the average error, so the RMSE, the root mean squared error, 29 00:01:59,856 --> 00:02:01,496 we talked about this. 30 00:02:01,496 --> 00:02:06,941 Emily talked about this with us during the lectures, 31 00:02:06,941 --> 00:02:11,550 is $255,000, so that's the RMSE. 32 00:02:11,550 --> 00:02:18,103 So we built this simple model, we tested it, and it has a pretty high RMSE, 33 00:02:18,103 --> 00:02:24,791 but let's look at some predictions it tries to make from the data. 34 00:02:24,791 --> 00:02:29,099 [MUSIC]
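
The two statistics mentioned in this segment, the maximum error and the RMSE, can be sketched in plain Python. This is only an illustration of what those numbers mean, not the library code used in the video: the function name `evaluate_predictions` and the three house prices are hypothetical, and the real notebook presumably gets the same quantities by calling the evaluate function on `sqft_model` with the test data.

```python
import math


def evaluate_predictions(predictions, targets):
    """Return the maximum absolute error and the RMSE of the predictions.

    Hypothetical helper for illustration; not the library's evaluate function.
    """
    if len(predictions) != len(targets) or not targets:
        raise ValueError("predictions and targets must be non-empty and equal in length")

    # Per-house error: predicted price minus actual price.
    errors = [p - t for p, t in zip(predictions, targets)]

    # Maximum error: the single worst prediction, ignoring sign.
    max_error = max(abs(e) for e in errors)

    # RMSE: square each error, average the squares, then take the square root.
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

    return {"max_error": max_error, "rmse": rmse}


# Made-up prices, only to show the calls; these are not the Seattle data.
actual_prices = [300_000.0, 450_000.0, 1_200_000.0]
predicted_prices = [320_000.0, 430_000.0, 800_000.0]

# Mean price of the (made-up) test set, analogous to the mean of the price column.
print(sum(actual_prices) / len(actual_prices))

# Max error and RMSE of the (made-up) predictions.
print(evaluate_predictions(predicted_prices, actual_prices))
```

Because the errors are squared before averaging, one badly predicted house pulls the RMSE up much more than many small misses do, which is consistent with the outlier described above sitting alongside a high RMSE.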