So, to start with, let's look at the machine learning model: the multiple regression model. And let's just recall where we left off in the last module, where we were talking about simple linear regression, where our goal was just to fit a line to the data.

There we had a single input. In our example, we always talked about square feet, trying to model the relationship between the square feet of a house and the output, which was the value of the house.

But as the name implies, this simple linear regression model is really simple, and in a lot of cases we're going to be interested in more complex functions of our input. One example of this is something called polynomial regression, which we actually saw back in the first course of the specialization. In that case, what we did was take our simple linear regression model and fit it to the data. Of course, at that time we didn't have all the terminology that we learned in the last module, but now we know that this is a simple linear regression model.

We take this fit and show it to our friend and say, hey, look, this is so cool: I have this line that I fit to my data, and now I can predict the value of my house. And your friend is a little bit skeptical and says, dude, it's not a linear relationship between square feet and the value of a house. He's looking at the data and he just doesn't believe it.

Instead, he thinks it's a quadratic fit. So what your friend is saying is that he doesn't believe the model you used. He doesn't believe that it's just this linear relationship, of course plus error. He thinks there's a quadratic function, y_i = w0 + w1 x_i + w2 x_i^2 + epsilon_i, underlying the relationship between square feet and house value. And again, our regression model is going to assume that there's some noise around that.

But of course, you could consider even higher-order polynomials. For example, here I'm showing some pth-order polynomial that you might choose as your model of the relationship between square feet and the value of the house.
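To make the friend's objection concrete, here is a minimal sketch, not from the lecture, that compares a linear fit against a quadratic fit on synthetic house data. The square-foot values and the data-generating coefficients are made up purely for illustration:

```python
import numpy as np

# Synthetic house data (hypothetical numbers, for illustration only):
# value is roughly quadratic in square feet, plus noise.
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, size=100)
value = 50_000 + 20 * sqft + 0.05 * sqft**2 + rng.normal(0, 30_000, size=100)

# Fit a line (simple linear regression) and a quadratic (polynomial regression).
linear_fit = np.polyfit(sqft, value, deg=1)      # [w1, w0]
quadratic_fit = np.polyfit(sqft, value, deg=2)   # [w2, w1, w0]

# Compare residual sums of squares: the quadratic should track this data better.
for name, coeffs in [("linear", linear_fit), ("quadratic", quadratic_fit)]:
    residuals = value - np.polyval(coeffs, sqft)
    print(f"{name}: RSS = {np.sum(residuals**2):.3e}")
```

On data generated with a quadratic trend like this, the quadratic model's residual sum of squares comes out noticeably lower, which is exactly the pattern the skeptical friend is pointing at.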
So here's our generic polynomial regression model, where we take our observation y_i and model it as a polynomial in terms of, for example, the square feet of our house, which is just some input x:

y_i = w0 + w1 x_i + w2 x_i^2 + ... + wp x_i^p + epsilon_i

And then we assume that there's some error, epsilon_i: that's the error associated with the ith observation. What we see is that in this model, in contrast to our simple linear regression model, we have all these powers of x now appearing.

And what we can do is treat these different powers of x as features. Okay, so now we're introducing this new word, features. Features are just some function of your input. In this case in particular, to be very explicit, the first feature of the model is just the number 1, which is called the constant feature. The second feature of our model is x, so that's just the linear term, just like we had in simple linear regression. Our third feature is x squared. And we keep going up to our (p+1)st feature, which is x to the power p.

Associated with each one of these features in our model is a parameter. So we have p+1 parameters: w0, which is just the intercept term, all the way up to wp, the coefficient associated with the pth power of our input.
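As a sketch of the feature construction just described (my own illustration, not code from the course), you can build the p+1 polynomial features explicitly, one column per feature, and then solve for the p+1 parameters w0 through wp with ordinary least squares. The data values here are hypothetical:

```python
import numpy as np

def polynomial_features(x, p):
    """Build the feature matrix [1, x, x**2, ..., x**p], one column per feature."""
    return np.column_stack([x**j for j in range(p + 1)])

# Hypothetical data: x is square feet, y is house value.
x = np.array([1000.0, 1500.0, 2000.0, 2500.0, 3000.0])
y = np.array([300_000.0, 380_000.0, 490_000.0, 630_000.0, 800_000.0])

p = 2                                   # quadratic model
H = polynomial_features(x, p)           # shape (5, p + 1)
w, *_ = np.linalg.lstsq(H, y, rcond=None)
print(w)                                # [w0, w1, w2]: intercept, linear, quadratic
```

Note how the constant feature (the column of 1s) is what gives the model its intercept w0, matching the feature list above.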