1 00:00:00,000 --> 00:00:04,530 [MUSIC] 2 00:00:04,530 --> 00:00:08,638 The notation that we've used so far doesn't have features associated with it, 3 00:00:08,638 --> 00:00:12,367 but just like in the regression course, we're going to focus on introducing 4 00:00:12,367 --> 00:00:14,840 features from the very beginning. 5 00:00:14,840 --> 00:00:18,550 So we're going to have these functions h1 through h capital D. 6 00:00:18,550 --> 00:00:21,950 They define some features we might extract from the data, 7 00:00:21,950 --> 00:00:24,430 and we are going to encode the constant function as h0. 8 00:00:25,810 --> 00:00:30,765 So in particular, we are going to have the score as w0 h0 9 00:00:30,765 --> 00:00:36,039 + w1 h1 + w2 h2, all the way to w capital D 10 00:00:36,039 --> 00:00:39,593 h capital D. So a feature could be a constant, h0; 11 00:00:39,593 --> 00:00:44,803 it could be #awesomes for h1, #awfuls for h2. 12 00:00:44,803 --> 00:00:50,830 It could be some transformation, like the log of the number of awesomes times the number of bads. 13 00:00:50,830 --> 00:00:55,775 Or, more realistically, it could be the TF-IDF of the number of 14 00:00:55,775 --> 00:01:01,210 awfuls, which helps us emphasize words that are more distinctive or 15 00:01:01,210 --> 00:01:05,930 important. We looked at TF-IDF in the first course and explored it quite a bit, 16 00:01:05,930 --> 00:01:09,260 and we're going to revisit that in the next course. 17 00:01:10,670 --> 00:01:18,480 So now we have this prediction score, which is based on the sum over the features 18 00:01:19,830 --> 00:01:25,260 of the coefficient wj times the feature hj. 19 00:01:25,260 --> 00:01:31,400 And we're going to use the shorthand w transpose h(xi) to denote the score. 20 00:01:31,400 --> 00:01:32,938 So you'll see me do that a lot.
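Written out, the score described so far is a weighted sum of the features. The formula below is a sketch of that notation; writing the features as functions of the data point x_i and stacking them into a vector h(x_i) follows the spoken shorthand rather than anything shown in the transcript itself.

```latex
\[
\text{Score}(x_i)
  = w_0 h_0(x_i) + w_1 h_1(x_i) + w_2 h_2(x_i) + \dots + w_D h_D(x_i)
  = \sum_{j=0}^{D} w_j h_j(x_i)
  = w^{\top} h(x_i)
\]
```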
21 00:01:32,938 --> 00:01:37,490 w transpose h(xi) denotes the score for 22 00:01:37,490 --> 00:01:40,980 that particular data point, and if that score is greater than 0 we're going to say it's 23 00:01:40,980 --> 00:01:43,990 positive, and if the score is less than 0, we're going to say it's negative. 24 00:01:45,450 --> 00:01:49,600 Very good, so now we've introduced our model in a little bit more detail. 25 00:01:49,600 --> 00:01:52,606 And I'm always going to take the input data x, 26 00:01:52,606 --> 00:01:56,100 feed it through the feature-generating function, 27 00:01:56,100 --> 00:02:01,316 which might be counting the number of awesomes or creating a TF-IDF model. 28 00:02:01,316 --> 00:02:05,124 I'm going to feed that to the machine learning model, which is going to multiply 29 00:02:05,124 --> 00:02:09,748 the features by the learned weights w hat and output a value, a score, which 30 00:02:09,748 --> 00:02:14,032 we're going to pass through the sign function to output y hat, which is 31 00:02:14,032 --> 00:02:17,522 plus one for positive reviews or minus one for negative reviews. 32 00:02:17,522 --> 00:02:21,667 [MUSIC]
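The pipeline described in the lecture (input x, feature-generating function h, score w transpose h(x), then the sign) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `h`, `score`, and `predict`, the example review, and the weights in `w_hat` are all hypothetical, and the features are simple word counts rather than TF-IDF. The lecture does not say what happens when the score is exactly 0; the sketch maps that case to +1.

```python
import re

def h(review):
    """Feature-generating function: returns [h0, h1, h2] for one review."""
    words = re.findall(r"[a-z']+", review.lower())
    return [
        1.0,                            # h0: constant feature
        float(words.count("awesome")),  # h1: number of "awesome"
        float(words.count("awful")),    # h2: number of "awful"
    ]

def score(w, features):
    """w transpose h(x): the sum over j of w_j * h_j(x)."""
    return sum(w_j * h_j for w_j, h_j in zip(w, features))

def predict(w, review):
    """Sign of the score: +1 for a positive review, -1 for a negative one."""
    # Assumption: a score of exactly 0 is mapped to +1; the lecture only
    # covers the strictly positive and strictly negative cases.
    return 1 if score(w, h(review)) >= 0 else -1

# Hypothetical learned weights [w0, w1, w2], chosen only for illustration.
w_hat = [0.0, 1.0, -1.5]

review = "The sushi was awesome, the food was awesome, but the service was awful."
print(h(review))                # feature vector h(x)
print(score(w_hat, h(review)))  # score w_hat transpose h(x)
print(predict(w_hat, review))   # predicted label y hat
```

With these made-up weights, each "awesome" pushes the score up and each "awful" pulls it down more strongly, so a review needs more than one "awesome" per "awful" to be classified as positive.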