AdaBoost uses this slightly intimidating formula to figure out what ŵ_t should be, but it turns out to be pretty intuitive if you look at it in a bit more detail. This formula is derived from a famous theorem, the AdaBoost theorem, which I want to mention very briefly towards the end of the module. It's the formula that lets you find classifiers that keep getting better and better, and helps boosting get to the optimal solution. So let's look at it in a little more detail by exploring a few possible cases.

The question is: is f_t good? If f_t is really good, it has really low error on the training data, that is, low weighted error. For example, if the weighted error is 0.01, it's a really good classifier. First, let's see what happens to the famous middle term, (1 - weighted error)/weighted error, when the weighted error is 0.01. The middle term is (1 - 0.01)/0.01, which is equal to 99. Next, to complete ŵ_t, we take one half times the log of 99, and if you do one half times the log of 99, you get about 2.3. So this was an excellent classifier, and we gave it a weight of 2.3, which is high.

Now, let's see what happens if we output a random classifier. As we said, a random classifier has a weighted error of 0.5 and is not something to be trusted. If you plug that in, (1 - 0.5)/0.5 yields the magic number 1. And what's one half of the log of 1? Log of 1 is 0, so ŵ_t is 0. So what we learn is that if a classifier is just random, it's not doing anything meaningful, and we give it zero weight. We say: you're terrible, we're going to ignore you. You might have friends who are kind of like this. They say random stuff, you never trust what they say, and you put zero weight on their opinions. That's what AdaBoost does, too.
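To make the arithmetic concrete, here is a minimal Python sketch of the coefficient formula from the lecture. The function name adaboost_coefficient is just an illustrative choice, and math.log is the natural log, which matches the 2.3 value computed above.

```python
import math

def adaboost_coefficient(weighted_error):
    """Lecture formula: w_hat_t = 1/2 * ln((1 - weighted_error) / weighted_error)."""
    return 0.5 * math.log((1.0 - weighted_error) / weighted_error)

print(adaboost_coefficient(0.01))  # excellent classifier: (1 - 0.01)/0.01 = 99 -> ~2.3
print(adaboost_coefficient(0.5))   # random classifier:    (1 - 0.5)/0.5  = 1  -> 0.0
```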
Now we get to a really, really, really interesting case. Let's suppose that your classifier is terrible: it gets a weighted error of 0.99. It's getting almost everything wrong; it's worse than random. Let's see what happens to the middle term of our equation. You get (1 - 0.99)/0.99, which is approximately 0.01, and guess what happens when you take one half of the log of 0.01? You get -2.3. When I first saw this, I thought, wow, this AdaBoost theorem is beautiful. But take a moment to internalize what just happened. We had this terrible classifier, and yet we gave it a pretty high weight, 2.3, but with a negative sign. Why is that? Because a terrible, terrible classifier might be terrible, but if we take -f_t, that is, if we do exactly the opposite of what it says, it's an awesome classifier. In other words, if we invert a classifier, we're going to do awesomely, and AdaBoost automatically does that for you. This is, again, the friend analogy. You might have a friend who always has really strong opinions, but they're always wrong. So we do exactly the opposite of what that person says. Maybe this is how you treat your parents, or some friends: they say, I should do A, and you do the opposite of that, and by doing that, you might do great things in the world. AdaBoost automatically figures that out for you, which is awesome.

Now let's revisit the AdaBoost algorithm that we've been talking about. In this part of the module, we've been exploring how to compute the coefficient ŵ_t, and we saw that it can be computed by this really simple formula: we compute the weighted error of f_t, and ŵ_t is one half of the log of 1 minus the weighted error, divided by the weighted error. With that, we have ŵ_t, and we can focus on figuring out how to come up with the alpha_i's. We want the alpha_i's to be high where f_t makes mistakes, and low where it does well.
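Putting the three cases together, here is a hedged Python sketch of why a negative coefficient effectively inverts a classifier's vote, followed by a preview of the standard AdaBoost data-weight update that the module turns to next. The helper names here are mine, not from the lecture.

```python
import math

def adaboost_coefficient(weighted_error):
    # w_hat_t = 1/2 * ln((1 - err) / err), as in the lecture.
    return 0.5 * math.log((1.0 - weighted_error) / weighted_error)

w_hat = adaboost_coefficient(0.99)   # terrible classifier -> ~-2.3
print(w_hat)

# In the ensemble vote sign(sum_t w_hat_t * f_t(x)), a negative coefficient
# flips the classifier's prediction: -2.3 * (-1) contributes +2.3, exactly as
# if we had inverted f_t and trusted it with weight +2.3.

def update_data_weights(alpha, correct, w_hat):
    """Standard AdaBoost update, previewed here before the module derives it:
    multiply each point's weight up on a mistake, down on a correct prediction,
    then normalize so the weights sum to 1."""
    new_alpha = [a * math.exp(-w_hat if c else w_hat)
                 for a, c in zip(alpha, correct)]
    total = sum(new_alpha)
    return [a / total for a in new_alpha]

# Toy usage: four points with uniform weights, and f_t got the last point wrong.
alpha = [0.25, 0.25, 0.25, 0.25]
print(update_data_weights(alpha, [True, True, True, False], w_hat=1.0))
```

Running this shows the mistake point's weight rising well above the others, which is exactly the "alpha_i high where f_t makes mistakes" behavior the lecture closes on.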