1 00:00:00,182 --> 00:00:04,352 [MUSIC] 2 00:00:04,352 --> 00:00:05,733 >> Very good. 3 00:00:05,733 --> 00:00:08,950 Now, we learned this abstract concept of 4 00:00:08,950 --> 00:00:12,863 squeezing the score into the interval 0, 1. 5 00:00:12,863 --> 00:00:15,330 And we called it generalized linear models. 6 00:00:15,330 --> 00:00:16,570 It's pretty abstract. 7 00:00:16,570 --> 00:00:19,960 Logistic regression is a specific case of that, 8 00:00:19,960 --> 00:00:23,720 where we use what's called the logistic function to squeeze minus infinity to plus 9 00:00:23,720 --> 00:00:28,620 infinity into the interval 0, 1 so we can predict probabilities for every class. 10 00:00:30,120 --> 00:00:32,655 For logistic regression, the link function we'll use 11 00:00:32,655 --> 00:00:35,400 is called the logistic function. 12 00:00:35,400 --> 00:00:37,888 Sometimes called sigmoid, sometimes called logit. 13 00:00:37,888 --> 00:00:44,624 And it's a slightly scary function over here which takes the score as input, 14 00:00:44,624 --> 00:00:49,370 and says that the score, 15 00:00:49,370 --> 00:00:55,640 sorry, the output, of the sigmoid is 1 divided by 1 plus e to the minus score. 16 00:00:55,640 --> 00:00:57,296 So it shows up over here. 17 00:00:57,296 --> 00:01:00,860 And it's e to the power of something. 18 00:01:00,860 --> 00:01:02,955 I learned about that function as an undergrad. 19 00:01:02,955 --> 00:01:04,790 I didn't think it was that interesting a function, 20 00:01:04,790 --> 00:01:06,420 but it turns out to be extremely useful. 21 00:01:06,420 --> 00:01:08,960 And here we'll see an example of how it's useful. 22 00:01:08,960 --> 00:01:14,213 So, at the bottom here I'm plotting the score, which can range from minus 23 00:01:14,213 --> 00:01:18,871 infinity to plus infinity. 24 00:01:18,871 --> 00:01:22,680 And let's see what happens when you take that score and 25 00:01:22,680 --> 00:01:23,850 push it through the sigmoid. 
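For reference, the spoken formula "1 divided by 1 plus e to the minus score" can be written out as below. The notation (the name sigmoid as an operator and Score as the variable) is an assumption chosen to match the narration, not taken from the slides. Strictly speaking, "logit" usually names the inverse of this function.

```latex
\[
\mathrm{sigmoid}(\mathrm{Score}) = \frac{1}{1 + e^{-\mathrm{Score}}},
\qquad \mathrm{Score} \in (-\infty, +\infty),
\qquad \mathrm{sigmoid}(\mathrm{Score}) \in (0, 1)
\]
```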
26 00:01:24,850 --> 00:01:31,180 So, for example, if you take the score at zero, it actually hits 0.5. 27 00:01:31,180 --> 00:01:34,910 Which is cool because this is exactly what we're hoping for. 28 00:01:34,910 --> 00:01:36,850 The score is 0, the probability should be 0.5. 29 00:01:36,850 --> 00:01:39,870 So, actually let's do that explicitly. 30 00:01:39,870 --> 00:01:48,369 So, if I compute the sigmoid, the sigmoid of 0, 31 00:01:48,369 --> 00:01:54,950 that is 1 divided by 1 plus e to the 0. 32 00:01:54,950 --> 00:02:00,660 And as a little cheat sheet here at the bottom, e to the 0 is exactly 1. 33 00:02:00,660 --> 00:02:01,660 So, that's good. 34 00:02:01,660 --> 00:02:08,840 So, this is 1 divided by 1 plus 1 which is equal to 0.5. 35 00:02:08,840 --> 00:02:10,070 QED. 36 00:02:10,070 --> 00:02:15,550 So, now we have that if a score of 0 is the input, you get an output of 0.5. 37 00:02:15,550 --> 00:02:17,590 That's super exciting. 38 00:02:17,590 --> 00:02:20,000 So, let's look at the positive end of the spectrum. 39 00:02:20,000 --> 00:02:24,535 You see that the curve keeps going up and up and up, and eventually it hits 1. 40 00:02:26,700 --> 00:02:32,180 So, the score of plus infinity, which is somewhere out here, 41 00:02:32,180 --> 00:02:36,110 turns out to be, sorry, the sigmoid of plus infinity, turns out to be 1. 42 00:02:36,110 --> 00:02:38,520 Which, again, is what we wanted. 43 00:02:38,520 --> 00:02:41,380 If the score's plus infinity, we want a probability of 1. 44 00:02:41,380 --> 00:02:43,040 So, let's actually do that. 45 00:02:43,040 --> 00:02:45,820 So, let's see what happens with the sigmoid of plus infinity. 46 00:02:46,960 --> 00:02:54,830 That's 1 over 1 plus e to the power of minus infinity. 47 00:02:54,830 --> 00:02:58,106 And cheat sheet here, 48 00:02:58,106 --> 00:03:03,632 e to the minus infinity is equal to 0. 49 00:03:03,632 --> 00:03:08,230 And so, this thing here directly gets you the output 1. 
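The two calculations just narrated can be laid out step by step as follows. This is a sketch in assumed notation (sigmoid and Score as symbols, and the `align*` environment, which needs the amsmath package). The plus-infinity case is written as a limit, since for any finite score the curve gets arbitrarily close to 1 without reaching it.

```latex
\begin{align*}
\mathrm{sigmoid}(0) &= \frac{1}{1 + e^{-0}} = \frac{1}{1 + 1} = 0.5 \\
\lim_{\mathrm{Score} \to +\infty} \mathrm{sigmoid}(\mathrm{Score})
  &= \lim_{\mathrm{Score} \to +\infty} \frac{1}{1 + e^{-\mathrm{Score}}}
   = \frac{1}{1 + 0} = 1
\end{align*}
```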
50 00:03:08,230 --> 00:03:10,930 Okay, let's go to the other extreme. 51 00:03:10,930 --> 00:03:14,320 Each of them, let's look at minus infinity. 52 00:03:14,320 --> 00:03:18,390 So, if the score is minus infinity, as you can see down here, 53 00:03:18,390 --> 00:03:21,830 it looks like you hit 0 there, and that's exactly what you want. 54 00:03:21,830 --> 00:03:23,810 If the score is very negative, 55 00:03:23,810 --> 00:03:28,120 then you want the probability that y equals plus 1 to be 0. 56 00:03:28,120 --> 00:03:29,960 And we can plug it into here. 57 00:03:29,960 --> 00:03:36,400 1 divided by 1 plus e to the minus minus infinity, which is e to the infinity. 58 00:03:36,400 --> 00:03:41,920 In the cheat sheet down here, e to the infinity is infinity. 59 00:03:41,920 --> 00:03:46,110 And so, this is equal to 1 over 1 plus infinity. 60 00:03:46,110 --> 00:03:51,730 1 over 1 plus infinity is 0. 61 00:03:51,730 --> 00:03:53,460 Exactly what we'd want. 62 00:03:53,460 --> 00:03:57,950 So, the sigmoid has this property that it goes from 63 00:03:57,950 --> 00:04:02,900 0 to 0.5 to 1 really in the way we want. 64 00:04:02,900 --> 00:04:05,800 Now, what really is important here is the places in between. 65 00:04:07,050 --> 00:04:11,608 So, for example, if the score is 2, 66 00:04:11,608 --> 00:04:17,744 we'll see that we hit 0.88 over here. 67 00:04:17,744 --> 00:04:20,472 And if the score were minus 2, 68 00:04:23,430 --> 00:04:27,168 [COUGH] we have 0.12. 69 00:04:27,168 --> 00:04:32,290 It's a symmetric function that ranges from 0 to 1. 70 00:04:32,290 --> 00:04:38,335 And so, it provides exactly the mapping from minus 71 00:04:38,335 --> 00:04:43,825 infinity to infinity to the interval 0, 1. 72 00:04:43,825 --> 00:04:48,509 [MUSIC]
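The values quoted in the lecture can be checked numerically with a few lines of Python. This is a minimal sketch, not code from the course: the function name `sigmoid` and the two-branch form are assumptions made here, the latter so that `math.exp` never receives a large positive argument and overflows.

```python
import math


def sigmoid(score: float) -> float:
    """Logistic function 1 / (1 + e^(-score)), written to avoid overflow."""
    if score >= 0:
        # e^(-score) is at most 1 here, so the direct formula is safe.
        return 1.0 / (1.0 + math.exp(-score))
    # For negative scores use the algebraically equal form e^score / (1 + e^score).
    z = math.exp(score)
    return z / (1.0 + z)


# Scores discussed in the lecture: the two extremes, zero, and plus/minus 2.
for score in (-math.inf, -2.0, 0.0, 2.0, math.inf):
    print(f"sigmoid({score}) = {sigmoid(score):.2f}")
```

Rounded to two decimals, the five scores give 0.00, 0.12, 0.50, 0.88 and 1.00. The symmetry mentioned at the end is the identity sigmoid(-s) = 1 - sigmoid(s), which is why 0.12 and 0.88 add up to 1.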