We talked about accuracy and the errors a classifier might make. But there are different kinds of errors, and those different kinds are what we call the types of mistakes. It's important to look at the types of mistakes a classifier might make, and one way to do that is through what's called a confusion matrix. So let's dig into that a little bit.

We're talking about the relationship between the true label and whatever the classifier predicts, the predicted label. If the true label is positive and we predict a positive value for that sentence, we call that a true positive, because we got it right. Similarly, if the true label is negative and we predict negative, we call that a true negative. That's good, because we got that right too.

Now, there are two kinds of mistakes we can make. If the true label is positive but we predict negative, we call that a false negative: we said it was negative, but that was false, because it's actually positive. Similarly, if the true label is negative but we predict positive, we call that a false positive: it was negative, but we predicted it as positive.
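To make those four categories concrete, here is a minimal sketch in Python. The label lists and the +1/-1 encoding are just illustrative, not from the lecture; the sketch simply counts each kind of outcome by comparing true and predicted labels.

```python
# Hypothetical true and predicted labels for a few sentences:
# +1 = positive sentiment, -1 = negative sentiment.
y_true = [+1, +1, -1, -1, +1, -1]
y_pred = [+1, -1, -1, +1, +1, -1]

true_pos  = sum(1 for t, p in zip(y_true, y_pred) if t == +1 and p == +1)
true_neg  = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == +1 and p == -1)  # positives we missed
false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == +1)  # negatives we wrongly flagged

print(true_pos, true_neg, false_neg, false_pos)  # 2 2 1 1
```

Accuracy would then be (true_pos + true_neg) divided by the total number of examples.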
False positives and false negatives can have different impacts on what happens in practice with your classifier, so let's look at a couple of practical examples: two applications, and what the cost of false positives versus false negatives is in each.

Consider spam filtering. A false negative is an email that was spam but went into my inbox, because the filter thought it was not spam. That's just annoying: I got another spam email in my inbox. It's bad, but not super bad. However, a false positive is an email that was not spam but got labeled as spam and went to my spam folder. I never saw it; I lost that email forever. That has a higher cost.

As a second application, we can look at medical diagnosis. What's a false negative there? A false negative means there's a disease that I have, but it didn't get detected: the classifier said it was negative, that I don't have the disease. In this case the disease goes untreated, which can be a really bad thing. But false positives can also be bad. That is, I get classified as having a disease I never had, and I potentially get treated with a harsh drug, with bad side effects, for a disease I never had. So it's a little unclear which is worse, a false positive or a false negative. In medical applications it really depends on the cost of the treatment and its side effects versus how bad the disease can be.

Now, this relationship between the true label and the predicted label, including the false positives and false negatives, is captured by the confusion matrix. Let's fill in this matrix with an example. Say we have a setting with 100 test examples, of which 60 are positive and 40 are negative, so there's a little bit of class imbalance but not too much. Of those 60 true positives, suppose I got 50 correct, and of the 40 negatives I got 35 correct. Let's see what that tells us. Out of the 100 examples I got 85 correct, so we can talk about accuracy: 85 correct over 100, which is 0.85.

We can also discuss the false positives and the false negatives. Of the positives, 10 got labeled as negative, so I had 10 false negatives. On the other hand, of the true negatives, 5 got labeled as positive, so I had 5 false positives. So in this example we got 85% accuracy, with a higher false negative rate than false positive rate.
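If you want to reproduce this worked example in code, here is a small sketch assuming scikit-learn is available; the synthetic label lists are constructed purely to match the counts above.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# 60 true positives: 50 predicted correctly, 10 false negatives.
# 40 true negatives: 35 predicted correctly, 5 false positives.
y_true = [1] * 60 + [0] * 40
y_pred = [1] * 50 + [0] * 10 + [0] * 35 + [1] * 5

print(confusion_matrix(y_true, y_pred))
# Rows are the true label (0 then 1), columns are the predicted label:
# [[35  5]   <- true negatives, false positives
#  [10 50]]  <- false negatives, true positives
print(accuracy_score(y_true, y_pred))  # 0.85
```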
Now, those terms, false positive and false negative, only apply to binary classification with two classes. But the idea of a confusion matrix works just as well when you have more classes, so let's walk through a simple example of that. Say I have 100 test examples, and this is for medical diagnosis, so there are 3 classes: healthy, cold, or flu. Of the 100 test subjects, 70 were healthy, 20 had a cold, and 10 had the flu. And let's suppose we got 60 correct for healthy, 12 correct for cold, and 8 correct for flu.

So our total accuracy here was 80 correct, which is 60 plus 12 plus 8, divided by 100. That's 0.8, or 80% accuracy. But we can also talk about the false predictions. For healthy, there were ten mistakes, and we can say it's more common to confuse being healthy with having a cold than with having the flu, because the flu is a more distinctive disease. So of those ten mistakes, eight were confused with a cold and two were confused with the flu. A cold can go both ways: we made eight mistakes there, and maybe half of them got confused with healthy while half got diagnosed with something stronger, the flu. For the flu, with only two mistakes, maybe we made no mistakes of one kind: nobody who came in with the flu was told they were healthy. But two of those ten flu cases were thought to have just a cold and not the flu.

So this is an example of a confusion matrix. It lets us really understand the types of mistakes we made and interpret them, and that is a really important thing to do in classification.
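For the three-class case, here is a minimal sketch assuming NumPy and the particular error breakdown described above. It stores the counts as a matrix with rows for the true class and columns for the predicted class, then reads the accuracy and the per-class mistakes off it.

```python
import numpy as np

classes = ["healthy", "cold", "flu"]

# Rows are the true class, columns are the predicted class,
# using the counts from the worked example above.
#                      predicted: healthy cold flu
confusion = np.array([[60,  8, 2],   # true healthy (70 total)
                      [ 4, 12, 4],   # true cold    (20 total)
                      [ 0,  2, 8]])  # true flu     (10 total)

total = confusion.sum()                 # 100 test examples
accuracy = np.trace(confusion) / total  # (60 + 12 + 8) / 100 = 0.8
print(accuracy)

# Off-diagonal entries in each row show where that class was mis-predicted.
for i, name in enumerate(classes):
    mistakes = confusion[i].sum() - confusion[i, i]
    print(f"{name}: {mistakes} mistakes out of {confusion[i].sum()}")
```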