1 00:00:00,028 --> 00:00:04,908 [MUSIC] 2 00:00:04,908 --> 00:00:09,331 This course addresses classification, which is one of the most widely used, 3 00:00:09,331 --> 00:00:12,038 most fundamental areas of machine learning. 4 00:00:12,038 --> 00:00:13,640 If you understand classifiers, 5 00:00:13,640 --> 00:00:17,646 you'll also understand basically the rest of machine learning, and the techniques 6 00:00:17,646 --> 00:00:20,980 we use here are what most people in the industry need to be successful. 7 00:00:22,060 --> 00:00:25,710 We discussed how machine learning is about input data 8 00:00:25,710 --> 00:00:28,380 which can be pushed through some machine learning algorithm 9 00:00:28,380 --> 00:00:32,370 which outputs what we think of as intelligence derived from the data. 10 00:00:32,370 --> 00:00:34,240 In this course, we're going to build classifiers, so 11 00:00:34,240 --> 00:00:39,205 a classifier takes as input some x, or some features of our data. 12 00:00:39,205 --> 00:00:43,535 And as output makes a prediction, which is a discrete class or 13 00:00:43,535 --> 00:00:46,105 category or label for the data. 14 00:00:46,105 --> 00:00:49,425 And we're going to see a ton of different examples of how this is used 15 00:00:49,425 --> 00:00:50,585 in practice. 16 00:00:50,585 --> 00:00:56,545 The goal of a classifier is to learn a mapping from the input x to the output y, 17 00:00:56,545 --> 00:00:58,210 those classes. 18 00:00:58,210 --> 00:01:01,990 The example that we discussed in the first course was a sentiment classifier, 19 00:01:01,990 --> 00:01:06,960 where we're given an input sentence x, like "easily the best sushi in Seattle." 20 00:01:06,960 --> 00:01:10,050 We fed that through the sentiment classifier, 21 00:01:10,050 --> 00:01:12,700 which then told us an output y. 22 00:01:12,700 --> 00:01:16,110 That was either yea, that is a positive sentence, 23 00:01:16,110 --> 00:01:18,560 or nay, that is a negative sentence.
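The sentiment example above can be sketched in a few lines of code. This is a minimal illustration of a classifier as a mapping from an input x to a discrete label y; the word lists and scoring rule are invented for illustration and are not the model the course builds:

```python
# A toy classifier: map an input sentence x to a discrete label y.
# The word lists below are made-up assumptions, not a trained model.
POSITIVE = {"best", "great", "good", "amazing", "easily"}
NEGATIVE = {"bad", "worst", "awful", "terrible"}

def classify_sentiment(sentence: str) -> str:
    """Predict a discrete class y ('positive' or 'negative') from input x."""
    words = sentence.lower().split()
    # Score = positive-word count minus negative-word count.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score >= 0 else "negative"

print(classify_sentiment("easily the best sushi in Seattle"))  # positive
print(classify_sentiment("the worst sushi, awful service"))    # negative
```

Real classifiers learn this mapping from data rather than from hand-written word lists, but the input/output shape is the same.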
24 00:01:18,560 --> 00:01:22,900 And we can use these sentences, these predictions, in a wide range of ways, 25 00:01:22,900 --> 00:01:23,730 as we'll see soon. 26 00:01:24,760 --> 00:01:27,880 A general classifier is about taking some input x, 27 00:01:27,880 --> 00:01:32,820 pushing it through some model, which predicts what y might be, for 28 00:01:32,820 --> 00:01:37,630 example, one of two classes, say, positive or negative, or, 29 00:01:37,630 --> 00:01:43,430 as we will see, one of three, four, or more categories. 30 00:01:44,940 --> 00:01:47,530 Let's suppose, for example, I have a web page, and 31 00:01:47,530 --> 00:01:50,300 I want to figure out what ads to show on this web page. 32 00:01:50,300 --> 00:01:52,830 So I need to figure out what this web page is about. 33 00:01:52,830 --> 00:01:56,932 The goal here is to take the text of the web page and categorize it automatically: 34 00:01:56,932 --> 00:02:01,076 whether it's an educational site, so we need educational-type ads; 35 00:02:01,076 --> 00:02:04,290 whether it's a site about finance or an article about finance, and 36 00:02:04,290 --> 00:02:06,190 we need that kind of ad; 37 00:02:06,190 --> 00:02:08,410 or one about technology, and so on. 38 00:02:08,410 --> 00:02:11,490 So classification is not just binary, positive or negative, but 39 00:02:11,490 --> 00:02:14,350 it can be one of multiple categories, or multiple classes. 40 00:02:15,620 --> 00:02:19,050 Perhaps the most common type of classifier that we see every day, 41 00:02:19,050 --> 00:02:22,530 every time we open up our email, is the famous spam filter. 42 00:02:22,530 --> 00:02:26,380 So every time an email arrives, the spam filter 43 00:02:26,380 --> 00:02:31,440 makes a prediction as to whether this is a spam email that should be ignored, or not spam.
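A spam filter in its simplest, text-only keyword form can be sketched the same way. The keyword list and threshold here are illustrative assumptions, not a real filter:

```python
# A toy text-only spam filter: predict y in {'spam', 'not spam'}
# from the email text x alone. Keywords and threshold are made up.
SPAM_KEYWORDS = {"winner", "free", "prize", "lottery", "claim"}

def classify_email(text: str) -> str:
    """Predict 'spam' or 'not spam' from the email text alone."""
    words = text.lower().split()
    hits = sum(w in SPAM_KEYWORDS for w in words)
    # Two or more spam keywords -> flag as spam.
    return "spam" if hits >= 2 else "not spam"

print(classify_email("you are a winner claim your free prize"))  # spam
print(classify_email("meeting notes from yesterday attached"))   # not spam
```

As the lecture notes next, a prediction based on the text alone is weak; modern filters also use sender, IP address, and sender history.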
44 00:02:31,440 --> 00:02:34,920 And that prediction needs to be made based not just on the text of the email, but 45 00:02:34,920 --> 00:02:36,790 on other information we get from that email. 46 00:02:36,790 --> 00:02:40,860 Like who the sender was, what the IP address of the sent message is, 47 00:02:40,860 --> 00:02:45,120 other messages that the sender sent, and so on. And from that information, we're 48 00:02:45,120 --> 00:02:49,330 going to learn the mapping from those inputs to whether it's spam or not. 49 00:02:49,330 --> 00:02:52,000 And those spam filters have gotten so much better over the years. 50 00:02:52,000 --> 00:02:55,760 I remember, early on, we just used keyword search, 51 00:02:55,760 --> 00:02:57,950 or keyword classifiers, and they weren't very good. 52 00:02:57,950 --> 00:03:01,080 But today I don't even check my spam folder anymore. 53 00:03:01,080 --> 00:03:04,930 So if you sent me an email and I didn't open it, maybe it's in my spam folder. 54 00:03:04,930 --> 00:03:05,430 Sorry. 55 00:03:07,470 --> 00:03:09,090 We can build all sorts of classifiers, though. 56 00:03:09,090 --> 00:03:10,900 We can use, for example, image data. 57 00:03:10,900 --> 00:03:15,340 So given this particular input, my dog, the image pixels, 58 00:03:15,340 --> 00:03:18,290 I want to make a prediction from a certain category. 59 00:03:18,290 --> 00:03:22,180 So from the famous ImageNet data set, 60 00:03:22,180 --> 00:03:24,640 there's a thousand different categories you might want to predict. 61 00:03:24,640 --> 00:03:25,200 So for example, 62 00:03:25,200 --> 00:03:28,850 you might want to know if it's a Labrador retriever, a golden retriever, and so on. 63 00:03:28,850 --> 00:03:32,100 What kind of dog it is, and that's the output label y that we might want. 64 00:03:33,440 --> 00:03:36,750 Now the idea of classifiers can be extremely useful for 65 00:03:36,750 --> 00:03:38,830 a wide range of domains.
66 00:03:38,830 --> 00:03:42,770 One that I'm particularly excited about is the area of personalized medicine, 67 00:03:42,770 --> 00:03:45,100 which I think is going to change the world. 68 00:03:45,100 --> 00:03:49,730 So today, if I don't feel so well, I might put a thermometer under my arm and 69 00:03:49,730 --> 00:03:52,590 check my temperature, or a doctor might order 70 00:03:52,590 --> 00:03:57,760 an X-ray to see what's going on in my chest, or maybe run some lab tests. 71 00:03:57,760 --> 00:04:01,570 And that information goes through some classifier, 72 00:04:01,570 --> 00:04:05,090 which maybe is in the doctor's head or maybe is an automated system, 73 00:04:05,090 --> 00:04:09,840 that tries to make a prediction as to what condition I might have. 74 00:04:09,840 --> 00:04:13,190 But what's annoying about how medicine is done today 75 00:04:13,190 --> 00:04:16,960 is that based on the same symptoms, we make the same predictions for me or for 76 00:04:16,960 --> 00:04:20,310 you, independent of the fact that we're really different people. 77 00:04:20,310 --> 00:04:23,780 Personalized medicine aims to totally change that. 78 00:04:23,780 --> 00:04:28,950 So it's going to look at our DNA sequences, because we're genetically different, and 79 00:04:28,950 --> 00:04:31,550 find a good treatment for each one of us. 80 00:04:31,550 --> 00:04:35,320 And maybe even look at our lifestyle, which might say something about what I'm 81 00:04:35,320 --> 00:04:36,400 prone to. 82 00:04:36,400 --> 00:04:40,130 Maybe that's your lifestyle; maybe my lifestyle is more like this. 83 00:04:40,130 --> 00:04:45,620 And so based on that kind of information, we can predict what condition I have and 84 00:04:45,620 --> 00:04:48,500 what treatment is going to be the most effective for me. 85 00:04:48,500 --> 00:04:51,140 And that's an example of classification in the real world.
86 00:04:52,210 --> 00:04:56,680 Perhaps one of the most fun and surprising examples of classification 87 00:04:56,680 --> 00:05:01,110 is work that one of my colleagues, Tom Mitchell, did, which is pretty amazing. 88 00:05:01,110 --> 00:05:06,220 You take a scan of your brain as you look at a word, and 89 00:05:06,220 --> 00:05:08,880 based on that image, from what's called an fMRI, 90 00:05:09,900 --> 00:05:14,370 he can make a prediction as to what kind of word you're reading. 91 00:05:14,370 --> 00:05:18,110 So for example, based on the image of your brain, it can predict if you're reading, 92 00:05:18,110 --> 00:05:22,100 say, the word hammer or the word house. 93 00:05:22,100 --> 00:05:24,070 Which is basically reading your mind. 94 00:05:24,070 --> 00:05:28,960 And I've been talking to Tom for a long time about this topic. 95 00:05:28,960 --> 00:05:30,820 More than ten, fifteen years. 96 00:05:30,820 --> 00:05:33,930 And over that time, the kinds of results they have 97 00:05:33,930 --> 00:05:36,300 had evolved from very basic things 98 00:05:36,300 --> 00:05:37,360 to amazing things. 99 00:05:37,360 --> 00:05:38,960 So, for example, today 100 00:05:38,960 --> 00:05:44,090 they can train a classifier on your brain images based on words that you read, and 101 00:05:44,090 --> 00:05:49,330 then use it to predict something from my brain images based on pictures that I see. 102 00:05:49,330 --> 00:05:52,060 So a picture of a hammer instead of the actual word hammer. 103 00:05:52,060 --> 00:05:55,590 And that is an incredible kind of evolution, 104 00:05:55,590 --> 00:05:58,840 an incredible kind of analysis that you can do from brain data. 105 00:05:58,840 --> 00:06:01,688 A really, really cool example of classification. 106 00:06:01,688 --> 00:06:03,948 Reading your mind. 107 00:06:03,948 --> 00:06:07,939 [MUSIC]