1 00:00:00,028 --> 00:00:04,908 [MUSIC] 2 00:00:04,908 --> 00:00:09,331 This course addresses classification, which is one of the most widely used, 3 00:00:09,331 --> 00:00:12,038 most fundamental areas of machine learning. 4 00:00:12,038 --> 00:00:13,640 If you understand classifiers, 5 00:00:13,640 --> 00:00:17,646 you'll also understand basically the rest of machine learning, and the techniques 6 00:00:17,646 --> 00:00:20,980 we use here are what most people in the industry need to be successful. 7 00:00:22,060 --> 00:00:25,710 We discussed how machine learning is about input data 8 00:00:25,710 --> 00:00:28,380 which can be pushed through some machine learning algorithm 9 00:00:28,380 --> 00:00:32,370 which outputs what we think of as intelligence derived from the data. 10 00:00:32,370 --> 00:00:34,240 In this course, we're going to build classifiers, so 11 00:00:34,240 --> 00:00:39,205 a classifier takes as input some x, or some features of our data. 12 00:00:39,205 --> 00:00:43,535 And as output makes a prediction, which is a discrete class or 13 00:00:43,535 --> 00:00:46,105 category or label for the data. 14 00:00:46,105 --> 00:00:49,425 And we're going to see a ton of different examples of how this is used 15 00:00:49,425 --> 00:00:50,585 in practice. 16 00:00:50,585 --> 00:00:56,545 The goal of a classifier is to learn a mapping from the input x to the output y, 17 00:00:56,545 --> 00:00:58,210 those classes. 18 00:00:58,210 --> 00:01:01,990 The example that we discussed in the first course was a sentiment classifier, 19 00:01:01,990 --> 00:01:06,960 where we're given an input sentence x, like "easily the best sushi in Seattle." 20 00:01:06,960 --> 00:01:10,050 We fed that through the sentiment classifier, 21 00:01:10,050 --> 00:01:12,700 which then told us an output y. 22 00:01:12,700 --> 00:01:16,110 That was either yea, that is a positive sentence, 23 00:01:16,110 --> 00:01:18,560 or nay, that is a negative sentence.
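The sentiment example above can be sketched in a few lines of code. This is a minimal illustration of a classifier as a mapping from an input x to a discrete label y; the word lists and scoring rule are invented for illustration and are not the model the course builds:

```python
# A toy classifier: map an input sentence x to a discrete label y.
# The word lists below are made-up assumptions, not a trained model.
POSITIVE = {"best", "great", "good", "amazing", "easily"}
NEGATIVE = {"bad", "worst", "awful", "terrible"}

def classify_sentiment(sentence: str) -> str:
    """Predict a discrete class y ('positive' or 'negative') from input x."""
    words = sentence.lower().split()
    # Score = positive-word count minus negative-word count.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score >= 0 else "negative"

print(classify_sentiment("easily the best sushi in Seattle"))  # positive
print(classify_sentiment("the worst sushi, awful service"))    # negative
```

Real classifiers learn this mapping from data rather than from hand-written word lists, but the input/output shape is the same.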
24 00:01:18,560 --> 00:01:22,900 And we can use these sentences, these predictions, in a wide range of ways, 25 00:01:22,900 --> 00:01:23,730 as we'll see soon. 26 00:01:24,760 --> 00:01:27,880 A general classifier is about taking some input x, 27 00:01:27,880 --> 00:01:32,820 pushing it through some model, which predicts what y might be, for 28 00:01:32,820 --> 00:01:37,630 example, one of two classes, say, positive or negative, or, 29 00:01:37,630 --> 00:01:43,430 as we will see, one of three, four, or more categories. 30 00:01:44,940 --> 00:01:47,530 Let's suppose, for example, I have a web page, and 31 00:01:47,530 --> 00:01:50,300 I want to figure out what ads to show on this web page. 32 00:01:50,300 --> 00:01:52,830 So I need to figure out what this web page is about. 33 00:01:52,830 --> 00:01:56,932 The goal here is to take the text of the web page and categorize it automatically: 34 00:01:56,932 --> 00:02:01,076 whether it's an educational site, so we need educational-type ads; 35 00:02:01,076 --> 00:02:04,290 whether it's a site about finance or an article about finance, and 36 00:02:04,290 --> 00:02:06,190 we need that kind of ad; 37 00:02:06,190 --> 00:02:08,410 or one about technology, and so on. 38 00:02:08,410 --> 00:02:11,490 So classification is not just binary, positive or negative, but 39 00:02:11,490 --> 00:02:14,350 it can be one of multiple categories, or multiple classes. 40 00:02:15,620 --> 00:02:19,050 Perhaps the most common type of classifier that we see every day, 41 00:02:19,050 --> 00:02:22,530 every time we open up our email, is the famous spam filter. 42 00:02:22,530 --> 00:02:26,380 So every time an email arrives, the spam filter 43 00:02:26,380 --> 00:02:31,440 makes a prediction as to whether this is a spam email that should be ignored, or not spam.
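A spam filter in its simplest, text-only keyword form can be sketched the same way. The keyword list and threshold here are illustrative assumptions, not a real filter:

```python
# A toy text-only spam filter: predict y in {'spam', 'not spam'}
# from the email text x alone. Keywords and threshold are made up.
SPAM_KEYWORDS = {"winner", "free", "prize", "lottery", "claim"}

def classify_email(text: str) -> str:
    """Predict 'spam' or 'not spam' from the email text alone."""
    words = text.lower().split()
    hits = sum(w in SPAM_KEYWORDS for w in words)
    # Two or more spam keywords -> flag as spam.
    return "spam" if hits >= 2 else "not spam"

print(classify_email("you are a winner claim your free prize"))  # spam
print(classify_email("meeting notes from yesterday attached"))   # not spam
```

As the lecture notes next, a prediction based on the text alone is weak; modern filters also use sender, IP address, and sender history.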
44 00:02:31,440 --> 00:02:34,920 And that prediction needs to be made based not just on the text of the email, but 45 00:02:34,920 --> 00:02:36,790 on other information we get from that email. 46 00:02:36,790 --> 00:02:40,860 Like who the sender was, what the IP address of the sent message is, 47 00:02:40,860 --> 00:02:45,120 other messages that the sender sent, and so on. And from that information, we're 48 00:02:45,120 --> 00:02:49,330 going to learn the mapping from those inputs to whether it's spam or not. 49 00:02:49,330 --> 00:02:52,000 And those spam filters have gotten so much better over the years. 50 00:02:52,000 --> 00:02:55,760 I remember, early on, we just used keyword search, 51 00:02:55,760 --> 00:02:57,950 or keyword classifiers, and they weren't very good. 52 00:02:57,950 --> 00:03:01,080 But today I don't even check my spam folder anymore. 53 00:03:01,080 --> 00:03:04,930 So if you sent me an email and I didn't open it, maybe it's in my spam folder. 54 00:03:04,930 --> 00:03:05,430 Sorry. 55 00:03:07,470 --> 00:03:09,090 We can build all sorts of classifiers, though. 56 00:03:09,090 --> 00:03:10,900 We can use, for example, image data. 57 00:03:10,900 --> 00:03:15,340 So given this particular input, my dog, the image pixels, 58 00:03:15,340 --> 00:03:18,290 I want to make a prediction from a certain category. 59 00:03:18,290 --> 00:03:22,180 So from the famous ImageNet data set, 60 00:03:22,180 --> 00:03:24,640 there's a thousand different categories you might want to predict. 61 00:03:24,640 --> 00:03:25,200 So for example, 62 00:03:25,200 --> 00:03:28,850 you might want to know if it's a Labrador retriever, a golden retriever, and so on. 63 00:03:28,850 --> 00:03:32,100 What kind of dog it is, and that's the output label y that we might want. 64 00:03:33,440 --> 00:03:36,750 Now the idea of classifiers can be extremely useful for 65 00:03:36,750 --> 00:03:38,830 a wide range of domains.
66 00:03:38,830 --> 00:03:42,770 One that I'm particularly excited about is the area of personalized medicine, 67 00:03:42,770 --> 00:03:45,100 which I think is going to change the world. 68 00:03:45,100 --> 00:03:49,730 So today, if I don't feel so well, I might put a thermometer under my arm and 69 00:03:49,730 --> 00:03:52,590 check my temperature, or a doctor might order 70 00:03:52,590 --> 00:03:57,760 an X-ray to see what's going on in my chest, or maybe run some lab tests. 71 00:03:57,760 --> 00:04:01,570 And that information goes through some classifier, 72 00:04:01,570 --> 00:04:05,090 which maybe is in the doctor's head or maybe is an automated system, 73 00:04:05,090 --> 00:04:09,840 that tries to make a prediction as to what condition I might have. 74 00:04:09,840 --> 00:04:13,190 But what's annoying about how medicine is done today 75 00:04:13,190 --> 00:04:16,960 is that based on the same symptoms, we make the same predictions for me or for 76 00:04:16,960 --> 00:04:20,310 you, independent of the fact that we're really different people. 77 00:04:20,310 --> 00:04:23,780 Personalized medicine aims to totally change that. 78 00:04:23,780 --> 00:04:28,950 So it's going to look at our DNA sequences, because we're genetically different, and 79 00:04:28,950 --> 00:04:31,550 find a good treatment for each one of us. 80 00:04:31,550 --> 00:04:35,320 And maybe even look at our lifestyle, which might say something about what I'm 81 00:04:35,320 --> 00:04:36,400 prone to. 82 00:04:36,400 --> 00:04:40,130 Maybe that's your lifestyle; maybe my lifestyle is more like this. 83 00:04:40,130 --> 00:04:45,620 And so based on that kind of information, we can predict what condition I have and 84 00:04:45,620 --> 00:04:48,500 what treatment is going to be the most effective for me. 85 00:04:48,500 --> 00:04:51,140 And that's an example of classification in the real world.
86 00:04:52,210 --> 00:04:56,680 Perhaps one of the most fun and surprising examples of classification 87 00:04:56,680 --> 00:05:01,110 is work that one of my colleagues, Tom Mitchell, did, which is pretty amazing. 88 00:05:01,110 --> 00:05:06,220 You take a scan of your brain as you look at a word, and 89 00:05:06,220 --> 00:05:08,880 based on that image, from what's called an fMRI, 90 00:05:09,900 --> 00:05:14,370 he can make a prediction as to what kind of word you're reading. 91 00:05:14,370 --> 00:05:18,110 So for example, based on the image of your brain, it can predict if you're reading, 92 00:05:18,110 --> 00:05:22,100 say, the word hammer or the word house. 93 00:05:22,100 --> 00:05:24,070 Which is basically reading your mind. 94 00:05:24,070 --> 00:05:28,960 And I've been talking to Tom for a long time about this topic. 95 00:05:28,960 --> 00:05:30,820 More than ten, fifteen years. 96 00:05:30,820 --> 00:05:33,930 And over that time, the kinds of results they have 97 00:05:33,930 --> 00:05:36,300 had evolved from very basic things 98 00:05:36,300 --> 00:05:37,360 to amazing things. 99 00:05:37,360 --> 00:05:38,960 So, for example, today 100 00:05:38,960 --> 00:05:44,090 they can train a classifier on your brain images based on words that you read, and 101 00:05:44,090 --> 00:05:49,330 then use it to predict something from my brain images based on pictures that I see. 102 00:05:49,330 --> 00:05:52,060 So a picture of a hammer instead of the actual word hammer. 103 00:05:52,060 --> 00:05:55,590 And that is an incredible kind of evolution, 104 00:05:55,590 --> 00:05:58,840 an incredible kind of analysis that you can do from brain data. 105 00:05:58,840 --> 00:06:01,688 A really, really cool example of classification. 106 00:06:01,688 --> 00:06:03,948 Reading your mind. 107 00:06:03,948 --> 00:06:07,939 [MUSIC]