[MUSIC]

So, we've now built a whole new kind of restaurant experience, or restaurant review experience, using classifiers. Let's dig in and really understand a little bit more what a classifier is, and some other applications of classification.

A classifier takes some input x, for example a sentence from a review, or other inputs as we'll see. It pushes that input through what's called a model to output some value y that we're trying to predict. Here, y is a class, for example positive or negative. Positive, in the case of sentiment analysis, corresponds to thumbs-up reviews, while negative corresponds to thumbs-down reviews.

But this is just one example of classification. You can look at text, for example a web page, and figure out which web pages interest you by assigning them to categories. For example, is this a page about education, a page about finance, a page about technology, and so on? So there aren't just two categories; there can be three, four, or even thousands of categories to predict from.

Now, another example of classification, one that has really impacted all of our lives, is spam filtering. Some of you might remember, perhaps from the early 2000s, what spam filters were like.
The quality was not very good. They were really all hand-tuned systems, where somebody said, oh, it contains certain words, it must be spam. But the spammers kept changing the words a little bit, substituting numbers for letters, and beating the spam filters. What really changed the world of spam filtering, changed it so much that I don't even look at my spam folder anymore (sorry if your message went to my spam folder, I just don't open it), is machine learning. It's classifiers. They take the input x, the email, and feed it through a classifier that predicts whether or not it's spam, and they do that really well. And they do it by looking not just at the text of the email, but at other characteristics. For example, who sent it: if it's a close friend, or somebody you communicate with a lot, it's less likely to be spam. Or the IP address: is the person sending from their usual computer? And so on. Lots of information.

So this is another really interesting practical application.

In computer vision, we do a lot of classification. We take an image and figure out what is in that image. Here, for example, the input x is the pixels of the image.
We feed it to a classifier, and we're going to predict things like: is this a dog? In fact, is it a Labrador Retriever, a Golden Retriever, or a different kind of dog? This is actually my dog, a Labrador Retriever. And as we will see later on in the deep learning module, there are really interesting new ways of doing this with very high accuracy.

Now, you can also use classification in medical diagnosis systems, and in fact this is what your doctor does. They take your temperature, maybe they look at your x-ray, they look at some medical tests, and they try to make a prediction about what's ailing somebody. Maybe they say, no, you're just healthy, or you have a cold, you have the flu, maybe even pneumonia. The disease is the variable y that's being predicted. Now, these days there are really interesting new developments around personalized medicine, because the prediction doesn't have to depend just on the standard measurements; it can be really personalized for me. It can depend on my particular DNA sequencing, which is pretty exciting, and also on my lifestyle, which maybe looks something like this, or, more realistically, something like this. And so, given all these measurements, we can make an even better prediction of what's ailing me.
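The pipeline described throughout this lecture, an input x pushed through a model to produce a predicted class y, can be sketched in a few lines of code. This is a toy illustration only: the word lists and the word-counting "model" are made-up stand-ins, not how a real sentiment classifier is built.

```python
# Toy sketch of a classifier: input x (a review sentence) goes
# through a "model" and comes out as a predicted class y
# (positive or negative). The word lists below are invented
# for illustration, not a real sentiment lexicon.

POSITIVE_WORDS = {"great", "awesome", "delicious", "amazing", "good"}
NEGATIVE_WORDS = {"terrible", "awful", "bland", "bad", "disgusting"}

def classify_sentiment(sentence):
    """The 'model': count positive vs. negative words in x and
    output the class y with the higher count (ties go positive)."""
    words = sentence.lower().split()
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    return "positive" if pos >= neg else "negative"

print(classify_sentiment("The sushi was awesome and the service was great"))   # → positive
print(classify_sentiment("The food was bland and the wait was terrible"))      # → negative
```

A real classifier would learn its model from labeled training data rather than use hand-tuned word lists; as the lecture notes about early spam filters, hand-tuned rules are brittle and easy to defeat.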
Now, this idea of classification in machine learning has really gone much further, even to being able to read your mind. When I was at Carnegie Mellon, Tom Mitchell, one of my friends and colleagues, had his office next door, and he came to me and said, we've done this amazing thing: we can take an image of your brain using a technology called fMRI, which is a brain scan, and predict, while you're reading a word of text, whether you're reading the word "hammer" or the word "house". So it's really reading your mind. And in fact, he went on to do many interesting things. For example, if you're looking at a picture of a hammer or a house, but you trained the classifier on you reading the words "hammer" and "house", you're still able to read your mind and figure out which picture you're looking at. So this is yet the next frontier of classification: understanding how the brain works.

[MUSIC]