1
00:00:01,093 --> 00:00:04,133
[MUSIC]

2
00:00:04,133 --> 00:00:08,337
In the regression model, we talked
about predicting house prices and

3
00:00:08,337 --> 00:00:10,689
fitting a regression model for that and

4
00:00:10,689 --> 00:00:14,070
we measured the error in
terms of sum-squared errors.

5
00:00:14,070 --> 00:00:18,023
Here in classification, our errors
are a little different because we

6
00:00:18,023 --> 00:00:23,100
are talking about what inputs we get
correct and which inputs we get wrong.

7
00:00:23,100 --> 00:00:26,360
So let's talk a little bit about
measuring error in classification.

8
00:00:26,360 --> 00:00:30,320
So when I learn a classifier,
I'm given a set of input data.

9
00:00:30,320 --> 00:00:33,510
So these are sentences that have been
marked to say positive or negative

10
00:00:33,510 --> 00:00:38,140
sentiment, and as in regression, we split
it into a training set and a test set.

11
00:00:39,630 --> 00:00:43,390
I feed the training set to
the classifier I'm trying to learn and

12
00:00:43,390 --> 00:00:46,560
that algorithm is actually going
to learn the weights for words.

13
00:00:46,560 --> 00:00:48,679
So for example it's going to
learn that good has a weight 1.0.

14
00:00:48,679 --> 00:00:51,341
Awesome, 1.7.

15
00:00:51,341 --> 00:00:52,360
Bad, -1.0.

16
00:00:52,360 --> 00:00:54,844
And awful, -3.3.

17
00:00:56,100 --> 00:01:00,230
And then, these weights are going to
be used to score every element in

18
00:01:00,230 --> 00:01:05,190
the test set and evaluate how good
we're doing in terms of classification.

19
00:01:05,190 --> 00:01:07,930
So lets talk about what
that evaluation looks like.

20
00:01:07,930 --> 00:01:10,920
Let's discuss how we measure error,
in fact,

21
00:01:10,920 --> 00:01:13,919
classification error,
when we're doing this classification.

22
00:01:14,962 --> 00:01:17,820
So we're getting a set of
test examples of the form,

23
00:01:17,820 --> 00:01:22,750
sushi was great, is a positive sentence,
and we're trying to figure out how many

24
00:01:22,750 --> 00:01:27,490
of these test sentences we get correct and
how many do we get make mistakes on.

25
00:01:27,490 --> 00:01:31,740
So what we are going to do is take
the sentence sushi is great and

26
00:01:31,740 --> 00:01:34,620
feed it through the classifier,
through the learned classifier.

27
00:01:34,620 --> 00:01:38,530
But we don't want the learned classifier
to actually see the true label.

28
00:01:38,530 --> 00:01:40,670
We're gonna see if it gets
the true label right.

29
00:01:40,670 --> 00:01:42,440
So we're gonna hide that true label.

30
00:01:42,440 --> 00:01:45,960
So the sentence gets fed to the learned
classifier while the true label is hidden.

31
00:01:47,190 --> 00:01:51,500
And now given the sentence, we're
gonna predict y hat as being positive.

32
00:01:51,500 --> 00:01:56,260
So we leave this as a positive sentence
and so, we've made a correct prediction.

33
00:01:56,260 --> 00:01:59,160
So the number of correct
sentences goes up by one.

34
00:01:59,160 --> 00:02:03,100
Now let's take another sentence,
another test example.

35
00:02:03,100 --> 00:02:08,290
So let's say you say the food
was okay as a negative sentence.

36
00:02:08,290 --> 00:02:11,960
So that's a bit of
a ambiguous sentence but

37
00:02:11,960 --> 00:02:14,950
it's been labeled as negative
in the training set.

38
00:02:14,950 --> 00:02:19,445
So now I feed the sentence to
the classifier, I hide the label.

39
00:02:19,445 --> 00:02:21,420
And let's see what the classifier does.

40
00:02:21,420 --> 00:02:24,699
In this case, cuz the food was
okay can be revealed as positive,

41
00:02:24,699 --> 00:02:28,587
maybe it makes a prediction that this
is positive sentence I made a mistake,

42
00:02:28,587 --> 00:02:30,620
cuz the true label is negative.

43
00:02:30,620 --> 00:02:32,769
So say hey, mistake was made.

44
00:02:33,900 --> 00:02:36,100
We now have one more mistake.

45
00:02:36,100 --> 00:02:39,500
So we have one correct classification and
one mistake.

46
00:02:39,500 --> 00:02:43,190
Now, we do this for
every sentence in the corpus.

47
00:02:43,190 --> 00:02:48,040
There are two common measures
of quality in classification.

48
00:02:48,040 --> 00:02:51,690
So for example,
one of them is the notion of error.

49
00:02:51,690 --> 00:02:56,980
So error measures, the fraction of the
test examples that we make mistakes on.

50
00:02:56,980 --> 00:03:02,420
So what we just do is say, out of all
of the sentences that are classified,

51
00:03:02,420 --> 00:03:06,690
how many mistakes there are made,
so number of mistakes, and

52
00:03:06,690 --> 00:03:12,520
I divide that by the total
number of test sentences.

53
00:03:12,520 --> 00:03:15,967
So for example if there
were 100 test sentences and

54
00:03:15,967 --> 00:03:19,880
I made ten mistakes then our
error would be 0.1 or 10%.

55
00:03:19,880 --> 00:03:23,973
Now the best possible error that
I can make is zero basically,

56
00:03:23,973 --> 00:03:25,320
I make no mistakes.

57
00:03:26,440 --> 00:03:29,400
Now, it's common to instead
of talk about error,

58
00:03:29,400 --> 00:03:32,340
to also talk about accuracy
of your classifier.

59
00:03:32,340 --> 00:03:34,940
So accuracy is exactly opposite of that.

60
00:03:34,940 --> 00:03:38,750
So, in accuracy,
instead of measuring the number of errors,

61
00:03:38,750 --> 00:03:41,790
we measure the number of
correct classifications.

62
00:03:41,790 --> 00:03:47,787
So the ratio here is number

63
00:03:47,787 --> 00:03:54,056
of correct divided by total

64
00:03:54,056 --> 00:03:59,250
number of sentences.

65
00:03:59,250 --> 00:04:03,250
And unlike error where the best possible
value is zero, in terms of accuracy,

66
00:04:03,250 --> 00:04:06,380
the best possible value is 1,
I've got all the sentences right.

67
00:04:07,950 --> 00:04:11,470
And in fact there's a really natural
relationship between the two.

68
00:04:11,470 --> 00:04:16,500
We know that error = 1-

69
00:04:16,500 --> 00:04:22,256
accuracy, and vise versa.

70
00:04:22,256 --> 00:04:22,756
[MUSIC]