1
00:00:00,000 --> 00:00:04,493
[MUSIC]

2
00:00:04,493 --> 00:00:08,749
Our next step is to evaluate

3
00:00:08,749 --> 00:00:15,530
the sentiment model that we've built.

4
00:00:15,530 --> 00:00:21,030
Now, we just talked in this module
about classification error,

5
00:00:21,030 --> 00:00:24,298
precision, sorry,
false positives and false negatives.

6
00:00:24,298 --> 00:00:27,170
And we're gonna explore
that idea right here

7
00:00:27,170 --> 00:00:29,254
in this demonstration in this notebook.

8
00:00:29,254 --> 00:00:36,190
So the sentiment model has
a function called Evaluate.

9
00:00:36,190 --> 00:00:42,140
And that function Evaluate allows you to
evaluate its quality on some test data.

10
00:00:42,140 --> 00:00:46,108
And we're gonna provide
a particular metric and

11
00:00:46,108 --> 00:00:49,710
this metric is called the roc_curve.

12
00:00:49,710 --> 00:00:53,098
And we're gonna learn a little bit
more about the roc_curve next.

13
00:00:53,098 --> 00:00:56,960
But the roc_curve is a way to
explore the false positives and

14
00:00:56,960 --> 00:01:01,840
false negatives in that confusion
matrix that we discussed.

15
00:01:01,840 --> 00:01:06,750
Now, here is, it shows you the results of
evaluations that are hard to see in text.

16
00:01:06,750 --> 00:01:10,090
So let's do a little visualization of it,
again, using Canvas.

17
00:01:10,090 --> 00:01:14,189
So we can use sentiment_model.show, and

18
00:01:14,189 --> 00:01:18,841
we're gonna show the view
that we're gonna use,

19
00:01:18,841 --> 00:01:22,180
is going to be the Evaluation view.

20
00:01:23,990 --> 00:01:25,000
And here we go.

21
00:01:27,390 --> 00:01:30,500
Okay, this is really cool.

22
00:01:30,500 --> 00:01:32,502
We've built a few things.

23
00:01:32,502 --> 00:01:39,762
So this is a precision recall, I'm sorry,
this is what is called an roc_curve,

24
00:01:39,762 --> 00:01:46,180
it's a curve that trades off false
positives with true positives.

25
00:01:46,180 --> 00:01:47,890
Let me explain that a little bit.

26
00:01:47,890 --> 00:01:50,300
But first,
let's look at these numbers over here.

27
00:01:50,300 --> 00:01:52,960
So this is the confusion matrix.

28
00:01:52,960 --> 00:01:54,330
The number of true positives,

29
00:01:54,330 --> 00:01:59,950
the things that we got right
that were true, is 26,455.

30
00:01:59,950 --> 00:02:01,090
So right here.

31
00:02:01,090 --> 00:02:05,995
The number of true negatives
was only 4,000, so 3,965.

32
00:02:05,995 --> 00:02:09,500
So this is a highly unbalanced case.

33
00:02:09,500 --> 00:02:11,520
And then,
the number of false positives and

34
00:02:11,520 --> 00:02:13,370
the number of false
negatives is about the same.

35
00:02:13,370 --> 00:02:20,640
The overall accuracy was 91.1%, so 0.911.

36
00:02:20,640 --> 00:02:24,151
And also discuss a few metrics like
precision recall and false count

37
00:02:24,151 --> 00:02:29,410
which we're gonna learn more about later
in the course and later in specialization.

38
00:02:29,410 --> 00:02:34,666
Now, the thing about the false
positives and false negatives is that

39
00:02:34,666 --> 00:02:39,920
there's a very natural way to change
the threshold of what we believe

40
00:02:39,920 --> 00:02:44,918
the transition from negative class
to positive class should be.

41
00:02:44,918 --> 00:02:48,306
So this is what the threshold
shows on the right, but

42
00:02:48,306 --> 00:02:51,580
let me first show it on the curve here.

43
00:02:51,580 --> 00:02:54,300
So, you can see it is possible for
me to get, for

44
00:02:54,300 --> 00:02:58,285
example, a very high true positive rate.

45
00:02:58,285 --> 00:03:01,646
So .4, if I, sorry, a very low .4.

46
00:03:01,646 --> 00:03:06,760
If I dont' allow myself to
get any false positives.

47
00:03:06,760 --> 00:03:10,500
So, if I'm very worried about false
positives that can be very conservative,

48
00:03:10,500 --> 00:03:11,930
get no false positives.

49
00:03:11,930 --> 00:03:16,790
But again not capture many of the true
positives and make other mistakes.

50
00:03:16,790 --> 00:03:17,610
And you see.

51
00:03:17,610 --> 00:03:19,580
As you go through the curve
there's a kink here.

52
00:03:19,580 --> 00:03:23,930
This is the point where you get
not too many false positives but

53
00:03:23,930 --> 00:03:25,780
a lot of true positives.

54
00:03:25,780 --> 00:03:31,650
And then over on this end
you get every true positive.

55
00:03:31,650 --> 00:03:33,520
But you get a lot of false positives.

56
00:03:33,520 --> 00:03:36,330
How do you get every true positive but
get less false positives?

57
00:03:36,330 --> 00:03:38,870
You just say every data point is positive.

58
00:03:38,870 --> 00:03:41,910
Then you get all the true positives,
but you also make a lot of mistakes.

59
00:03:41,910 --> 00:03:45,170
And, we'll discuss more of this when
we talk about precision recall curve.

60
00:03:45,170 --> 00:03:49,390
But, nicely, we can change the threshold
over here, and walk that curve up there.

61
00:03:50,420 --> 00:03:55,030
As you can see, and so for example,
the far end, on the right,

62
00:03:55,030 --> 00:03:57,860
you see that the true
positive rate is really high.

63
00:03:57,860 --> 00:04:02,160
There's no false positives, but
that there's no true negatives.

64
00:04:02,160 --> 00:04:06,120
So there's no false negatives, no true
negatives, you just got everything right.

65
00:04:06,120 --> 00:04:10,210
And then you can slide
it just the other way.

66
00:04:10,210 --> 00:04:11,980
So you can play with this and

67
00:04:11,980 --> 00:04:15,350
really kind of get a good sense of what
this curve really means, the roc_curve.

68
00:04:15,350 --> 00:04:17,358
And that's the evaluation
that we're gonna be doing.

69
00:04:20,522 --> 00:04:24,569
[MUSIC]