1
00:00:00,000 --> 00:00:03,766
[MUSIC]

2
00:00:03,766 --> 00:00:08,734
Now we've seen classification in a wide
variety of settings and how we can really

3
00:00:08,734 --> 00:00:14,120
use it to predict a class, like a positive or
negative sentiment, from data.

4
00:00:14,120 --> 00:00:18,100
In the regression section,
Amy talked about this block diagram that

5
00:00:18,100 --> 00:00:22,750
really describes how a machine learning
algorithm iterates through its data.

6
00:00:22,750 --> 00:00:27,470
So now let’s take this same block
diagram and work through it and

7
00:00:27,470 --> 00:00:32,810
describe how it works out in the case of
classification with sentiment analysis.

8
00:00:32,810 --> 00:00:38,426
So how does it look for classification for

9
00:00:38,426 --> 00:00:42,438
sentiment? In this case,

10
00:00:42,438 --> 00:00:47,572
the data is the text of the reviews. So

11
00:00:47,572 --> 00:00:52,707
for each review, the text of the review is

12
00:00:52,707 --> 00:00:59,299
associated with a particular
labeled sentiment.

13
00:01:02,773 --> 00:01:07,598
We take that text of the review
and feed it through a feature

14
00:01:07,598 --> 00:01:13,290
extraction phase which gives us x,
the inputs to our algorithm.

15
00:01:13,290 --> 00:01:16,800
And this x here is going
to be the word counts.

16
00:01:18,110 --> 00:01:23,220
So word counts for
every data point, for every review.
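
A minimal sketch of the feature extraction step, assuming a simple lowercase word tokenizer. The function name `word_counts` and the sample review are hypothetical illustrations, not something shown in the lecture.

```python
import re
from collections import Counter


def word_counts(review_text):
    """Map each lowercase word in the review to how often it appears."""
    # Assumption: a word is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", review_text.lower())
    return Counter(words)


# x for one hypothetical review
x = word_counts("Great sushi, great service")
print(x["great"], x["sushi"], x["awful"])
```

Words that never occur in the review simply get a count of zero.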

17
00:01:24,230 --> 00:01:29,060
Now our machine learning model is
going to take that input data.

18
00:01:29,060 --> 00:01:32,252
So it takes the word counts,
as well as several parameters

19
00:01:32,252 --> 00:01:36,108
which I'm calling here w-hat,
which are the weights for each word.

20
00:01:40,335 --> 00:01:42,740
Each word.

21
00:01:43,900 --> 00:01:47,200
And from combining these two,
we're gonna output the predictions.

22
00:01:47,200 --> 00:01:49,720
So if the score is greater than zero,
it's gonna be positive.

23
00:01:49,720 --> 00:01:51,550
If the score is less than zero,
it's gonna be negative.

24
00:01:51,550 --> 00:01:54,713
So this output here is
the predicted sentiment.
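
A small sketch of the prediction step, assuming the score is the sum of each word's weight times its count. The weights and counts below are made up for illustration, and sending a score of exactly zero to the negative class is an assumption, since only the greater-than and less-than cases are described.

```python
def score(counts, weights):
    """Sum of weight * count over the words in one review."""
    # Words with no learned weight contribute nothing to the score.
    return sum(weights.get(word, 0.0) * n for word, n in counts.items())


def predict_sentiment(counts, weights):
    """Return +1 (positive) if the score is above zero, else -1 (negative)."""
    # Assumption: a score of exactly zero is treated as negative.
    return 1 if score(counts, weights) > 0 else -1


# Hypothetical w-hat and word counts for a single review
w_hat = {"great": 1.0, "awful": -1.5}
x = {"great": 2, "awful": 1, "sushi": 1}

y_hat = predict_sentiment(x, w_hat)
print(score(x, w_hat), y_hat)
```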

25
00:02:01,946 --> 00:02:05,840
And if we're just using the model,
we would be done here.

26
00:02:05,840 --> 00:02:09,560
But really, in the machine learning
algorithm phase, we're gonna evaluate that

27
00:02:09,560 --> 00:02:14,000
result and then feed it back into
the algorithm to improve the parameters.

28
00:02:14,000 --> 00:02:19,130
So we're gonna take the predicted
sentiment, y-hat and

29
00:02:19,130 --> 00:02:23,070
compare it with the true label for
the sentiment.

30
00:02:23,070 --> 00:02:29,960
So the sentiment label for
each data point.

31
00:02:29,960 --> 00:02:31,620
So that's gonna fit in and

32
00:02:31,620 --> 00:02:34,840
our quality measure here is gonna
be classification accuracy.

33
00:02:38,126 --> 00:02:42,639
Classification accuracy.
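
As a rough illustration, classification accuracy is the fraction of reviews whose predicted sentiment matches the true label. The function name and the sample labels below are hypothetical.

```python
def classification_accuracy(predicted, actual):
    """Fraction of predictions that equal the true labels."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical predicted sentiments (y-hat) and true labels (y)
y_hat = [1, -1, 1, 1]
y = [1, -1, -1, 1]
accuracy = classification_accuracy(y_hat, y)
print(accuracy)
```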

34
00:02:42,639 --> 00:02:46,510
And the machine learning algorithm,
which we're gonna discuss in more detail

35
00:02:46,510 --> 00:02:50,580
in the classification course, is gonna
take that accuracy and try to improve it.

36
00:02:50,580 --> 00:02:55,340
And the way the improvement works,
is by updating the parameter w-hat.

37
00:02:55,340 --> 00:02:57,132
And that's what the cycle for a

38
00:02:57,132 --> 00:03:00,060
machine learning algorithm for
classification would look like.
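
To make the cycle concrete, here is a toy sketch of the whole loop. The lecture defers the actual learning algorithm to the classification course, so the update rule here is a simple perceptron-style rule swapped in purely for illustration. The data, function names, and number of passes are all assumptions.

```python
def score(counts, weights):
    """Sum of weight * count over the words in one review."""
    return sum(weights.get(word, 0.0) * n for word, n in counts.items())


def predict(counts, weights):
    """+1 if the score is above zero, else -1 (ties go negative by assumption)."""
    return 1 if score(counts, weights) > 0 else -1


def train(data, passes=5):
    """Perceptron-style stand-in for the learning algorithm: fix w-hat on mistakes."""
    w_hat = {}
    for _ in range(passes):
        for counts, label in data:
            if predict(counts, w_hat) != label:
                # Nudge each word's weight toward the true label.
                for word, n in counts.items():
                    w_hat[word] = w_hat.get(word, 0.0) + label * n
    return w_hat


# Hypothetical training data: (word counts, true sentiment label)
data = [({"great": 2, "food": 1}, 1), ({"awful": 1, "food": 1}, -1)]
w_hat = train(data)
predictions = [predict(counts, w_hat) for counts, _ in data]
print(w_hat, predictions)
```

Each pass predicts, compares against the true labels, and updates w-hat only where the prediction was wrong, which mirrors the feedback arrow in the block diagram.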

39
00:03:01,350 --> 00:03:04,180
In this module,
we've seen how to do classification.

40
00:03:04,180 --> 00:03:07,860
We've looked at various examples
of where it can be applied.

41
00:03:07,860 --> 00:03:11,375
We've talked about a few models for
building classifiers,

42
00:03:11,375 --> 00:03:14,505
especially in the context of sentiment
analysis. We saw some live demos.

43
00:03:14,505 --> 00:03:19,115
And we even built a notebook where
we built a classifier from data and

44
00:03:19,115 --> 00:03:20,435
analyzed it.

45
00:03:20,435 --> 00:03:24,642
And with this knowledge, you're ready
to build an intelligent application

46
00:03:24,642 --> 00:03:26,592
that uses a classifier at its core.

47
00:03:26,592 --> 00:03:30,649
[MUSIC]