[MUSIC] Let's start by reviewing the intuition
behind linear classifiers. The same intuition we
covered in the first course. A linear classifier will
take as input some quantity x, which in our case
is sentences from reviews. It's going to feed it through
its classifier model and make a prediction y hat that says,
is this a positive review, in which case y hat is plus one, or is it a negative
review, in which case y hat is minus one. That's what we're trying to figure out. A linear classifier does a little bit
more: it associates every word with a weight, or coefficient, which says how positively
influential this word is or how negatively influential. So good might have a coefficient of 1.0,
great might have a coefficient of 1.2. Awesome is awesome, and
has a coefficient of 1.7. While on the negative side,
bad might have a coefficient of minus 1, terrible minus 2.1. But awful is just awful, so minus 3.3. And then some words that are not that relevant
to the sentiment of the review might have a coefficient of 0. Now let's see how these coefficients can
be used to make a prediction of whether a sentence is positive or negative. So for example, let's take this
sentence that says the sushi is great. So how do you score the sentence? Let's compute the score of this
input sentence x_i. The sentence says, the sushi is great, so we look at the coefficient of great,
and we see it's 1.2. And now it says, the food was awesome,
so the coefficient of that is 1.7. And then it says, but,
the service was terrible. My God, the service was terrible. So you subtract 2.1. And now we ask,
what is the total score for this sentence? So some things are positive,
some things are negative. The total score is 1.2 plus 1.7 minus 2.1, or 0.8,
which is greater than 0. And that implies that we're
going to predict that y hat, the sentiment for
the sentence, is plus one. So it's a positive review. And this is called a linear
classifier because the output is the weighted sum of the inputs. So that's kind of what
a linear classifier is. We'll see in a little bit more
detail what that really means. So more generally, a simple
linear classifier is going to take as input the
coefficients associated with each word. And it's going to compute a score for
that input. If the score is greater than zero, we say
that the output, the prediction y hat, is +1. And if the score is less than zero,
we say that the prediction is -1. Now, what we need to do is
train the weights of these linear classifiers from data. So given some input training data
that includes sentences of reviews labeled with either plus one or
minus one, positive or negative, we're going to split those into some
training set and some validation set. Then we're going to feed that training
set to some learning algorithm which is going to learn the weights
associated with each word, so 1.0 for good, 1.7 for
awesome and so on. And then after we learn this classifier,
we're going to go back and evaluate its accuracy
on that validation set. So our goal for
today is to explore that learning box: how we learn this
classifier from data, and to understand a little bit more deeply
what a linear classifier is really about, in particular
in the context of logistic regression. [MUSIC]
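
To make the scoring rule described above concrete, here is a minimal Python sketch. It uses the coefficients from the worked example (1.2 for great, 1.7 for awesome, minus 2.1 for terrible, and so on) and treats every other word as having a coefficient of 0. The names COEFFICIENTS, tokenize, score, predict, and accuracy are illustrative assumptions, not something from the course. The punctuation handling is also an assumption, and so is the choice to predict minus one when the score is exactly zero, a case the lecture doesn't cover. The small validation set is made up purely for illustration.

```python
# Coefficients from the worked example in the lecture.
# Assumption: any word not listed here has coefficient 0.
COEFFICIENTS = {
    "good": 1.0,
    "great": 1.2,
    "awesome": 1.7,
    "bad": -1.0,
    "terrible": -2.1,
    "awful": -3.3,
}


def tokenize(sentence):
    """Lowercase the sentence and split it into words.

    Assumption: every non-letter character is treated as a word separator.
    """
    cleaned = "".join(ch if ch.isalpha() else " " for ch in sentence.lower())
    return cleaned.split()


def score(sentence, coefficients=COEFFICIENTS):
    """Weighted sum: add up the coefficient of every word occurrence."""
    return sum(coefficients.get(word, 0.0) for word in tokenize(sentence))


def predict(sentence, coefficients=COEFFICIENTS):
    """Return +1 if the score is greater than zero, otherwise -1.

    Assumption: a score of exactly zero is mapped to -1.
    """
    return 1 if score(sentence, coefficients) > 0 else -1


def accuracy(labeled_sentences, coefficients=COEFFICIENTS):
    """Fraction of (sentence, label) pairs whose prediction matches the label."""
    correct = sum(
        1 for sentence, label in labeled_sentences
        if predict(sentence, coefficients) == label
    )
    return correct / len(labeled_sentences)


review = "The sushi is great, the food was awesome, but the service was terrible."
print(round(score(review), 1))  # 1.2 + 1.7 - 2.1, prints 0.8
print(predict(review))          # prints 1, a positive review

# Hypothetical validation set, invented for this sketch.
validation_set = [
    ("The sushi was awesome", 1),
    ("The service was awful", -1),
    ("Good food but terrible service", 1),
]
print(accuracy(validation_set))
```

Here the weights are typed in by hand. The learning box is the part that would produce them from the training set instead, and the accuracy function stands in for the evaluation step on the validation set.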