[MUSIC] Let's start by reviewing the intuition behind linear classifiers, the same intuition we covered in the first course. A linear classifier takes as input some quantity x, which in our case is a sentence from a review. It feeds x through its classifier model and makes a prediction y hat that says whether this is a positive review, in which case y hat is plus one, or a negative review, in which case y hat is minus one. That's what we're trying to figure out.

A linear classifier does a little bit more: it associates every word with a weight, or coefficient, which says how positively or how negatively influential that word is. So good might have a coefficient of 1.0, and great might have a coefficient of 1.2. Awesome is awesome, and has a coefficient of 1.7. On the negative side, bad might have a coefficient of minus 1.0, and terrible minus 2.1. But awful is just awful, so minus 3.3. And then some words that are not that relevant to the sentiment of the review might have a coefficient of 0.

Now let's see how these coefficients can be used to predict whether a sentence is positive or negative. For example, let's take this sentence: the sushi was great, the food was awesome, but the service was terrible. So how do you score the sentence? Let's compute the score of this input sentence x i. The sentence says the sushi was great, so we look at the coefficient of great, and we see it's 1.2. Then it says the food was awesome, and the coefficient of that is 1.7. And then it says, but the service was terrible. My God, the service was terrible. So you subtract 2.1. And now we ask, what is the total score for this sentence? Some things are positive, some things are negative. The total score is 1.2 plus 1.7 minus 2.1, which is 0.8, and that is greater than 0. That implies we're going to predict that y hat, the sentiment for the sentence, is plus one. So it's a positive review. And this is called a linear classifier because the output is the weighted sum of the inputs.
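As a rough illustration of the scoring just described, here is a minimal Python sketch. It uses the coefficient values from the worked example (1.2 for great, 1.7 for awesome, minus 2.1 for terrible). The function names `score` and `predict`, the simple word-splitting rule, and the choice to predict minus one when the score is exactly zero are all assumptions made for this sketch, not something the lecture specifies.

```python
import re

# Coefficients from the worked example; any word not listed is treated as weight 0.
coefficients = {
    "good": 1.0, "great": 1.2, "awesome": 1.7,
    "bad": -1.0, "terrible": -2.1, "awful": -3.3,
}

def score(sentence, weights):
    """Weighted sum of the words in the sentence (hypothetical helper)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(weights.get(word, 0.0) for word in words)

def predict(sentence, weights):
    """Return +1 if the score is greater than zero, otherwise -1.

    Treating a score of exactly zero as -1 is an assumption of this sketch.
    """
    return 1 if score(sentence, weights) > 0 else -1

sentence = "The sushi was great, the food was awesome, but the service was terrible."
print(score(sentence, coefficients))    # approximately 1.2 + 1.7 - 2.1
print(predict(sentence, coefficients))
```

Words like "the" or "sushi" are absent from the dictionary, so they contribute nothing to the sum, which matches the idea of a zero coefficient for words that carry no sentiment.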
So that's kind of what a linear classifier is. We'll see in a little more detail what that really means. More generally, a simple linear classifier is going to take as input a sentence, along with the coefficient associated with each word, and compute a score for that input. If the score is greater than zero, we say that the output, the prediction y hat, is +1. And if the score is less than zero, we say that the prediction is -1.

Now, what we need to do is train the weights of these linear classifiers from data. We're given some input training data that includes sentences from reviews, each labeled with either plus one or minus one, positive or negative. We're going to split those into a training set and a validation set. Then we're going to feed the training set to some learning algorithm, which is going to learn the weight associated with each word, so 1.0 for good, 1.7 for awesome, and so on. And after we learn this classifier, we're going to go back and evaluate its accuracy on the validation set.

So our goal for today is to explore that learning box: how do we learn this classifier from data? Along the way we'll understand a little more deeply what a linear classifier is really about, in particular in the context of logistic regression. [MUSIC]
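To make the split, learn, and evaluate loop concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the toy reviews, the 75/25 split, the helper names, and in particular the `train` function, which uses plain gradient ascent on the logistic log-likelihood as a stand-in for the learning box. It should not be read as the algorithm the course goes on to present.

```python
import math
import random
import re
from collections import Counter

def featurize(sentence):
    """Bag-of-words counts: each word mapped to how often it appears."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))

def score(features, weights):
    """Weighted sum of word counts; unseen words have weight 0."""
    return sum(weights.get(word, 0.0) * count for word, count in features.items())

def predict(features, weights):
    return 1 if score(features, weights) > 0 else -1

def train(examples, epochs=50, step=0.1):
    """Hypothetical stand-in learner: gradient ascent on the logistic log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for features, label in examples:
            # Probability of a positive review under the current weights.
            p = 1.0 / (1.0 + math.exp(-score(features, weights)))
            target = 1.0 if label == 1 else 0.0
            for word, count in features.items():
                weights[word] = weights.get(word, 0.0) + step * (target - p) * count
    return weights

def accuracy(examples, weights):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(predict(f, weights) == y for f, y in examples) / len(examples)

# Made-up labeled reviews, for illustration only.
reviews = [
    ("the sushi was great", 1), ("the food was awesome", 1),
    ("good service", 1), ("great food and good service", 1),
    ("the service was terrible", -1), ("awful food", -1),
    ("bad sushi", -1), ("terrible sushi and bad service", -1),
]
data = [(featurize(text), label) for text, label in reviews]

random.Random(0).shuffle(data)          # fixed seed so the split is repeatable
split = int(0.75 * len(data))
train_set, validation_set = data[:split], data[split:]

weights = train(train_set)
print(accuracy(validation_set, weights))
```

With so few reviews the validation accuracy is not meaningful; the point is only the shape of the pipeline: the weights are learned from the training set alone, and accuracy is measured on reviews the learner never saw.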