[MUSIC] So classifiers are really trying to make decisions. Decisions as to whether a sentence is positive or negative, or whether a set of lab tests plus x-rays plus measurements leads to a certain disease like flu or cold. That's a decision that needs to be made. So let's talk a little bit about how classifiers, especially linear classifiers, make decisions.

To understand decision boundaries, suppose you only had two words with non-zero weight: awesome, with a positive weight of 1.0, and awful, which is just awful, so it has a weight of -1.5. In this situation, the score is gonna be 1.0 times the number of awesomes in the sentence, minus 1.5 times the number of awfuls. So we can plot this on a pair of axes: there's the awesome axis and there's the awful axis. For example, the sentence "the sushi was awesome, the food was awesome, but the service was awful" has two awesomes and one awful, so it's plotted at the point (2,1). Similarly, something with one awesome and three awfuls lands at (1,3), something that is all awesome, three awesomes, sits at (3,0), and so on for other sentences.

Now, let's understand a little better how we scored the sentences and what that implies about our decisions. Take the point (3,0), which is three awesomes and no awfuls. Three awesomes give a positive prediction because the score, 1.0 times 3, is greater than zero. And that is true for every point on the bottom right of the plot. The points on the top left all have score less than zero; for example, the point with one awesome and three awfuls has score 1.0 times 1 minus 1.5 times 3 = -3.5, so those get labeled negative. What separates the negative predictions from the positive predictions is the line where I don't know what's positive and what's negative, the line where 1.0 #awesome - 1.5 #awful = 0. That's the line where the prediction is uncertain, and so we call it the decision boundary. Everything on one side we predict is positive; everything on the other we predict is negative.

Now notice that the decision boundary, 1.0 times #awesome minus 1.5 times #awful equals 0, is a line. And that's why it's called a linear classifier: it has a linear decision boundary. So decision boundaries are what separate the positive predictions from the negative predictions. In the case of just two features, we see this is just a line, but that changes as we increase the number of features. In two dimensions, a linear function is a line. In three dimensions, when, say, three words have non-zero weight and everything else has zero weight, we get a plane. It's a little hard to draw in 3D, but the positive predictions are above the plane, the negative predictions are below the plane, and the plane is tilted somewhere in the space. Now, in real applications you're not gonna have just three non-zero words, you're gonna have tens of thousands of words with non-zero weight, and in that case the decision boundaries are really high-dimensional separators called hyperplanes.

Of course, you can use more than just linear classifiers. You can use more complex classifiers, and those, instead of lines or hyperplanes, have more complicated, squiggly separating shapes. We're gonna learn more about that in the classification course. >> [MUSIC]
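As a companion to this segment, here is a minimal sketch in plain Python of the two-word scoring rule and its decision boundary. It assumes a toy whitespace tokenizer, and the names WEIGHTS, score, and predict are illustrative, not from any particular library.

```python
# Toy linear classifier from the lecture: only "awesome" and "awful"
# carry non-zero weight; every other word has weight 0.
WEIGHTS = {"awesome": 1.0, "awful": -1.5}

def score(sentence: str) -> float:
    # Score = 1.0 * #awesome - 1.5 * #awful.
    # Very rough tokenization: lowercase, strip commas/periods, split.
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    return sum(WEIGHTS.get(word, 0.0) for word in words)

def predict(sentence: str) -> str:
    # The decision boundary is the line where score = 0:
    # above it we predict positive, below it negative.
    return "positive" if score(sentence) > 0 else "negative"

s = "The sushi was awesome, the food was awesome, but the service was awful"
print(score(s))    # 2 * 1.0 - 1 * 1.5 = 0.5
print(predict(s))  # positive
```

Note that nothing in score is specific to two words: with tens of thousands of weighted words it is just a dot product between a weight vector and a word-count vector, and the set of points where the score equals zero becomes a hyperplane rather than a line.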