[MUSIC] In the regression module, we talked about predicting house prices by fitting a regression model, and we measured error in terms of the sum of squared errors. Here in classification, our errors are a little different, because we're talking about which inputs we get correct and which inputs we get wrong. So let's talk a little bit about measuring error in classification.

When I learn a classifier, I'm given a set of input data. These are sentences that have been marked as positive or negative sentiment, and, as in regression, we split them into a training set and a test set. I feed the training set to the classification algorithm, and that algorithm learns a weight for each word. For example, it's going to learn that "good" has a weight of 1.0, "awesome" 1.7, "bad" -1.0, and "awful" -3.3. These weights are then used to score every sentence in the test set and to evaluate how well we're doing in terms of classification.

So let's talk about what that evaluation looks like, how we measure classification error. We're given a set of test examples of the form: "Sushi was great" is a positive sentence. And we're trying to figure out how many of these test sentences we get correct and how many we make mistakes on. What we're going to do is take the sentence "Sushi was great" and feed it through the learned classifier. But we don't want the learned classifier to actually see the true label; we want to see whether it gets the true label right. So we hide the true label. The sentence gets fed to the learned classifier while the true label stays hidden, and given the sentence, the classifier predicts y hat as positive. It labels this as a positive sentence, so we've made a correct prediction, and the number of correct predictions goes up by one. Now let's take another test example.
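The scoring step described above can be sketched in a few lines. The word weights are the ones from the lecture's example; the tokenization and the tie-breaking rule (score of exactly zero counts as negative) are assumptions made just for illustration, not part of the lecture.

```python
# Minimal sketch of the word-weight sentiment classifier from the lecture.
# The weights below are the lecture's example values; everything else
# (whitespace tokenization, zero-score handling) is an assumption.

WEIGHTS = {"good": 1.0, "awesome": 1.7, "bad": -1.0, "awful": -3.3}

def score(sentence):
    """Sum the learned weights of the words in the sentence (unknown words get 0)."""
    return sum(WEIGHTS.get(word, 0.0) for word in sentence.lower().split())

def predict(sentence):
    """Predict +1 (positive) if the total score is positive, else -1 (negative)."""
    return +1 if score(sentence) > 0.0 else -1

print(predict("sushi was good and awesome"))   # score 1.0 + 1.7 = 2.7 -> +1
print(predict("awful food and bad service"))   # score -3.3 - 1.0 = -4.3 -> -1
```

At test time we would call `predict` on each test sentence while keeping its true label hidden, then compare the prediction against that label afterwards.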
Let's say the next test example is "The food was okay," labeled as a negative sentence. That's a bit of an ambiguous sentence, but it's been labeled as negative in the test set. Again, I feed the sentence to the classifier and hide the label. And let's see what the classifier does. In this case, because "the food was okay" can be read as positive, maybe it predicts that this is a positive sentence. I've made a mistake, because the true label is negative. So we say, hey, a mistake was made, and we now have one more mistake. So we have one correct classification and one mistake, and we do this for every sentence in the test set.

There are two common measures of quality in classification. One of them is the notion of error. Error measures the fraction of the test examples that we make mistakes on. So what we do is say, out of all of the sentences that were classified, how many mistakes were made: the number of mistakes divided by the total number of test sentences. For example, if there were 100 test sentences and I made ten mistakes, then the error would be 0.1, or 10%. The best possible error I can achieve is zero: I make no mistakes.

Now, instead of talking about error, it's also common to talk about the accuracy of your classifier. Accuracy is exactly the opposite: instead of measuring the number of errors, we measure the number of correct classifications. So the ratio here is the number of correct predictions divided by the total number of test sentences. And unlike error, where the best possible value is zero, for accuracy the best possible value is 1: I got all the sentences right. In fact, there's a really natural relationship between the two. We know that error = 1 - accuracy, and vice versa. [MUSIC]
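The two measures above can be sketched directly from their definitions. The label lists here are made-up example data chosen so that one prediction out of ten is wrong, mirroring the lecture's 10% example.

```python
# Sketch of classification error and accuracy as defined in the lecture.
# +1 means positive sentiment, -1 means negative; the data is hypothetical.

def error_rate(y_true, y_pred):
    """Fraction of test examples the classifier gets wrong."""
    mistakes = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return mistakes / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of test examples the classifier gets right."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

y_true = [+1, -1, +1, +1, -1, +1, -1, +1, -1, +1]
y_pred = [+1, +1, +1, +1, -1, +1, -1, +1, -1, +1]  # one mistake (2nd example)

print(error_rate(y_true, y_pred))  # 0.1
print(accuracy(y_true, y_pred))    # 0.9
```

Note that the two results always sum to 1, which is exactly the relationship error = 1 - accuracy.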