[MUSIC] This is the same dataset, and the same learned model, from a few slides ago. And what I'm plotting here on the right is not just the decision boundary, but the probability that y hat is equal to plus one. So it's a probability plot. For the points over here, the probability is approximately zero, so there's approximately zero chance that the points up there, around minus five and four, are positive. While the points over here have probability approximately one: the probability that y equals plus one is approximately one in the bottom right corner. So all of that makes sense, and what makes the most sense to me is the region in between, right here. This is the region where the probability is approximately 0.5, where we're uncertain whether a review is positive or negative, and it's a pretty wide region of uncertainty. So although the linear classifier, the straight line here from a degree-one polynomial, was not a great fit to the data, the uncertainty measures make quite a lot of sense. The points over here that were getting misclassified are exactly the ones I'm uncertain about, whether they're positive or negative, and so I feel like this classifier is doing something very reasonable.

Now let's look at the degree-two polynomial fit. So take degree-two polynomial features, or quadratic features, and learn the same classifier as we learned a few slides ago, but again plot the probability that y hat equals plus one. As we saw a few slides ago, we believe that this quadratic fit was actually a better fit to the data. And if you look at it, the uncertainty region is narrower. To me, this makes a lot of sense: I have a better fit to the data, so there are fewer points that I'm uncertain about. And in fact, the places where I have uncertainty are exactly the ones in the boundary region where I should have some uncertainty, the ones where I'm not sure if they're plus one or minus one because they're close to the boundary. It makes a lot of sense. So this is a really great fit, not just in terms of the decision boundary, but also in terms of the probabilities. The places where the probability is closer to 0.5 are really the ones where I'm unsure about what's going on. Then the probability mostly decreases or mostly increases, depending on whether I go to the left side or the right side of the parabola.

Now let's see what happens when I use higher-order features, for example degree-six or degree-20 polynomial features. We saw that those decision boundaries became really wiggly and crazy, but now if you look at the uncertainty regions, you'll see they become really, really narrow. You've got to squint to see them, because they're really thin, but you can see them over here as a little white band. So according to this model, not only is the decision boundary this really crazy line, but the only places where I'm unsure about my prediction are these thin little bands in between. So there are tiny uncertainty regions: I'm overfitting, and I'm overconfident about it. The way I think about it is, we're sure we're right, and we're surely wrong about that. So we're absolutely wrong, but we're sure we're right, and that's really bad. So uncertainty is something that's very important in classifiers, and by looking at these plots we have another interpretation of overfitting, another way that overfitting gets expressed in classification: by creating these really narrow uncertainty bands. And so we want to avoid that.
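To make that overconfidence concrete, here is a minimal sketch, not the course's own code: it assumes scikit-learn, a synthetic two-dimensional dataset with a quadratic ground truth plus a little label noise, and an illustrative definition of "uncertain" as predicted probability between 0.25 and 0.75. It fits logistic regression on polynomial features of increasing degree and measures how much of the input space the model is uncertain about.

```python
# Sketch (hypothetical data and thresholds): as the polynomial degree grows,
# the fraction of the plane where P(y_hat = +1) is near 0.5 typically shrinks,
# i.e., the uncertainty band narrows even as the fit overfits.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(-5, 5, size=(200, 2))                 # 2-D inputs, like the plots
y = (X[:, 1] > 0.3 * X[:, 0] ** 2 - 2).astype(int)    # quadratic ground truth
flip = rng.random(len(y)) < 0.05                      # a little label noise
y = np.where(flip, 1 - y, y)

# Dense grid over the plane, to measure the area of the uncertainty band.
xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
grid = np.c_[xx.ravel(), yy.ravel()]

for degree in (1, 2, 6, 20):
    model = make_pipeline(
        PolynomialFeatures(degree, include_bias=False),
        StandardScaler(),                             # keeps high-degree features numerically tame
        LogisticRegression(C=1e5, max_iter=10_000),   # large C = weak regularization
    )
    model.fit(X, y)
    p = model.predict_proba(grid)[:, 1]               # P(y_hat = +1) at each grid point
    uncertain = np.mean((0.25 < p) & (p < 0.75))      # fraction of plane that's "uncertain"
    print(f"degree {degree:2d}: uncertain fraction of the plane = {uncertain:.3f}")
```

With weak regularization, the higher-degree fits tend to drive nearly every predicted probability toward 0 or 1, so the printed uncertain fraction collapses: that is the "sure we're right, and surely wrong about it" behavior described above.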
We'll do everything we can to avoid it. [MUSIC]