We saw how we could change the threshold from zero to one for deciding what counts as a positive, and navigate between the optimistic classifier and the pessimistic classifier. There's actually a really intuitive visualization of this, called a precision-recall curve. Precision-recall curves are extremely useful for understanding how a classifier is performing.

You can imagine two extreme points on that curve. What happens to the precision when the threshold is very close to one? The precision is going to be one, because we predict positive for very, very few things and we're very sure those are correct. But the recall is going to be zero, because we're going to say everything else is bad; that's the pessimistic extreme. At the other extreme of the precision-recall curve, the point at the bottom, is the optimistic point: you have very high recall, because you're going to find all the positive data points, but very low precision, because you're going to sweep in all sorts of other stuff and call it good too. That happens when t is very small, close to zero.

Now if you keep varying t, you get a spectrum of tradeoffs between precision and recall. If you want a model that has a little more recall but is still highly precise, maybe you set t = 0.8; but if you really want very high recall while still improving precision a little, maybe you set t = 0.2. You can navigate that spectrum to explore the tradeoff between precision and recall. Now, there doesn't always have to be a tradeoff: if you had a perfect classifier, the curve would be a flat line at the top, with perfect precision no matter what the recall level. That line basically never happens in practice, but it's the ideal you're trying to get to, so the closer your algorithm's curve is to that flat line at the top, the better it is.

Precision-recall curves can also be used to compare algorithms, in addition to understanding a single one. For example, say you have two classifiers, classifier A and classifier B, and you see that at every single point classifier B is higher than classifier A. In that case we always prefer classifier B: no matter what the threshold is, classifier B gives you better precision for the same recall, so B is always better. However, life is not always this simple. If there's one thing you should have learned thus far, it's that practice tends to be a bit messy. Often what you observe is not classifiers A and B like we just saw, but classifiers A and C like we're seeing over here, where there are one or more crossover points: classifier A does better in some regions of the precision-recall curve, and classifier C does better in others. So, for example, if you're interested in very high precision and are okay with lower recall, you should pick classifier C, because it does better in that region; it's higher up, closer to that flat line. But if you care about getting high recall, you should choose classifier A, because in the high-recall regime, where the thresholds t are smaller, classifier A tends to do better; you can see its curve is higher over there. That's the kind of complexity of dealing with machine learning in the real world.
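As a rough sketch of how those curve points come about (this is not code from the lecture; the labels, predicted probabilities, and thresholds below are made up for illustration), you can sweep the threshold t over a classifier's predicted probabilities and record precision and recall at each setting:

import numpy as np

def precision_recall_at_thresholds(y_true, y_prob, thresholds):
    """Return (t, precision, recall) for each threshold t in `thresholds`."""
    points = []
    for t in thresholds:
        y_pred = (y_prob >= t).astype(int)           # call it positive only if probability >= t
        tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
        fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
        fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives
        precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0  # pessimistic extreme: nothing predicted positive
        recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
        points.append((t, precision, recall))
    return points

# Made-up labels and predicted probabilities, just to trace the tradeoff
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_prob = np.array([0.95, 0.80, 0.75, 0.65, 0.55, 0.45, 0.35, 0.30, 0.20, 0.10])
for t, p, r in precision_recall_at_thresholds(y_true, y_prob, [0.9, 0.8, 0.5, 0.2, 0.1]):
    print(f"t={t:.1f}  precision={p:.2f}  recall={r:.2f}")

Running this, high thresholds give high precision and low recall (the pessimistic end), and low thresholds give high recall and lower precision (the optimistic end), which is exactly the curve we've been describing.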
Now if you just had to pick one classifier, the question is how do you decide? How do you choose between A and C in this case? As I was hinting at, the single number you use to decide depends on where you want to be on the precision-recall tradeoff. There are many metrics out there that try to boil the curve down to a single number: some are called F1 measures, some area-under-the-curve. For a lot of applications I'm less fond of those measures myself than of one that's much simpler, called precision at k. Let me talk about that, because it's a really simple measure and really useful.

Let's say there are five slots on my website to show sentences. That's all I care about: I want to show five great sentences on my website. I don't have room for ten or for a million, just for five. So I show five sentences there; four were great and one sucked. I want all five to be great, so I want my precision on the top five sentences to be as good as possible. In this case, our precision at five was four out of five, 0.8. I ended up putting in a sentence that said, "My wife tried the ramen and it was pretty forgettable." That's kind of a disappointing thing to put on the page. So for many applications, like recommender systems, where you go to a web page and somebody shows you some products you might want to buy, precision at k is a really good metric to be thinking about.
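To make that concrete, here's a minimal sketch of precision at k (again, not the lecture's code; the sentence labels and classifier scores are invented for the example): rank the candidate sentences by score, keep the top k, and measure what fraction of those are actually great.

import numpy as np

def precision_at_k(y_true, scores, k=5):
    """Fraction of the k highest-scoring items that are truly positive."""
    top_k = np.argsort(scores)[::-1][:k]      # indices of the k highest-scoring sentences
    return float(np.mean(np.asarray(y_true)[top_k]))

# Five website slots: four of the top-5 sentences are great, one is not -> precision at 5 = 0.8
labels = [1, 1, 0, 1, 1, 0, 1, 0]                          # 1 = great sentence, 0 = not great
scores = [0.97, 0.92, 0.90, 0.88, 0.85, 0.60, 0.55, 0.30]  # classifier scores for each sentence
print(precision_at_k(labels, scores, k=5))                 # 0.8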