[MUSIC] Thus far we've talked about precision, recall, optimism, pessimism, all sorts of different aspects. But one of the most surprising things about this whole story is that it's quite easy to navigate from a low precision model to a high precision model, or from a high recall model to a low recall model, so we can explore that spectrum. We can have a low precision, high recall model that's very optimistic, we can have a high precision, low recall model that's very pessimistic, and it turns out that it's easy to find a path in between. The question is, how do we do that?

If you recall from earlier in this course, we assign not just a label, +1 or -1, to every data point, but a probability, let's say 0.99 of being positive for "The sushi and everything else were awesome," and say 0.55 of being positive for "The sushi was good, the service was okay." These probabilities, as I mentioned earlier in the course, are going to be fundamentally useful. Now you're going to see a place where they are amazingly useful: the probabilities can be used to trade off precision with recall. So let's figure that out.

Earlier in the course, we just had a fixed threshold to decide if an input sentence, x_i, was going to be positive or negative. We said it's going to be positive if the probability is greater than 0.5, and it's going to be negative if the probability is less than or equal to 0.5. Now, how can we create an optimistic or a pessimistic model just by changing the 0.5 threshold? Let's explore that idea.

Think about what would happen if we set the threshold, instead of 0.5, to 0.999, so that a data point is only +1 if its probability is greater than 0.999. Well, here's what would happen: very few data points would satisfy this. And if very few data points satisfy this, then very few data points will be labeled +1, and the vast majority will be labeled -1. We call this classifier the pessimistic classifier.
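The thresholding idea described above can be sketched in a few lines of code. This is a minimal illustration, not the course's actual implementation: `classify` is a hypothetical helper, and the probability values are made-up numbers standing in for P(y = +1 | x, w) from a trained model.

```python
# Hypothetical sketch: turning predicted probabilities into labels
# by comparing each one against a threshold t.
def classify(prob_positive, t=0.5):
    """Label a point +1 if its probability of being positive exceeds t, else -1."""
    return [+1 if p > t else -1 for p in prob_positive]

# Made-up probabilities P(y = +1 | x, w) for five data points.
probs = [0.99, 0.55, 0.80, 0.30, 0.05]

print(classify(probs, t=0.5))    # the usual 0.5 threshold
print(classify(probs, t=0.999))  # pessimistic: almost everything becomes -1
print(classify(probs, t=0.001))  # optimistic: almost everything becomes +1
```

With t = 0.999, only points the model is extremely sure about get labeled +1, which is exactly the pessimistic behavior described above.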
Now alternatively, if we change the threshold to be 0.001, then almost any review is going to be labeled as positive. Almost all of the data points are going to satisfy this condition, so we're going to say that everything is +1, and this is going to be the optimistic classifier. It's going to say, yeah, everything is +1, everything's good. So by varying that threshold from 0.5 to something close to 0 or something close to 1, we can move between optimism and pessimism.

If you go back to this picture of logistic regression, for example, as a concrete case: we have this input, the score of x, and the output here is the probability that y is equal to +1 given x and w. This should bring back some memories, maybe some sad, sad memories. The threshold here is going to be a cut where we set y hat equal to +1 if the probability is greater than or equal to this threshold t. So everything above the line will be labeled +1 and everything below the line will be labeled -1.

Concretely, let's see what happens if we set the threshold to be some very, very high number, so t here is close to 1. If t is some number close to 1, then everything below that line will be labeled -1, and very, very few things above the line will be labeled +1. That's why we end up with the pessimistic classifier. On the flip side, if we set the threshold t to be something very, very small, then everything's going to be above the line. So everything's going to be labeled +1, and very few data points are going to be labeled -1, and we end up with the optimistic classifier.

So ranging t from 0 to 1 takes us from optimism to pessimism. In other words, that spectrum that we said we wanted to navigate can now be navigated with a single parameter, t, that goes between 0 and 1. [MUSIC]