[MUSIC] >> Let's start talking about how to compute w hat t. This quantity is intuitive: it captures how good f_t is, or how much we trust f_t, the classifier we learned at this iteration. Specifically, if f_t is good, if it's doing well on our data, we want w hat t to be large. In fact, if f_t has really, really great accuracy, very low error, we want w hat t to be really big. However, if f_t is really bad, if it's terrible at making predictions, we should down-weight it. We should not trust that particular vote.

So how do we measure whether a classifier is good or not? As we said, f_t is good if it has low training error. However, you have to remember that we have weighted data, so what we really care about is how well it's doing on the weighted data. For example, if we're weighing certain data points more because they're really hard, because we're making lots of mistakes on them, we want to make sure the classifier has low error on those really hard examples.

So let's look at measuring error on weighted data. Measuring error on weighted data is very similar to measuring error on regular data. You have a data point, for example, "The sushi was great," which is labeled as positive, but now we also have a weight, in this case alpha, which might be 1.2. So this is a data point of, say, above-average importance. We want to measure the weighted total of the correct examples and the weighted total of the mistakes. So we take our learned classifier f_t and we feed it that review, "The sushi was great," but we hide the label, which in this case was positive. Now we compare the prediction. For example, let's say that y hat was plus one for this input. That's the same as the true label, so it's correct, and we add the weight 1.2 to the total weight of the correct examples we've seen. So that's awesome.

But let's say we have another data point, "The food was OK," which is truly labeled as negative; we talked about this example before. We feed "The food was OK" to the classifier and hide the label, minus one. But our classifier gets confused. It doesn't know the cultural reference, "the food was OK," and thinks it's a positive example, so y hat is plus one, and it's a mistake. So we take the weight of this data point, 0.5, and add it to the total weight of the mistakes. We keep adding up the weight of the mistakes versus the weight of the correct classifications, and we use that to measure the error.

Now that we have seen an intuitive notion of what a weighted error is, let's write down the equations for the weighted error, so we can use them if we ever need to implement it. The first thing we need to measure is the total weight of all the mistakes, the sum over the mistakes of the weights of those data points. This is the sum over the data points, i equals 1 through N, of an indicator that asks, was this a mistake? Is y hat i different from y i? The indicator just measures whether it was a mistake, and if it was a mistake, we don't just count it as one mistake, we count it with whatever weight that data point has, so we weigh that contribution by alpha i. Now, to compute the error, we normalize so it's a number between zero and one: we divide by the total weight of all the data points, which is the sum over i equals 1 through N of alpha i. These are the two quantities we care about, and the weighted error is the total weight of the mistakes divided by the total weight of all the data points: weighted error = (sum from i=1 to N of alpha i * 1[y hat i ≠ y i]) / (sum from i=1 to N of alpha i).
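To make that concrete, here is a minimal sketch of the weighted error computation. This is an illustration, not code from the course; the function name weighted_error and the use of NumPy are my assumptions, and labels follow the plus/minus one convention from the example above.

```python
import numpy as np

def weighted_error(y_true, y_pred, alpha):
    """Total weight of the mistakes divided by the total weight of all points."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    alpha = np.asarray(alpha, dtype=float)
    mistakes = y_pred != y_true            # indicator: is y hat i different from y i?
    return alpha[mistakes].sum() / alpha.sum()

# The two reviews from the lecture: "The sushi was great" (+1, weight 1.2)
# is classified correctly; "The food was OK" (-1, weight 0.5) is misclassified.
y_true = [+1, -1]
y_pred = [+1, +1]
alpha  = [1.2, 0.5]
print(weighted_error(y_true, y_pred, alpha))   # 0.5 / 1.7 ≈ 0.294
```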
Extremely simple. The best possible error you could hope for is 0.0. Now, the worst error is 1.0, which means that we're making mistakes everywhere. But notice that if we're making mistakes everywhere, we could just flip every prediction of that classifier and get everything right. So the way to think about the worst possible case is, in some sense, what a random classifier does. A random classifier gets an error of 0.5, and we discussed in the first course how a random classifier gets error 0.5 on a binary classification problem like this. So now that we've seen the weighted error, let's look at how we can update the coefficient w hat t of the function that we learn. >> [MUSIC]
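As a quick sanity check of those two claims, here is a small sketch (again my own, not from the course; it assumes NumPy and the plus/minus one label convention used above). It shows that a uniformly random classifier lands near weighted error 0.5, and that flipping every prediction turns an error of e into 1 - e, which is why an error near 1.0 is not really the worst case.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100_000
y = rng.choice([-1, +1], size=N)         # true labels
alpha = rng.uniform(0.1, 2.0, size=N)    # arbitrary positive weights

def weighted_error(y_true, y_pred, alpha):
    return alpha[y_pred != y_true].sum() / alpha.sum()

# A classifier that guesses each label uniformly at random: error ≈ 0.5.
y_rand = rng.choice([-1, +1], size=N)
print(weighted_error(y, y_rand, alpha))                  # ≈ 0.5

# Flipping every prediction: the two errors always sum to exactly 1.0,
# so a classifier with error near 1.0 becomes nearly perfect when negated.
print(weighted_error(y, y_rand, alpha)
      + weighted_error(y, -y_rand, alpha))               # 1.0
```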