[MUSIC] We'll now take a few minutes to instantiate this abstract algorithm we described and see what it looks like in the context of learning decision stumps. AdaBoost with decision stumps is a really nice, simple default way of training on your data, and so it's one that we'll go through a little further; it'll help us ground some of the concepts that we've looked at so far. So here I've outlined the AdaBoost algorithm that we discussed in the previous slides, but just to be clear, we're going to talk about learning a decision stump for f of t, figuring out how to update the weights, and figuring out its coefficient, w hat t. Our first step is figuring out how to learn the next decision stump; that's going to be f of t. And this is just going to be standard decision stump learning. We're going to try splitting on each feature: income, credit history, savings, market conditions, and figure out how well each of the resulting decision stumps does on the weighted data. And notice that in this process we might split on income multiple times; in multiple iterations we might revisit the same feature. So we're going to try each of those features, and for each one of them, measure the weighted error on the training data. For, say, splitting on income, the weighted error might be 0.2. For splitting on credit, it might be 0.35. For splitting on savings, it might be 0.3. And finally, if you split on market conditions, it might be the worst of these four decision stumps: on this weighted data, it might have a weighted error of 0.4. So we're picking the best feature, the one that has the lowest weighted error, and that's the first one we'll split on. We're going to split on income, with a threshold of 100,000. And so f of t is going to be that decision stump that asks: is income greater than 100,000? If yes, the loan is safe; if not, it's risky. Now, the final question is, what coefficient do we give to this particular classifier? All we have to do is plug the weighted error, 0.2, into the formula w hat t = 1/2 ln((1 - weighted error)/weighted error), and if we plug it in and do the math, 0.69 is the result. So the coefficient of this first decision stump is just going to be 0.69.
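To make that step concrete, here's a minimal Python sketch using the weighted errors from the example; the helper name `coefficient` is made up for illustration, but the formula is the standard AdaBoost coefficient we just used.

```python
import math

def coefficient(weighted_error):
    # Standard AdaBoost coefficient: w_hat = 1/2 * ln((1 - err) / err).
    return 0.5 * math.log((1 - weighted_error) / weighted_error)

# Weighted training error of each candidate stump (numbers from the example).
errors = {"income": 0.2, "credit": 0.35, "savings": 0.3, "market": 0.4}

# Pick the stump with the lowest weighted error, then compute its coefficient.
best_feature = min(errors, key=errors.get)
w_hat = coefficient(errors[best_feature])
print(best_feature, round(w_hat, 2))  # income 0.69
```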
Going back to the algorithm, we've discussed how we're going to learn this new stump from the data and how we figure out its coefficient. Let's next talk about how to update the weight alpha i of each data point. So here's intuitively what happens. We have our data points, and I'm highlighting them here depending on their income, just like we did before, but now I'm going to make a prediction using this decision stump. The question is, how good is this decision stump, income greater than 100,000? And if you look at it, it makes mistakes on some of the data points and gets others right. So I've marked the correct ones in bright green and the mistakes in bright red. And if we take the previous weight, alpha, for each one of these data points, I'm going to highlight where those weights were right there. We need to compute the new weight based on the formula above, which is the standard update: multiply by e to the -w hat t if the prediction was correct, and by e to the +w hat t if it was a mistake. So we're going to plug in the w hat that we computed, 0.69, into the formula to figure out what to multiply each one of those weights by. Plug it in, and you'll see that e to the -0.69 is about a half, so for every correct data point we're going to halve its weight, and e to the 0.69 is about two, so for every incorrect data point we're going to double its weight. So I'm going to go row by row. For the ones in green that I got correct, I'm going to halve the weights. So for the first row there, the weight before was 0.5, and now it becomes 0.25. The next one was 1.5 and becomes 0.75, because they're correct. For the third row, I made a mistake; its weight before was 1.5, and now I'm going to double it, making it 3. So we can go data point by data point, multiplying the weight by two or dividing it by two, depending on whether we got that data point right or not. It's extremely simple to boost a decision stump classifier, and these tend to do extremely well on a wide range of data sets. [MUSIC]
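Here's a similarly minimal sketch of the weight update, again with a made-up helper name and assuming the weights live in a plain list; it reproduces the halving and doubling from the worked example above.

```python
import math

def update_weights(alphas, correct, w_hat):
    # Multiply each weight by e^{-w_hat} if the stump got the point right,
    # and by e^{+w_hat} if it made a mistake.
    return [a * math.exp(-w_hat if c else w_hat)
            for a, c in zip(alphas, correct)]

# With w_hat = 0.69, e^{-0.69} is about 0.5 and e^{0.69} is about 2, so
# correct points are roughly halved and mistakes roughly doubled.
alphas = [0.5, 1.5, 1.5]        # previous weights from the example rows
correct = [True, True, False]   # whether the stump got each point right
print(update_weights(alphas, correct, 0.69))  # ~[0.25, 0.75, 3.0]
```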