Now that we're able to pick a feature to split on, we need to decide what to do next: how to recurse, and when to stop. So our goal has now gone beyond just learning a decision stump, the first level of the decision tree, or picking that first feature to split on; we want to learn a whole decision tree from data. If you look at the decision stump we learned splitting on credit, and in particular at the branch where credit was excellent, you see that every single data point in there was a safe loan, so there's nothing else to do. There's no reason to recurse or try anything more; we just make that what's called a leaf node, because any further split would still predict safe. But for the other two branches, the cases where credit was fair and the cases where credit was poor, we need to take the subset of the data that has fair credit and the subset of the data that has poor credit, build the next decision stump from each one of them, and then from there build the next decision stumps, and so on.

So in our example, if we were to keep going and build the next decision stump for the data where credit was fair, you would see that the result would be something like this: we would split on term next, that would be the best thing to do. And if we look at the data that has poor credit, we would figure out that the next best thing to split on there is income. For the points with low income, everything was risky, so we stop splitting there: credit poor, income low, everything is risky, no need for another decision stump. But for the other case, where credit was poor and income was high, we see some risky points and some safe points, so we build another decision stump from there. And if you were to do all that, we would have learned something like this, a full decision tree that covers our entire dataset. You see now that we have branches that take us to leaves for every possible split.

What we've described here is what's called a recursive algorithm. It starts with a process where we pick the best feature to split on, then we split our data into a decision stump on the selected feature, and then for each leaf of the decision stump, each node associated with it, we go back and learn a new decision stump. And the question is, do we keep iterating like this forever, or do we stop somewhere? So the question here is: what are the criteria to stop recursing? And the criteria are extremely simple.

The first criterion we've already seen. For the nodes I've highlighted here, including that first node where credit was excellent, every single node is associated with data points of just one category, one class, the same output. So for excellent, everything was safe, and for the case where credit was fair but the term was 3 years, everything was risky. As we can see, for those there's no point in splitting further, so the first stopping condition is: stop splitting when all the data agrees on the value of y. There's nothing to do there. And there's a second criterion, which only happened over here, where we stopped splitting but still had some safe and some risky loans inside the node. However, we had used up all of the features in our dataset. We only had three features here, credit, income, and term, and on that branch of the decision tree we had used all of them up. There's nothing left to split on; we would just get the same subsets back if we tried to split again.
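To make these two stopping checks concrete, here is a minimal sketch in Python; the function names and the data representation, a list of (feature_dict, label) pairs, are my own assumptions rather than anything from the lecture:

```python
def all_labels_agree(data):
    """Stopping condition 1: every remaining data point has the same label y."""
    labels = {label for _, label in data}
    return len(labels) <= 1


def no_features_left(remaining_features):
    """Stopping condition 2: every feature on this branch has already been used."""
    return len(remaining_features) == 0
```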
And so the two stopping criteria are actually very simple: stop if every data point agrees, or stop if you run out of features. So if we go back now to our greedy algorithm for learning decision trees, we see that in step two we just pick the feature that minimizes the classification error, as we discussed. Then we have the two stopping conditions we just described, two extremely simple ones, and we just recurse and keep going until one of them is reached: either we have used up all the features, or all the data points agree on the value of y.
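Putting the whole greedy algorithm together, a sketch might look like the following. It reuses all_labels_agree and no_features_left from the earlier sketch; the helpers majority_label, classification_error, and best_splitting_feature are hypothetical stand-ins for the error-minimizing feature choice discussed previously, and remaining_features is assumed to be a set of feature names:

```python
from collections import Counter


def majority_label(data):
    """Most common label among the data points in this node."""
    return Counter(y for _, y in data).most_common(1)[0][0]


def classification_error(data, feature):
    """Fraction of mistakes made by predicting the majority label in each branch of a split."""
    mistakes = 0
    for value in {x[feature] for x, _ in data}:
        subset = [(x, y) for x, y in data if x[feature] == value]
        majority = majority_label(subset)
        mistakes += sum(1 for _, y in subset if y != majority)
    return mistakes / len(data)


def best_splitting_feature(data, remaining_features):
    """Greedy step: pick the feature whose split has the lowest classification error."""
    return min(remaining_features, key=lambda f: classification_error(data, f))


def build_tree(data, remaining_features):
    """Recursively learn a decision tree from (feature_dict, label) pairs."""
    # Stopping condition 1: all data points agree on the value of y.
    if all_labels_agree(data):
        return {'is_leaf': True, 'prediction': data[0][1]}

    # Stopping condition 2: no features left to split on, so predict the majority label.
    if no_features_left(remaining_features):
        return {'is_leaf': True, 'prediction': majority_label(data)}

    # Otherwise, split on the error-minimizing feature and recurse on each subset,
    # learning a new decision stump for every branch of this one.
    feature = best_splitting_feature(data, remaining_features)
    children = {}
    for value in {x[feature] for x, _ in data}:
        subset = [(x, y) for x, y in data if x[feature] == value]
        children[value] = build_tree(subset, remaining_features - {feature})
    return {'is_leaf': False, 'feature': feature, 'children': children}
```

For instance, calling build_tree(loan_data, {'credit', 'term', 'income'}) on data like the lecture's loan example would split on credit first and then recurse into each branch, stopping exactly when one of the two conditions above is met.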