We'll start with a high-level outline of the learning algorithm for decision trees, and then we'll dig into the core details needed to make it happen.

Typically, we start with an empty tree and some data: some loans labeled safe, others labeled risky. The first node in the tree, called the root node, contains the whole dataset, and what we show here is a histogram of the number of loans labeled safe or risky. Given all the data, I then pick a feature to split on, and split the tree. For example, here I'm splitting the tree on credit, and credit can take three possible values. So I take the subset of the data with excellent credit and feed it to the left branch of the tree, the subset with fair credit goes to the middle branch, and the subset with poor credit goes to the right branch. As you can see, there are more data points with poor credit than with excellent credit.

Now that we've assigned each subset of the data to one of the subsequent nodes, we can visualize the data using histograms again. The top histogram corresponds to all the data, while the histogram on the left branch corresponds to the data points where credit was excellent, the one in the middle to the data points where credit was fair, and the one on the right to the data points where credit was poor. As you can see, the subset where credit was excellent has a lot of safe loans, while the subset where credit was poor has more risky loans than safe ones.

Next, we continue splitting each of these branches, each of these datasets. However, if you consider the first node, the one where credit was excellent, you see that all of the examples in this node are labeled safe. These are all safe loans, so there's no point in splitting further; we just predict that everything with excellent credit is a safe loan. For the other two, we need to keep going. For the loans with a fair credit rating, we build the next level of the tree going down from there, and for the data points with a poor credit rating, again, we keep building the tree from there. That's the recursive greedy algorithm that we're going to discuss and build today.

We can now state the greedy algorithm for learning a decision tree at a high level, and it's a very simple algorithm. I start with an empty tree, and I pick a feature to split on. In our case, we split on credit: we take the data and split it into the data points with excellent credit, those with fair credit, and those with poor credit. Then, for each subset of the data (excellent, fair, poor), I decide what to do next. In the case of excellent credit there was nothing left to do, so I stop. But in the other two cases there was more to do, and what I do is what's called a recursion: I go back to step two, but now looking only at the subset of the data with fair credit, and then only at the subset with poor credit.
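To make the recursion concrete, here is a minimal Python sketch of this greedy procedure. The data layout (a list of `(features, label)` pairs with dict-valued features), the function names, and the placeholder `choose_splitting_feature` are assumptions for illustration, not code from the course; in particular, `choose_splitting_feature` stands in for the feature-selection step we haven't defined yet.

```python
from collections import Counter

def majority_label(data):
    """Return the most common label among the examples at a node."""
    labels = [label for _, label in data]
    return Counter(labels).most_common(1)[0][0]

def choose_splitting_feature(data, features):
    """Placeholder: a real learner would score every candidate split
    (the topic of the rest of this module); here we simply take the
    first remaining feature."""
    return features[0]

def build_tree(data, features):
    labels = {label for _, label in data}

    # Stop if the node is pure (e.g., every excellent-credit loan was
    # safe) or no features remain; predict the majority label.
    if len(labels) == 1 or not features:
        return {'leaf': True, 'prediction': majority_label(data)}

    # Step 2: pick a feature to split on (e.g., 'credit').
    feature = choose_splitting_feature(data, features)
    remaining = [f for f in features if f != feature]

    # Step 3: split the data by the feature's values (excellent, fair,
    # poor) and recurse on each subset -- "go back to step two".
    children = {}
    for value in {x[feature] for x, _ in data}:
        subset = [(x, y) for x, y in data if x[feature] == value]
        children[value] = build_tree(subset, remaining)

    return {'leaf': False, 'feature': feature, 'children': children}
```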
Now, if you look at this algorithm so far, it sounds a little abstract, but there are a few points we need to make more concrete. We have to decide how to pick the feature to split on: we split on credit in our example, but we could have split on something else, like the term of the loan or income. And since we have recursion at the end, we have to figure out when to stop recursing, when not to go and expand another node in the tree. We'll discuss these two important tasks in the rest of this module.
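As a quick sanity check, here is how the sketch above might be exercised on a handful of made-up loans mirroring the lecture's example; the toy data, the feature values, and the `predict` helper are all invented for illustration.

```python
def predict(tree, x):
    """Walk from the root down to a leaf and return its prediction."""
    while not tree['leaf']:
        tree = tree['children'][x[tree['feature']]]
    return tree['prediction']

# Made-up loans: features are credit and loan term, labels safe/risky.
loans = [
    ({'credit': 'excellent', 'term': '3yr'}, 'safe'),
    ({'credit': 'excellent', 'term': '5yr'}, 'safe'),
    ({'credit': 'fair',      'term': '3yr'}, 'safe'),
    ({'credit': 'fair',      'term': '5yr'}, 'risky'),
    ({'credit': 'poor',      'term': '3yr'}, 'risky'),
    ({'credit': 'poor',      'term': '5yr'}, 'risky'),
]

tree = build_tree(loans, ['credit', 'term'])
print(predict(tree, {'credit': 'excellent', 'term': '5yr'}))  # safe
```

Note how the excellent-credit branch becomes a leaf immediately (all its examples are safe), while the fair-credit branch recurses and splits again on the loan term.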