We'll start with a high-level outline of the learning algorithm for decision trees, and then we'll dig into the core details needed to make it happen.

Typically, we start with an empty tree and some data: some loans labeled safe, others labeled risky. The first node in the tree, called the root node, contains the whole dataset, and what we show here is a histogram of the number of loans labeled safe or risky. Given all the data, I then pick a feature to split on, and split the tree. For example, here I'm splitting the tree on credit, and credit can take three possible values. So I take the subset of the data with excellent credit and feed it to the left branch of the tree, the subset with fair credit goes to the middle branch, and the subset with poor credit goes to the right branch. As you can see, there are more data points with poor credit than with excellent credit.

Now that we've assigned each subset of the data to one of the subsequent nodes, we can visualize the data using histograms again. The top histogram corresponds to all the data, while the histogram on the left branch corresponds to the data points where credit was excellent, the one in the middle to the data points where credit was fair, and the one on the right to the data points where credit was poor. As you can see, the subset where credit was excellent has a lot of safe loans, while the subset where credit was poor has more risky loans than safe ones.

Next, we continue splitting each of these branches, each of these datasets. However, if you consider the first node, the one where credit was excellent, you see that all of the examples in this node are labeled safe. These are all safe loans, so there's no point in splitting further; we just predict that everything with excellent credit is a safe loan. For the other two, we need to keep going. For the loans with a fair credit rating, we build the next level of the tree going down from there, and for the data points with a poor credit rating, again, we keep building the tree from there. That's the recursive greedy algorithm that we're going to discuss and build today.

We can now state the greedy algorithm for learning a decision tree at a high level, and it's a very simple algorithm. I start with an empty tree, and I pick a feature to split on. In our case, we split on credit: we take the data and split it into the data points with excellent credit, those with fair credit, and those with poor credit. Then, for each subset of the data (excellent, fair, poor), I decide what to do next. In the case of excellent credit there was nothing left to do, so I stop. But in the other two cases there was more to do, and what I do is what's called a recursion: I go back to step two, but now looking only at the subset of the data with fair credit, and then only at the subset with poor credit.
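To make the recursion concrete, here is a minimal Python sketch of this greedy procedure. The data layout (a list of `(features, label)` pairs with dict-valued features), the function names, and the placeholder `choose_splitting_feature` are assumptions for illustration, not code from the course; in particular, `choose_splitting_feature` stands in for the feature-selection step we haven't defined yet.

```python
from collections import Counter

def majority_label(data):
    """Return the most common label among the examples at a node."""
    labels = [label for _, label in data]
    return Counter(labels).most_common(1)[0][0]

def choose_splitting_feature(data, features):
    """Placeholder: a real learner would score every candidate split
    (the topic of the rest of this module); here we simply take the
    first remaining feature."""
    return features[0]

def build_tree(data, features):
    labels = {label for _, label in data}

    # Stop if the node is pure (e.g., every excellent-credit loan was
    # safe) or no features remain; predict the majority label.
    if len(labels) == 1 or not features:
        return {'leaf': True, 'prediction': majority_label(data)}

    # Step 2: pick a feature to split on (e.g., 'credit').
    feature = choose_splitting_feature(data, features)
    remaining = [f for f in features if f != feature]

    # Step 3: split the data by the feature's values (excellent, fair,
    # poor) and recurse on each subset -- "go back to step two".
    children = {}
    for value in {x[feature] for x, _ in data}:
        subset = [(x, y) for x, y in data if x[feature] == value]
        children[value] = build_tree(subset, remaining)

    return {'leaf': False, 'feature': feature, 'children': children}
```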
Now, if you look at this algorithm so far, it sounds a little abstract, but there are a few points we need to make more concrete. We have to decide how to pick the feature to split on: we split on credit in our example, but we could have split on something else, like the term of the loan or income. And since we have recursion at the end, we have to figure out when to stop recursing, when not to go and expand another node in the tree. We'll discuss these two important tasks in the rest of this module.
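As a quick sanity check, here is how the sketch above might be exercised on a handful of made-up loans mirroring the lecture's example; the toy data, the feature values, and the `predict` helper are all invented for illustration.

```python
def predict(tree, x):
    """Walk from the root down to a leaf and return its prediction."""
    while not tree['leaf']:
        tree = tree['children'][x[tree['feature']]]
    return tree['prediction']

# Made-up loans: features are credit and loan term, labels safe/risky.
loans = [
    ({'credit': 'excellent', 'term': '3yr'}, 'safe'),
    ({'credit': 'excellent', 'term': '5yr'}, 'safe'),
    ({'credit': 'fair',      'term': '3yr'}, 'safe'),
    ({'credit': 'fair',      'term': '5yr'}, 'risky'),
    ({'credit': 'poor',      'term': '3yr'}, 'risky'),
    ({'credit': 'poor',      'term': '5yr'}, 'risky'),
]

tree = build_tree(loans, ['credit', 'term'])
print(predict(tree, {'credit': 'excellent', 'term': '5yr'}))  # safe
```

Note how the excellent-credit branch becomes a leaf immediately (all its examples are safe), while the fair-credit branch recurses and splits again on the loan term.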