Now, decision trees can be used for binary classification, but they can also be used for multiclass classification with basically no modifications. So for example, let's say that in the same setting, we're given a loan application as input and we have a classifier that predicts an output, but it's not just whether the loans are safe or risky. Some loans we don't even want to think about: danger, danger, danger, don't touch them. So the question is, how do we make a decision about whether a loan is safe, risky, or dangerous? Let's see an example of multiclass classification here with a decision stump, but the same thing applies to a bigger decision tree. As input we're given a dataset; for example, we might have only one feature, credit, which can take three values: excellent, fair, and poor. The output here can be safe, risky, or danger. If you look at the root, you'll see that there are now three possible categories: safe loans, risky loans, and danger loans. If you split on credit, we'll see pretty naturally that some of these loans fall in the excellent bin, some fall in the fair bin, and some fall in the poor bin. For the ones that fall in the excellent bin, you see that the majority are safe, so we predict safe. For the fair bin, we see the majority are risky, so we predict risky. But for poor credit, the majority are dangerous, so we don't even touch those.

We discussed that we can make predictions based on the majority class for each one of these splits, but it turns out that with decision trees, if you want to predict the probability of a class, that's actually very simple as well. All we have to do is look at the fraction of the data associated with a particular label or value at each one of these leaves. So for example, in the case where credit was poor, we predicted that y hat was danger, but we can also ask for the probability of danger given the input x. In our case, if your input takes you down this branch, then 7 out of 11 of the data points are dangerous, so we'd say the probability of danger is about 0.64, while the probability of a safe loan is 3 out of 11. So there's a way for us to predict both class probabilities and the actual value of the outcome, and both are very useful in practice.
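To make these two ideas concrete, here is a minimal Python sketch, not from the course materials: a toy dataset where the poor-credit bin matches the 7 danger / 3 safe (and, by implication, 1 risky) counts described above, while the counts in the other bins are made up purely for illustration. The code groups the data by the stump's split feature, then computes each leaf's majority-class prediction and its class probabilities as fractions of the leaf's data.

```python
from collections import Counter

# Hypothetical toy dataset in the spirit of the lecture's example: one
# feature ("credit") and a three-class label (safe / risky / danger).
# Only the poor-credit counts (7 danger, 3 safe, 1 risky) follow the
# transcript; the excellent and fair counts are invented so that their
# majorities come out as safe and risky, respectively.
data = (
    [("excellent", "safe")] * 6 + [("excellent", "risky")] * 2 +
    [("fair", "risky")] * 5 + [("fair", "safe")] * 2 + [("fair", "danger")] * 1 +
    [("poor", "danger")] * 7 + [("poor", "safe")] * 3 + [("poor", "risky")] * 1
)

# Group labels by the value of the split feature: each group is one
# branch (leaf) of the decision stump.
bins = {}
for credit, label in data:
    bins.setdefault(credit, []).append(label)

# For each leaf: predict the majority class, and estimate
# P(class | leaf) as the fraction of the leaf's points with each label.
for credit, labels in bins.items():
    counts = Counter(labels)
    majority, _ = counts.most_common(1)[0]
    probs = {label: count / len(labels) for label, count in counts.items()}
    print(f"credit={credit}: predict {majority}, probabilities {probs}")
```

Running this prints, for the poor-credit leaf, a prediction of danger with probability 7/11 ≈ 0.64 and a safe probability of 3/11 ≈ 0.27, matching the numbers in the lecture. In a full decision tree the computation is the same, just applied to the data points that reach each leaf after all of the splits on the path from the root.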