Now, decision trees can be used for binary classification, and they can also be used for multiclass classification with basically no modifications. So for example, let's say that in the same setting we're given a loan application as input, and we have a classifier that predicts an output, but it's not just whether the loans are safe or risky. Some loans I don't even want to think about: danger, danger, danger, don't touch them. So the question is, how do we decide whether a loan is safe, risky, or dangerous?

Let's look at an example of multiclass classification here. We'll use a decision stump, but the same idea applies to a bigger decision tree. As input we're given a dataset; for example, we might only have one feature, credit, which can take three values: excellent, fair, and poor. The output here can be safe, risky, or danger. If you look at the root, you'll see that there are now three possible categories: safe loans, risky loans, and danger loans. If you split on credit, we see, pretty naturally, that some of these loans fall in the excellent bin, some fall in the fair bin, and some fall in the poor bin. For the ones that fall in the excellent bin, you see that the majority here are safe, so we predict safe. For the fair bin, we see the majority is risky, so we predict risky. But for poor credit, the majority is dangerous, so we don't even touch those.

We discussed that we can make predictions based on the majority class for each one of these splits, but it turns out that with decision trees, if you want to predict the probability of a class, that's actually very simple as well. All we have to do is look at the fraction of the data associated with a particular label or value in each one of these leaves. So for example, for this case here where credit was poor, we predicted that y hat was danger, but we can also ask for the probability of danger given the input x. In our case, we'd say that if your input takes you down this branch, then 7 out of 11 of the data points are dangerous. So we'd say the probability of danger is 7/11, about 0.64, while, say, the probability of a safe loan is 3 out of 11.
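To make that leaf computation concrete, here is a minimal sketch in Python of a decision stump on a single categorical feature. The dataset is hypothetical: the poor-credit counts are chosen to match the 7-out-of-11 example above, and the single risky loan in that branch is an assumption made to fill out the 11 data points.

```python
from collections import Counter

# Hypothetical toy dataset of (credit, outcome) pairs. The "poor" branch is
# set up to match the lecture's example: 7 of 11 danger, 3 of 11 safe, and
# (an assumption) 1 of 11 risky.
data = (
    [("excellent", "safe")] * 9 + [("excellent", "risky")] * 2 +
    [("fair", "risky")] * 6 + [("fair", "safe")] * 3 +
    [("poor", "danger")] * 7 + [("poor", "safe")] * 3 + [("poor", "risky")] * 1
)

# A decision stump on the single feature "credit": group the data by feature
# value, so each value gets its own leaf.
leaves = {}
for credit, outcome in data:
    leaves.setdefault(credit, []).append(outcome)

# In each leaf, predict the majority class, and report class probabilities
# as the fraction of the leaf's data carrying each label.
for credit, outcomes in leaves.items():
    counts = Counter(outcomes)
    prediction, _ = counts.most_common(1)[0]
    probs = {label: n / len(outcomes) for label, n in counts.items()}
    print(credit, "->", prediction, probs)
    # poor -> danger {'danger': 0.636..., 'safe': 0.272..., 'risky': 0.0909...}
```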
So there's a way for us to predict both the probabilities and the actual value of the outcome, and both are very useful in practice.
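If you're working with an off-the-shelf library, both outputs are typically exposed directly. Here is a sketch using scikit-learn, which the lecture doesn't name, so treat the library choice and the toy feature encoding as illustrative assumptions:

```python
# Sketch with scikit-learn (an assumption; the lecture names no library).
# predict() returns the majority class of the leaf an input falls into;
# predict_proba() returns that leaf's class fractions.
from sklearn.tree import DecisionTreeClassifier

# Encode the categorical credit feature ordinally: poor=0, fair=1, excellent=2.
X = [[2]] * 11 + [[1]] * 9 + [[0]] * 11
y = (["safe"] * 9 + ["risky"] * 2 +                   # excellent
     ["risky"] * 6 + ["safe"] * 3 +                   # fair
     ["danger"] * 7 + ["safe"] * 3 + ["risky"] * 1)   # poor

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

print(tree.predict([[0]]))        # majority class for poor credit: 'danger'
print(tree.classes_)              # column order: ['danger' 'risky' 'safe']
print(tree.predict_proba([[0]]))  # leaf fractions, ~[0.64, 0.09, 0.27]
```

One design note: scikit-learn trees use binary splits, so max_depth=2 (rather than a true one-level stump) is what lets each of the three credit values land in its own leaf. That is an implementation detail of the library, not part of the lecture's multiway stump.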