Now, decision trees can be used for binary classification, and they can also be used for multiclass classification with basically no modifications. So for example, let's say that in the same setting we're given a loan application as input, and we have a classifier that predicts an output, but it's not just whether the loans are safe or risky. Some loans I don't even want to think about: danger, danger, danger, don't touch them. So the question is, how do we decide whether a loan is safe, risky, or dangerous?

Let's look at an example of multiclass classification here. We'll use a decision stump, but the same idea applies to a bigger decision tree. As input we're given a dataset; for example, we might only have one feature, credit, which can take three values: excellent, fair, and poor. The output here can be safe, risky, or danger. If you look at the root, you'll see that there are now three possible categories: safe loans, risky loans, and danger loans. If you split on credit, we see, pretty naturally, that some of these loans fall in the excellent bin, some fall in the fair bin, and some fall in the poor bin. For the ones that fall in the excellent bin, you see that the majority here are safe, so we predict safe. For the fair bin, we see the majority is risky, so we predict risky. But for poor credit, the majority is dangerous, so we don't even touch those.

We discussed that we can make predictions based on the majority class for each one of these splits, but it turns out that with decision trees, if you want to predict the probability of a class, that's actually very simple as well. All we have to do is look at the fraction of the data associated with a particular label or value in each one of these leaves. So for example, for this case here where credit was poor, we predicted that y hat was danger, but we can also ask for the probability of danger given the input x. In our case, we'd say that if your input takes you down this branch, then 7 out of 11 of the data points are dangerous. So we'd say the probability of danger is 7/11, about 0.64, while, say, the probability of a safe loan is 3 out of 11.
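To make that leaf computation concrete, here is a minimal sketch in Python of a decision stump on a single categorical feature. The dataset is hypothetical: the poor-credit counts are chosen to match the 7-out-of-11 example above, and the single risky loan in that branch is an assumption made to fill out the 11 data points.

```python
from collections import Counter

# Hypothetical toy dataset of (credit, outcome) pairs. The "poor" branch is
# set up to match the lecture's example: 7 of 11 danger, 3 of 11 safe, and
# (an assumption) 1 of 11 risky.
data = (
    [("excellent", "safe")] * 9 + [("excellent", "risky")] * 2 +
    [("fair", "risky")] * 6 + [("fair", "safe")] * 3 +
    [("poor", "danger")] * 7 + [("poor", "safe")] * 3 + [("poor", "risky")] * 1
)

# A decision stump on the single feature "credit": group the data by feature
# value, so each value gets its own leaf.
leaves = {}
for credit, outcome in data:
    leaves.setdefault(credit, []).append(outcome)

# In each leaf, predict the majority class, and report class probabilities
# as the fraction of the leaf's data carrying each label.
for credit, outcomes in leaves.items():
    counts = Counter(outcomes)
    prediction, _ = counts.most_common(1)[0]
    probs = {label: n / len(outcomes) for label, n in counts.items()}
    print(credit, "->", prediction, probs)
    # poor -> danger {'danger': 0.636..., 'safe': 0.272..., 'risky': 0.0909...}
```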
So there's a way for us to predict both the probabilities and the actual value of the outcome, and both are very useful in practice.
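If you're working with an off-the-shelf library, both outputs are typically exposed directly. Here is a sketch using scikit-learn, which the lecture doesn't name, so treat the library choice and the toy feature encoding as illustrative assumptions:

```python
# Sketch with scikit-learn (an assumption; the lecture names no library).
# predict() returns the majority class of the leaf an input falls into;
# predict_proba() returns that leaf's class fractions.
from sklearn.tree import DecisionTreeClassifier

# Encode the categorical credit feature ordinally: poor=0, fair=1, excellent=2.
X = [[2]] * 11 + [[1]] * 9 + [[0]] * 11
y = (["safe"] * 9 + ["risky"] * 2 +                   # excellent
     ["risky"] * 6 + ["safe"] * 3 +                   # fair
     ["danger"] * 7 + ["safe"] * 3 + ["risky"] * 1)   # poor

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

print(tree.predict([[0]]))        # majority class for poor credit: 'danger'
print(tree.classes_)              # column order: ['danger' 'risky' 'safe']
print(tree.predict_proba([[0]]))  # leaf fractions, ~[0.64, 0.09, 0.27]
```

One design note: scikit-learn trees use binary splits, so max_depth=2 (rather than a true one-level stump) is what lets each of the three credit values land in its own leaf. That is an implementation detail of the library, not part of the lecture's multiway stump.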