[MUSIC] This next section will talk about how to pick the threshold split, for example income ≥ $60,000, for continuous-valued features. We're going to make this an optional session. It's not super complicated, but it is a little bit laborious. So if you're interested, definitely take a deep dive, but for those who want to skip it, it's totally okay. The goal here is to ask: if I decide to split on, say, income, how do I choose the splitting point, t star, in our case $60,000, that separates the data into the left and right sides of the tree? Now, there are infinitely many values that t star could take. It could be $60,000, $59,999.99, and if income is truly continuous it could go to infinitely many decimal places. The question is, do we need to consider all of those? Do all those decimal places really affect the quality of our decision tree? If you think about it, and you look at the values that income actually takes in the data, you'll see that if you take two consecutive values, say vA and vB, let's say $60,000 and $65,000, and there are no data points in between, then whether the split is at $61,000, $62,000, $63,000, or $64,000, you're still going to get the same classification error. The points on the left of that split are always going to be the same, and the points on the right of the split are also going to be the same. So all I have to do is consider the midpoint between any two consecutive data points, and treat just those as the possible splits for my data. And that's exactly what we're going to do. Let's now close the section by walking through the algorithm for picking the best splitting point for a particular feature. So let's say I'm considering splitting on a feature, hj, which might be, in our case, income. What I can do is go through all my data, so the column of values that income takes, and sort them, such that v1 is the lowest income, v2 is the next lowest, and vN is the highest income.
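The midpoint idea above can be sketched in a few lines of Python. This is just an illustrative sketch, not code from the course; the function name `candidate_thresholds` and the example incomes are made up for the demo. It shows that after sorting the distinct observed values, only the midpoints between consecutive values need to be considered as split candidates.

```python
def candidate_thresholds(values):
    """Return the midpoints between consecutive distinct sorted values.

    Any threshold strictly between two consecutive observed values
    produces the same left/right partition of the data, so the midpoint
    is a sufficient representative for that whole interval.
    """
    vs = sorted(set(values))
    return [(a + b) / 2.0 for a, b in zip(vs, vs[1:])]


# Hypothetical income column: only 3 candidate splits, not infinitely many.
incomes = [30000, 60000, 65000, 90000]
print(candidate_thresholds(incomes))  # [45000.0, 62500.0, 77500.0]
```

Notice that a dataset with N distinct values yields only N-1 candidates, no matter how many decimal places the feature could in principle take.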
And all I need to consider as splitting points are the midpoints right in between v1 and v2, v2 and v3, and so on. So I walk from i = 1 through N-1, consider the splitting point ti, which is the midpoint between vi and vi+1, and ask: what is the classification error if I build a decision tree, a decision stump in this case, that splits xj on greater than ti versus less than ti? So, income greater than $60,000 versus lower than $60,000. And then we pick t star to be the split that leads to the decision stump with the lowest classification error. And that's it. Pretty simple algorithm, pretty easy to take from here. [MUSIC]
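The full search just described can be sketched as follows. Again, this is a minimal illustration, not the course's own code: the helper names `stump_error` and `best_threshold`, the majority-vote tie-breaking, and the toy income/label data are all assumptions made for the example.

```python
def stump_error(xs, ys, t):
    """Classification error of a decision stump splitting at threshold t.

    Points with x < t go left, points with x >= t go right, and each
    side predicts its majority class.
    """
    left = [y for x, y in zip(xs, ys) if x < t]
    right = [y for x, y in zip(xs, ys) if x >= t]
    mistakes = 0
    for side in (left, right):
        if side:
            majority = max(set(side), key=side.count)
            mistakes += sum(1 for y in side if y != majority)
    return mistakes / len(ys)


def best_threshold(xs, ys):
    """Walk the N-1 midpoints between consecutive sorted values and
    return (t_star, error) for the stump with lowest classification error."""
    vs = sorted(set(xs))
    candidates = [(a + b) / 2.0 for a, b in zip(vs, vs[1:])]
    return min(((t, stump_error(xs, ys, t)) for t in candidates),
               key=lambda pair: pair[1])


# Toy data: low incomes labeled 'risky', higher incomes 'safe'.
incomes = [30000, 45000, 60000, 80000, 95000]
labels = ['risky', 'risky', 'safe', 'safe', 'safe']
t_star, err = best_threshold(incomes, labels)
print(t_star, err)  # 52500.0 0.0
```

In a full tree-learning algorithm you would run this search for every continuous feature and compare the resulting errors against the best categorical splits, but the per-feature logic is exactly the loop over midpoints shown here.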