1
00:00:01,040 --> 00:00:05,021
So let's take any old optimal binary 
search tree for keys one through n with 

2
00:00:05,021 --> 00:00:08,479
frequencies P1 through Pn. 
The thing we're trying to prove asserts 

3
00:00:08,479 --> 00:00:12,513
that the left subtree T1 should be 
optimal for its keys, one through r minus 

4
00:00:12,513 --> 00:00:16,704
one, and the right subtree T2 should be 
optimal for its keys, r plus one through 

5
00:00:16,704 --> 00:00:18,170
n. 
So we're going to proceed by 

6
00:00:18,170 --> 00:00:20,737
contradiction, we're going to assume 
that's not true. 

7
00:00:20,737 --> 00:00:23,724
What would that mean? 
That means for one of these two 

8
00:00:23,724 --> 00:00:28,439
subproblems, either one through R - 1 or 
R + 1 through n, there's an binary search 

9
00:00:28,439 --> 00:00:32,106
tree with even smaller weighted search 
cost, search time, then T1 or T2 

10
00:00:32,106 --> 00:00:34,620
respectively. 
The two cases are totally the same 

11
00:00:34,620 --> 00:00:37,138
whether. 
In our contradiction, we assume T1 is not 

12
00:00:37,138 --> 00:00:40,705
optimal or the T2 is not optimal. 
I'm just going to prove it in the case 

13
00:00:40,705 --> 00:00:45,436
the T1 is not optimal. 
So if T1 is not optimal, there's gotta be 

14
00:00:45,436 --> 00:00:49,937
a search tree on its keys, one through r 
- 1 which is better. 

15
00:00:49,937 --> 00:00:53,800
Call that purportedly better solution T 
star one. 

16
00:00:53,800 --> 00:00:58,095
Now as usual, the thing which we're going 
to try to contradict to get the final 

17
00:00:58,095 --> 00:01:02,391
proof is we're going to exhibit a search 
tree on all of the keys, one through n 

18
00:01:02,391 --> 00:01:06,291
which is even better than T. 
The T was supposed to be optimal, so that 

19
00:01:06,291 --> 00:01:10,078
would be a contradiction. 
So how do we get our superior search tree 

20
00:01:10,078 --> 00:01:13,074
for all of the keys? 
Well, we're just going to take T and 

21
00:01:13,074 --> 00:01:16,805
we're going to do cut and paste. 
We're going to do surgery on the tree T, 

22
00:01:16,805 --> 00:01:20,987
ripping out it's left subtree T1 and 
pasting in this subtree T star one. 

23
00:01:20,987 --> 00:01:26,107
Call the resulting tree T star. 
So, to complete the contradiction and 

24
00:01:26,107 --> 00:01:30,523
therefore the proof of the optimal 
substructure lemma, all we have to show 

25
00:01:30,523 --> 00:01:34,998
is that the weighted search cost of T 
star is strictly less than that of T, 

26
00:01:34,998 --> 00:01:38,101
that would contradict the purported 
optimality of T. 

27
00:01:38,101 --> 00:01:42,636
So that's precisely what I'll show you on 
this next slide and it's going to be 

28
00:01:42,636 --> 00:01:46,480
evident if we do a suitable calculation. 
Here it is. 

29
00:01:46,480 --> 00:01:50,828
So let's begin just by expanding the 
weighted search time of our original 

30
00:01:50,828 --> 00:01:53,354
tree, capital T, the purportedly optimal 
one. 

31
00:01:53,354 --> 00:01:58,896
Let's expand its definition. 
So you have one sum n for each of the 

32
00:01:58,896 --> 00:02:04,764
items i and the weight we give to a given 
item is just its frequency p sub i. 

33
00:02:04,764 --> 00:02:11,003
And we multiply that by the search time 
of for i n T So let me now pause to tell 

34
00:02:11,003 --> 00:02:13,980
you the point of this calculation we're 
about to do. 

35
00:02:13,980 --> 00:02:16,672
So we start with the weighted search time 
in T, 

36
00:02:16,672 --> 00:02:20,737
and that's, of course expressed in terms 
of search times in this tree T. 

37
00:02:20,737 --> 00:02:25,433
What I want to show next is that can 
equally be, be well expressed in terms of 

38
00:02:25,433 --> 00:02:29,727
search times in the subtrees T1 and T2. 
That is, there's a simple formula to 

39
00:02:29,727 --> 00:02:33,907
compute the search time in T if I told 
you the search times in T1 and T2. 

40
00:02:33,907 --> 00:02:38,482
That's what's going to allow us to easily 
analyze the ramifications of the cut and 

41
00:02:38,482 --> 00:02:42,880
paste and to notice that in cutting and 
pasting in a better tree for the 

42
00:02:42,880 --> 00:02:46,480
subproblem, we actually get a better tree 
for the original problem. 

43
00:02:48,080 --> 00:02:52,549
So the right way to search time in T in 
terms of the weighted search time in T1 

44
00:02:52,549 --> 00:02:56,790
and T2, it's going to be convenient to 
bucket the items into three categories. 

45
00:02:56,790 --> 00:02:59,197
Those that are in the left subtree T1, 
i.e. 

46
00:02:59,197 --> 00:03:03,552
one through r minus one, those in the 
right subtree T2, r plus one through n, 

47
00:03:03,552 --> 00:03:06,417
and then of course left over is the root 
r itself. 

48
00:03:06,417 --> 00:03:10,200
So let's just break down the sum into its 
three constituent parts. 

49
00:03:11,260 --> 00:03:16,358
So the sum and corresponding to the root 
r contributes the frequency of r times 

50
00:03:16,358 --> 00:03:19,864
the search time for r, namely one because 
it's the root. 

51
00:03:19,864 --> 00:03:26,217
And then we have our sum, over just the 
items up to r - 1 of their frequencies 

52
00:03:26,217 --> 00:03:30,075
times their search times and similarly 
those r plus one up to n, their 

53
00:03:30,075 --> 00:03:34,713
frequencies times their search times nT. 
So for the next step, let's observe it's 

54
00:03:34,713 --> 00:03:39,640
very easy to connect search times in T 
with search times in the subtrees T1 and 

55
00:03:39,640 --> 00:03:41,890
T2. 
Suppose you were, you, example, you were 

56
00:03:41,890 --> 00:03:44,506
searching for some key in T that was in 
T1. 

57
00:03:44,506 --> 00:03:48,825
So some key that was less than r. 
What happens when you search for this 

58
00:03:48,825 --> 00:03:51,380
item in the tree T? 
Well, first you visit r, 

59
00:03:51,380 --> 00:03:55,877
it's not r, it's less than r, so you go 
left, and then you just search as 

60
00:03:55,877 --> 00:04:01,008
appropriately through the search tree T1. 
That is, the search time in the big tree 

61
00:04:01,008 --> 00:04:05,949
T is simply the search time in the 
subtree T1 + 1, 1 because you had to 

62
00:04:05,949 --> 00:04:09,560
visit the root r before you made it in to 
the subtree T1. 

63
00:04:11,200 --> 00:04:16,868
So that is, for any object i other than 
the root, we can write its search time in 

64
00:04:16,868 --> 00:04:22,840
the big tree T as simply one plus the 
search time in the appropriate subtree. 

65
00:04:22,840 --> 00:04:28,060
So now, let me expand in groups and terms 
just to clean up this mess. 

66
00:04:29,740 --> 00:04:34,055
So now that the dust has settled, let's 
inspect each of these three sums. 

67
00:04:34,055 --> 00:04:37,772
We actually have a quite simple 
understanding of each of them. 

68
00:04:37,772 --> 00:04:42,507
So, the first sum is just the sum of the 
Pi. So for example, in the conical case 

69
00:04:42,507 --> 00:04:46,284
where we're dealing with actual 
probabilities, this is just one. 

70
00:04:46,284 --> 00:04:50,839
The point is, whatever the Pis are, this 
first sum is just a constant, meaning 

71
00:04:50,839 --> 00:04:54,016
it's independent of the tree T, with 
which we started. 

72
00:04:54,016 --> 00:04:57,733
What's the second sum? 
So it's the sum of the objects from one 

73
00:04:57,733 --> 00:05:00,310
to r minus one, the frequency of ithe 
times search4I time for i in the 

74
00:05:00,310 --> 00:05:03,772
searchtree T1. 
Well, we have a much better, shorter 

75
00:05:03,772 --> 00:05:07,852
nickname for that sum. 
It's the weighted search time of the 

76
00:05:07,852 --> 00:05:12,140
search tree T1, for the objects that 
contains one through r minus one. 

77
00:05:12,140 --> 00:05:17,623
Similarly this last sum, sum from i over 
r plus one to n of the frequency of i 

78
00:05:17,623 --> 00:05:23,177
times the search time in T2, that's just 
the weighted search time in the search 

79
00:05:23,177 --> 00:05:27,494
tree T2. 
So, we did this computation with the 

80
00:05:27,494 --> 00:05:30,321
purportedly optimal search tree capital T 
in mind. 

81
00:05:30,321 --> 00:05:33,203
But if you think about it, you look at 
this algebra. 

82
00:05:33,203 --> 00:05:35,691
This, it applies to any search tree you 
like. 

83
00:05:35,691 --> 00:05:40,213
For any search tree, the weighted search 
cost can be expressed as the sum of the 

84
00:05:40,213 --> 00:05:44,508
Pis plus the weighted search cost of the 
left subtree of the root plus the 

85
00:05:44,508 --> 00:05:47,617
weighted search cost of the right subtree 
of the root. 

86
00:05:47,617 --> 00:05:52,365
In particular, that's true for this tree 
T star we got by cutting and pasting the 

87
00:05:52,365 --> 00:05:55,480
better left subtree T star one and for 
T1. 

88
00:05:55,480 --> 00:06:00,459
So applying this reasoning to T star, we 
can right its weighted search cost as the 

89
00:06:00,459 --> 00:06:05,742
sum from micro one to n of all the Pis 
plus the weighted search cost of its left 

90
00:06:05,742 --> 00:06:09,507
subtree of its root, which remember by 
construction is T star one, 

91
00:06:09,507 --> 00:06:13,880
and then it has the same right sub tree 
as the tree T does, namely T2. 

92
00:06:15,540 --> 00:06:20,324
And the point of all this algebra is that 
we now see in a very simple way what are 

93
00:06:20,324 --> 00:06:24,899
the ramifications of cutting and pasting 
in this better left subtree, we get a 

94
00:06:24,899 --> 00:06:29,052
better tree for the original key set 
contradicting the purported optimality of 

95
00:06:29,052 --> 00:06:33,048
the tree T that we started with. 
That completes the proof of the optimal 

96
00:06:33,048 --> 00:06:35,520
substructure lemma for optimal binary 
search trees.