So now that we understand the structure of
optimal solutions for the optimal binary
search tree problem, that is, now that we
understand how an optimal solution must be
one of a relatively small number of
candidates, let's compile that
understanding into a polynomial-time
dynamic programming algorithm.
Let me quickly remind you of the Optimal 
Substructure Lemma that we proved in the 
previous video. 
Suppose we have an optimal binary search
tree for a given set of keys, 1 through
n, with given probabilities.
And suppose this binary search tree has
root r.
Then it has two subtrees, T1 and T2.
By the search tree property, we know
exactly the population of each of those
two subtrees.
T1 has to contain the keys 1 through
r - 1.
As usual, we're assuming the keys are in
sorted order.
The right subtree, T2, has to contain
exactly the keys r + 1 through n.
Moreover, T1 and T2 are, in their own
right, valid search trees for these two
sets of keys.
And finally, and this is what we proved
in the last video, they're optimal for
their respective subproblems.
T1 is optimal for keys 1 through r - 1
and the corresponding weights or
probabilities, and T2 is optimal for keys
r + 1 through n and their corresponding
frequencies.
So let's now execute our dynamic
programming recipe.
Now that we understand the way in which
an optimal solution must necessarily be
composed, in a simple way, from solutions
to smaller subproblems, let's take a step
back and ask: given that, at the end of
the day, we care about the optimal
solution to the original problem, which
subproblems are relevant?
Which subproblems are we going to be
forced to solve?
For example, with independent sets in 
line graphs we observed that to solve a 
subproblem we needed to know the answers 
to the subproblems where we pluck either 
one or two vertices off of the right hand 
side. 
So overall what we cared about was 
subproblems corresponding to prefixes of 
the graph. 
In the knapsack problem, we needed to
understand subproblems that involved one
fewer item and possibly a reduced
residual knapsack capacity. That led to
us caring about solutions to subproblems
corresponding to all prefixes of the
items and all integer possibilities for
the residual capacity of the knapsack.
In sequence alignment, when we looked at
subproblems, we were plucking a character
off of one or possibly both of the
strings.
So we cared about subproblems
corresponding to prefixes of each of the
two strings.
Now, here's one of the things that's
interesting about the binary search tree
problem, which we haven't seen before.
When we look at a subproblem in the
optimal substructure lemma, there are two
smaller subproblems that we might
consider.
We don't just pluck off from the right.
We care about both the subproblem induced
by the left subtree and that induced by
the right subtree.
In the first case, we're looking at a
prefix of the items we started with,
and that's like what we've seen in our
many examples.
But in the second case, the subproblem
corresponding to T2 is actually a suffix
of the items that we started with.
So put differently, the sub-problems we 
care about are those that can be obtained 
by either throwing away a prefix from the 
items that we started with or throwing 
away a suffix from the items that we 
started with. 
So in light of this observation, that the
value of an optimal solution depends
immediately only on subproblems that you
obtain by throwing out a prefix or a
suffix of the items, what I want you to
think about on this quiz is: what is the
entire set of relevant subproblems?
That is, for which subsets S of the
original items 1 through n is it
important that we compute the value of an
optimal binary search tree on only the
items in S?
So before I explain the correct answer 
which is the third one, let me talk a 
little bit about a very natural but 
incorrect answer, namely the second one. 
Indeed, the second answer seems to have 
the best correspondence to the optimal 
substructure lemma. 
The optimal substructure lemma states 
that the optimal solution must be 
composed of an optimal solution on some 
prefix and an optimal solution on some 
suffix, united under a common root r. 
So we definitely care about the solutions 
to all prefixes and suffixes of the items 
but we care about more than just that. 
So maybe the easiest way to see that is 
to think about the recursive application 
of the optimal substructure lemma. 
And again relevant subproblems at the end 
of the day are going to correspond to all 
of the different distinct subproblems 
that ever get solved over the entire 
trajectory of this recursive 
implementation. 
So, I mean just think about one sort of 
example path in the recursion tree, 
right? 
At the outermost level of the recursion
you've got the whole item set. Let's say
there are 100 items, 1 through 100, and
you're trying all possibilities for the
root.
So at some point you're trying out root
number 23 to see how it does.
You have to recurse once on items 1
through 22 to optimally build a search
tree for them, and similarly on items 24
through 100.
Now let's drill down into the first
recursive call, where you recurse on just
items 1 through 22.
Here again, you're going to be trying all
possibilities for the root, those 22
choices.
At some point you'll be trying root
number 17.
There are again going to be two recursive
calls, and the second recursive call is
going to be on items 18 through 22.
The items that were passed to this
recursive call were a prefix of the
original items.
And the second recursive call here is
locally going to be on some suffix of
that prefix: in this case, the items 18
through 22, a suffix of the original
prefix, 1 through 22.
So, in general, as you think through this
recursion over multiple levels, at every
step you're either deleting a chunk of
items from the beginning, a prefix, or
deleting a chunk of items from the end, a
suffix.
But you might be interleaving these two
operations.
So it is not true that you're always
going to have a prefix or a suffix of the
original set of items.
What is true is that you will have some
contiguous set of items.
If i is the smallest item in the
subproblem and j is the biggest, you're
going to have all of the items in
between.
And that's because you were only plucking
off items from the left or from the
right.
So that's why C is the correct answer. 
You need more subproblems than just 
prefixes and suffixes. 
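To see this concretely, here is a minimal Python sketch (not part of the lecture) that simulates the recursion on n = 8 items, trying every root at every level, and records every set of items that is ever passed to a recursive call. The names `reachable_subproblems` and `is_contiguous` are made up for illustration.

```python
def reachable_subproblems(n):
    """Collect every key set passed to a recursive call when all roots are tried."""
    seen = set()

    def recurse(keys):
        # Skip empty key sets and subproblems we have already expanded.
        if not keys or keys in seen:
            return
        seen.add(keys)
        for idx in range(len(keys)):
            recurse(keys[:idx])      # keys smaller than the chosen root
            recurse(keys[idx + 1:])  # keys larger than the chosen root

    recurse(tuple(range(1, n + 1)))
    return seen


def is_contiguous(keys):
    """True if the keys are exactly i, i+1, ..., j for some i <= j."""
    return list(keys) == list(range(keys[0], keys[-1] + 1))


n = 8
subproblems = reachable_subproblems(n)
all_contiguous = all(is_contiguous(s) for s in subproblems)
prefix_or_suffix = [s for s in subproblems if s[0] == 1 or s[-1] == n]
print(all_contiguous, len(subproblems), len(prefix_or_suffix))
```

For n = 8, every recorded subproblem is a contiguous interval and there are 36 = 8 * 9 / 2 of them, while only 15 of those are prefixes or suffixes of the original items.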
Alright, so that was a little tricky,
identifying the relevant subproblems. But
now that we've got them in our grubby
little hands, the dynamic programming
algorithm, as usual, is just going to
fall into place. The relevant collection
of subproblems unlocks the power of this
entire paradigm in a very mechanical way.
So let's now just fill in all the details.
The first step is to formalize the 
recurrence. 
That is, the way in which the optimal 
solution of a given subproblem depends on 
the value of smaller subproblems. 
This is just going to be a mathematical
formula which encodes what we already
proved in the optimal substructure
lemma.
And then we're going to use this formula 
to populate a table in a dynamic 
programming algorithm to solve, 
systematically, the values for all of the 
subproblems. 
So let's have some notation to put in our
recurrence, in our formula.
We're going to be indexing subproblems
with two indices, i and j. This is
because we have two degrees of freedom:
where the contiguous interval of items
starts, i, and where the contiguous
interval of items ends, j.
So for a given choice of i and j, where
of course i should be at most j,
I'm going to denote by capital C sub i, j
the weighted search cost of an optimal
binary search tree on just the contiguous
set of items from i to j.
And of course, the weights or the
probabilities are exactly the same as in
the original problem. They're just
inherited here, p sub i through p sub j.
So now let's state the recurrence.
For a given subproblem C sub i, j, we're
going to express the value of an optimal
binary search tree in terms of those of
smaller subproblems.
The optimal substructure lemma tells us
how to do this.
It says that if we knew the choice of the
root r, which here is going to be
somewhere between the items i and j, then
the optimal solution has to be composed
of optimal solutions to the two smaller
subproblems, united under the root.
Now, we don't know what the root is.
There are j - i + 1 possibilities.
It could be anything between i and j
inclusive.
So as usual, we're just going to do brute
force search over the relatively small
set of candidates that we've identified.
Brute force search we encode by just
explicitly having a minimum.
So I choose some root r somewhere between
i and j inclusive.
And given a choice of r, we're going to
inherit the weighted search cost of the
optimal solution on just the prefix of
items i through r - 1.
In our notation that would be C sub i,
r - 1.
Similarly, we pick up the weighted search
cost of an optimal solution to the suffix
of items r + 1 through j, that is,
C sub r + 1, j.
And if you go back to our proof of the
optimal substructure lemma, you'll see we
did a calculation which gives us a
formula for how the weighted search cost
of a tree depends on that of its
subtrees.
In addition to the weighted search cost
contributed by each of the two subtrees,
we pick up a constant, namely the sum of
all of the probabilities of the items
we're working with.
So here that's the sum of p sub k, where
k ranges from the first item in the
subproblem, i, to the last item in the
subproblem, j.
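Written out in symbols, the recurrence just described looks roughly like this. The first line restates the calculation from the optimal substructure proof, assuming the usual convention that searching for a key costs the number of nodes visited, so every key in the subtree pays one extra unit for passing through the root. The notation C(T) for the weighted search cost of a tree T is introduced here only for illustration.

```latex
% Cost of a tree T on keys i..j with root r, left subtree T_1, right subtree T_2:
C(T) = \sum_{k=i}^{j} p_k + C(T_1) + C(T_2)

% Minimizing over the j - i + 1 choices of root gives the recurrence:
C_{i,j} = \min_{r = i, \dots, j} \left( \sum_{k=i}^{j} p_k + C_{i,r-1} + C_{r+1,j} \right)
```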
One extra edge case we should deal with:
if we choose the root to be the first
item, i, then the first recursive term
doesn't make sense. We'd have C sub i,
i - 1, which is not defined.
Similarly, if we choose the root to be j,
then the last term would be C sub j + 1,
j, which is not defined.
Remember, indices are supposed to be in
order.
So in those cases, we'll just interpret
these capital C's as zero.
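As a rough illustration, the recurrence together with this edge-case convention can be evaluated directly by a memoized recursion. This is only a sketch in Python, not the table-filling algorithm itself. The function name `optimal_bst_cost`, the 1-based key numbering on top of a 0-based list, and the prefix-sum helper are all assumptions made for the example.

```python
from functools import lru_cache


def optimal_bst_cost(p):
    """Weighted search cost of an optimal BST; p[k - 1] is the weight of key k."""
    n = len(p)
    # prefix[k] = p_1 + ... + p_k, so the sum over keys i..j is prefix[j] - prefix[i - 1].
    prefix = [0.0]
    for w in p:
        prefix.append(prefix[-1] + w)

    @lru_cache(maxsize=None)
    def C(i, j):
        if i > j:
            # Empty interval, e.g. C(i, i - 1) or C(j + 1, j): interpreted as zero.
            return 0.0
        total = prefix[j] - prefix[i - 1]
        # Brute-force search over every candidate root r between i and j inclusive.
        return total + min(C(i, r - 1) + C(r + 1, j) for r in range(i, j + 1))

    return C(1, n)


# Three keys with made-up weights; the heavy first key ends up at the root.
print(optimal_bst_cost([0.8, 0.1, 0.1]))  # approximately 1.3
```

The memoization means each pair (i, j) is computed once, so the recursion touches only the contiguous-interval subproblems identified earlier.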
And so why is the recurrence correct?
Well, all of the heavy lifting was done
in our proof of the optimal substructure
lemma.
What did we prove there?
We proved the optimal solution has to be
one of just j - i + 1 possible things.
It depends only on the choice of the
root.
Given the root, the rest is determined
for us.
The recurrence is by definition doing
brute force search through the only set
of candidates.
So therefore, it is indeed a correct
formula for the optimal solution value
in terms of optimal solutions to smaller
subproblems.
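One way to sanity-check this claim on a small instance is to compare the recurrence against an exhaustive enumeration of every search tree. The Python sketch below does that. The seven weights are arbitrary made-up values, and the names `all_trees`, `search_cost`, and `recurrence_cost` are hypothetical helpers, with search cost assumed to count the nodes visited (the root has depth 1).

```python
from functools import lru_cache


def all_trees(i, j):
    """Yield every search tree on keys i..j as (root, left, right); None if empty."""
    if i > j:
        yield None
        return
    for r in range(i, j + 1):
        for left in all_trees(i, r - 1):
            for right in all_trees(r + 1, j):
                yield (r, left, right)


def search_cost(tree, p, depth=1):
    """Sum over keys of p_k times the number of nodes visited to find key k."""
    if tree is None:
        return 0.0
    r, left, right = tree
    return (p[r - 1] * depth
            + search_cost(left, p, depth + 1)
            + search_cost(right, p, depth + 1))


def recurrence_cost(p):
    """Evaluate the recurrence, treating out-of-order indices as zero."""
    @lru_cache(maxsize=None)
    def C(i, j):
        if i > j:
            return 0.0
        return sum(p[i - 1:j]) + min(C(i, r - 1) + C(r + 1, j)
                                     for r in range(i, j + 1))
    return C(1, len(p))


p = [0.05, 0.4, 0.08, 0.04, 0.1, 0.1, 0.23]  # illustrative weights only
brute = min(search_cost(t, p) for t in all_trees(1, len(p)))
print(brute, recurrence_cost(p))
```

The two printed values should agree up to floating-point rounding. The enumeration visits every tree on the seven keys, while the recurrence only ever looks at j - i + 1 candidate roots per subproblem.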