1 00:00:00,500 --> 00:00:07,029 So now that we understand the structure of 2 00:00:07,029 --> 00:00:10,533 optimal solutions for this optimal binary search tree problem. 3 00:00:10,533 --> 00:00:14,714 That is, we understand how an optimal solution must be one of a relatively 4 00:00:14,714 --> 00:00:18,500 small number of candidates. Let's compile that understanding into a 5 00:00:18,500 --> 00:00:21,100 polynomial time dynamic programming algorithm. 6 00:00:22,600 --> 00:00:26,826 Let me quickly remind you of the Optimal Substructure Lemma that we proved in the 7 00:00:26,826 --> 00:00:31,077 previous video. Suppose we have an optimal binary search 8 00:00:31,077 --> 00:00:35,153 tree for a given set of keys, one through N, with given probabilities. 9 00:00:35,153 --> 00:00:38,197 And suppose this binary search tree has the root R. 10 00:00:38,197 --> 00:00:40,833 Well then it has two sub-trees, t1 and t2. 11 00:00:40,833 --> 00:00:44,922 By the search tree property, we know exactly the population of each of those 12 00:00:44,922 --> 00:00:47,881 two sub-trees. T1 has to contain the keys one through r 13 00:00:47,881 --> 00:00:50,409 - 1. As usual we're sorted, we're assuming 14 00:00:50,409 --> 00:00:53,798 these are in sorted order. Whereas the right sub-tree t2, has to 15 00:00:53,798 --> 00:00:58,210 contain exactly the keys r + 1 through n. Moreover, t1 and t2 are, in their own 16 00:00:58,210 --> 00:01:01,438 right, valid search trees for these two sets of keys. 17 00:01:01,438 --> 00:01:06,010 And finally, and this is what we proved in the last video they're optimal for 18 00:01:06,010 --> 00:01:09,615 their respective sub-problems. T1 is optimal for keys one through r 19 00:01:09,615 --> 00:01:12,090 minus one and the corresponding weights or. 20 00:01:12,090 --> 00:01:17,860 Abilities and T2 is optimal for R plus one through N and their corresponding 21 00:01:17,860 --> 00:01:20,923 frequencies. So let's now execute our dynamic 22 00:01:20,923 --> 00:01:24,154 programming recipe. So now that we understand the way in 23 00:01:24,154 --> 00:01:28,250 which an optimal solution must necessarily be composed in a simple way 24 00:01:28,250 --> 00:01:32,519 from solutions to smaller subproblems. Let's take a step back, and ask, well. 25 00:01:32,519 --> 00:01:36,846 Given that, at the end of the day, we care about the optimal solution to the 26 00:01:36,846 --> 00:01:39,615 original problem. Which subproblems are relevant? 27 00:01:39,615 --> 00:01:42,500 Which subproblems are we going to be forced to solve? 28 00:01:42,500 --> 00:01:46,484 For example, with independent sets in line graphs we observed that to solve a 29 00:01:46,484 --> 00:01:50,728 subproblem we needed to know the answers to the subproblems where we pluck either 30 00:01:50,728 --> 00:01:53,160 one or two vertices off of the right hand side. 31 00:01:53,160 --> 00:01:57,093 So overall what we cared about was subproblems corresponding to prefixes of 32 00:01:57,093 --> 00:01:59,525 the graph. In the knapsack problem we needed to 33 00:01:59,525 --> 00:02:03,665 understand subproblems that involved one less item and possibly a resus, reduced 34 00:02:03,665 --> 00:02:07,340 residual knapsack capacity, so that led to us caring about solutions to 35 00:02:07,340 --> 00:02:11,014 subproblems corresponding to all prefixes of the items and all integer 36 00:02:11,014 --> 00:02:13,808 possibilities for the residual capacity of a knapsack. 37 00:02:13,808 --> 00:02:16,500 In sequence alignment, when we looked at subproblems. 38 00:02:16,500 --> 00:02:20,196 As we were plucking a character off of one or possibly both of the strings. 39 00:02:20,196 --> 00:02:23,696 So we cared about subproblems corresponding to prefixes of each of the 40 00:02:23,696 --> 00:02:26,061 two strings. Now, here's one of the things that's 41 00:02:26,061 --> 00:02:29,906 interesting about the binary search tree problem which we haven't seen before. 42 00:02:29,906 --> 00:02:33,588 Is that, when we look at a subproblem. In the optimal structure lemma, there's 43 00:02:33,588 --> 00:02:36,732 two that we might consider. We don't just pluck off from the right. 44 00:02:36,732 --> 00:02:39,686 We care about both the subproblem induced by the left subtree. 45 00:02:39,686 --> 00:02:43,259 And that induced by the right subtree. In the first case, we're looking at a 46 00:02:43,259 --> 00:02:46,832 prefix of the items we started with. And that's like we've seen in our many 47 00:02:46,832 --> 00:02:49,119 examples. But in the second case, the sub problem 48 00:02:49,119 --> 00:02:52,314 corresponding to t sub two. That's actually a suffix of the items 49 00:02:52,314 --> 00:02:55,761 that we started with. So put differently, the sub-problems we 50 00:02:55,761 --> 00:03:00,587 care about are those that can be obtained by either throwing away a prefix from the 51 00:03:00,587 --> 00:03:04,953 items that we started with or throwing away a suffix from the items that we 52 00:03:04,953 --> 00:03:09,229 started with. So in light of this observation, that the 53 00:03:09,229 --> 00:03:13,695 value of an optimal solution depends only immediately on sub problems that you 54 00:03:13,695 --> 00:03:18,217 obtain by throwing out a prefix with a, or a suffix of the items, what I want you 55 00:03:18,217 --> 00:03:22,682 to think about on this quiz is, what is the entire set of relevant sub problems? 56 00:03:22,682 --> 00:03:26,639 That is, for which subsets s of the original items one through n is it 57 00:03:26,639 --> 00:03:31,274 important that we compute the value of an optimal binary search tree on the items 58 00:03:31,274 --> 00:03:36,298 only in s? So before I explain the correct answer 59 00:03:36,298 --> 00:03:40,000 which is the third one, let me talk a little bit about a very natural but 60 00:03:40,000 --> 00:03:43,956 incorrect answer, namely the second one. Indeed, the second answer seems to have 61 00:03:43,956 --> 00:03:46,898 the best correspondence to the optimal substructure lemma. 62 00:03:46,898 --> 00:03:50,549 The optimal substructure lemma states that the optimal solution must be 63 00:03:50,549 --> 00:03:54,556 composed of an optimal solution on some prefix and an optimal solution on some 64 00:03:54,556 --> 00:03:58,512 suffix, united under a common root r. So we definitely care about the solutions 65 00:03:58,512 --> 00:04:02,570 to all prefixes and suffixes of the items but we care about more than just that. 66 00:04:02,570 --> 00:04:06,208 So maybe the easiest way to see that is to think about the recursive application 67 00:04:06,208 --> 00:04:09,621 of the optimal substructure lemma. And again relevant subproblems at the end 68 00:04:09,621 --> 00:04:13,214 of the day are going to correspond to all of the different distinct subproblems 69 00:04:13,214 --> 00:04:16,178 that ever get solved over the entire trajectory of this recursive 70 00:04:16,178 --> 00:04:18,604 implementation. So, I mean just think about one sort of 71 00:04:18,604 --> 00:04:20,445 example path in the recursion tree, right? 72 00:04:20,445 --> 00:04:23,859 So in the outermost level recursion you've got the whole item set, let's say 73 00:04:23,859 --> 00:04:27,003 there's 100 items one through 100, you're going through and trying all 74 00:04:27,003 --> 00:04:29,967 possibilities of the root. So at some point you're trying out root 75 00:04:29,967 --> 00:04:32,976 number 23 to see how it does. You have to recurse once on items one 76 00:04:32,976 --> 00:04:36,832 through 22 to optimally build a search tree for them, and similarly for items 24 77 00:04:36,832 --> 00:04:39,507 through 100. Now let's sort of drill down into this 78 00:04:39,507 --> 00:04:43,021 first recursive call where you recurse on item just one through 22. 79 00:04:43,021 --> 00:04:47,112 Now here again, you're going to be trying all possibilities of the route, those 22 80 00:04:47,112 --> 00:04:49,472 choices. At some point you'll be trying route 81 00:04:49,472 --> 00:04:52,304 number seventeen. There's again going to be two recursive 82 00:04:52,304 --> 00:04:54,822 calls. And the second recursive call is going to 83 00:04:54,822 --> 00:04:58,284 be on items eighteen through 22. It's going to be the items that were 84 00:04:58,284 --> 00:05:01,853 passed through this recursive call. A prefix of the original items. 85 00:05:01,853 --> 00:05:06,214 And then the second recursive call here, locally is going to be on some suffix of 86 00:05:06,214 --> 00:05:08,842 that prefix. So in this case, the items eighteen 87 00:05:08,842 --> 00:05:11,470 through 22. A suffix of the original prefix, one 88 00:05:11,470 --> 00:05:14,321 through 22. So, in general, as you think through this 89 00:05:14,321 --> 00:05:18,012 recursion multiple levels. At every step, what you've got going for 90 00:05:18,012 --> 00:05:22,205 you is, you're either deleting a chunk of items from the beginning, a prefix. 91 00:05:22,205 --> 00:05:24,945 Or you're deleting a chunk of items from the end. 92 00:05:24,945 --> 00:05:27,796 But you might be interleaving these two operations. 93 00:05:27,796 --> 00:05:32,046 So it is not true that you're always going to have a prefix of a suffix of the 94 00:05:32,046 --> 00:05:33,500 original set of items. But. 95 00:05:33,500 --> 00:05:36,886 What is true is that you will have some contiguous set of items. 96 00:05:36,886 --> 00:05:39,637 It's going to be. If you, if you have I as your smallest 97 00:05:39,637 --> 00:05:43,658 item in the subproblem and J is the biggest, you're going to have all of the 98 00:05:43,658 --> 00:05:47,098 subproblems in between. And that's because you were only plucking 99 00:05:47,098 --> 00:05:49,320 off items from the left or from the right. 100 00:05:49,320 --> 00:05:53,235 So that's why C is the correct answer. You need more subproblems than just 101 00:05:53,235 --> 00:05:57,237 prefixes and suffixes. Alright, so that was a little tricky, 102 00:05:57,237 --> 00:06:01,320 identifying the relevant sub problems but now that we've got them in our grubby 103 00:06:01,320 --> 00:06:05,301 little hands the dynamic programming algorithm as usual is just going to fall 104 00:06:05,301 --> 00:06:09,537 in to place, the relevant collection of sub problems unlocks the power in a very 105 00:06:09,537 --> 00:06:13,620 mechanical way of its entire paradigm. So let's now just fill in all the details. 106 00:06:13,620 --> 00:06:15,949 The first step is to formalize the recurrence. 107 00:06:15,949 --> 00:06:19,949 That is, the way in which the optimal solution of a given subproblem depends on 108 00:06:19,949 --> 00:06:23,494 the value of smaller subproblems. This is just going to be a mathematical 109 00:06:23,494 --> 00:06:27,241 formula which encodes what we already proved in the optional substructure 110 00:06:27,241 --> 00:06:29,469 lemma. And then we're going to use this formula 111 00:06:29,469 --> 00:06:32,862 to populate a table in a dynamic programming algorithm to solve, 112 00:06:32,862 --> 00:06:35,546 systematically, the values for all of the subproblems. 113 00:06:35,546 --> 00:06:38,990 So let's have some notation to put in our recurrence, in our formula. 114 00:06:38,990 --> 00:06:43,142 We're going to be indexing sub-problems with two indices I and J and this is 115 00:06:43,142 --> 00:06:47,464 because we have two degrees of freedom where the continuous interval of item 116 00:06:47,464 --> 00:06:50,775 starts I, and where the continuous interval of items ends, J. 117 00:06:50,775 --> 00:06:54,647 So for a given choice I and J, where of course I should be the most J. 118 00:06:54,647 --> 00:06:58,912 I'm going to denote by capital C sub IJ, the weighted search cost of an optimal 119 00:06:58,912 --> 00:07:02,841 binary search tree just in the contiguous set of in, items from I to J. 120 00:07:02,841 --> 00:07:06,994 And of course, the weights or the probabilities are exactly the same as in 121 00:07:06,994 --> 00:07:10,530 the original problem they're just inherited here, PI through PJ. 122 00:07:10,530 --> 00:07:14,827 So now let's state the recurrence. So, for a given sub problem cij, we're 123 00:07:14,827 --> 00:07:19,549 going to express the value of an optimal binary search tree in terms of those of 124 00:07:19,549 --> 00:07:23,302 smaller sub problems. The optimal sub structure lemma tells us 125 00:07:23,302 --> 00:07:26,493 how to do this. The optimal substructure lemma says, that 126 00:07:26,493 --> 00:07:30,782 if we knew the roots, if we know the choice of the root r which here is going 127 00:07:30,782 --> 00:07:34,792 to be somewhere between the items I and j, then in that case, the optimal 128 00:07:34,792 --> 00:07:39,303 solution has to be composed of optimal solutions to the two smaller sub-problems 129 00:07:39,303 --> 00:07:42,478 united under the root. Now we don't know what the root is. 130 00:07:42,478 --> 00:07:46,432 There's a j - I + one possibilities. It could be anything between I and j 131 00:07:46,432 --> 00:07:49,272 inclusive. So as usual, we're just going to do brute 132 00:07:49,272 --> 00:07:53,672 force search over the relatively small set of candidates that we've identified. 133 00:07:53,672 --> 00:07:57,460 So brute force search we encode by just explicitly having a minimum. 134 00:07:57,460 --> 00:08:05,090 So I choose some route R somewhere between I and J inclusive. 135 00:08:05,090 --> 00:08:10,981 And given a choice of R we're going to inherit the weighted search cost of the 136 00:08:10,981 --> 00:08:15,977 optimal solution on just the prefix of items I through R minus one. 137 00:08:15,977 --> 00:08:19,585 So on our notation that would be C of I. R minus one. 138 00:08:19,585 --> 00:08:23,913 Similarly we pick up the weighted search cost of an optimal solution to the suffix 139 00:08:23,913 --> 00:08:27,510 of items R plus one through J. And if you go back to our proof of the 140 00:08:27,510 --> 00:08:31,473 optimal substructure lemma you'll see we did a calculation which gives us a 141 00:08:31,473 --> 00:08:35,591 formula for what, how the weighted search cost of a tree depends on that of its 142 00:08:35,591 --> 00:08:38,094 subtrees. And in addition to the weighted search 143 00:08:38,094 --> 00:08:41,952 cost contributed by each of the two search trees we pick up a constant, 144 00:08:41,952 --> 00:08:45,602 namely the sum of all of the probabilities in the items we're working 145 00:08:45,602 --> 00:08:47,010 with. So here that's sum of. 146 00:08:47,010 --> 00:08:51,151 P sub K, where K ranges from the first item in the sub problem I to the last 147 00:08:51,151 --> 00:08:54,638 item in the sub problem J. So one extra edge case we should deal 148 00:08:54,638 --> 00:08:59,051 with is if we choose the root to be the first item, then the first recursive term 149 00:08:59,051 --> 00:09:02,974 doesn't make sense, then we'll have C, I, I minus one, which is not defined. 150 00:09:02,974 --> 00:09:07,388 Similarly, if we choose the root to be J, then this last term would be C of J plus 151 00:09:07,388 --> 00:09:10,820 1J which is not defined. Remember indices are supposed to be in 152 00:09:10,820 --> 00:09:13,163 order. So in that case, we'll just interpret 153 00:09:13,163 --> 00:09:16,596 these capital C's as zero. And so why is the recurrence correct? 154 00:09:16,596 --> 00:09:21,010 Well all of the heavy lifting was done and our proof of the optimal substructure 155 00:09:21,010 --> 00:09:22,622 lemma. What did we prove there? 156 00:09:22,622 --> 00:09:26,704 We proved the optimal solution has to be one of just J minus I plus one possible 157 00:09:26,704 --> 00:09:28,922 things. It depends only on the choice of the 158 00:09:28,922 --> 00:09:31,089 root. Given the root, the rest is determined 159 00:09:31,089 --> 00:09:33,357 for us. The recurrency is by definition doing 160 00:09:33,357 --> 00:09:36,078 brute force search through the only set of candidates. 161 00:09:36,078 --> 00:09:39,858 So therefore, it is indeed a correct formula for the optimal solution value, 162 00:09:39,858 --> 00:09:42,580 in terms of optimal solutions to smaller sub problems.