So now that we've done some work with our thought experiment, understanding exactly what the optimal solution, the maximum-weight independent set, has to look like in a path graph, let's turn that work into a linear-time algorithm.

Let me quickly remind you of the bottom line from the previous video. We argued two things. First of all, if the max-weight independent set of a path graph happens to exclude the rightmost vertex, then it has to in fact be a max-weight independent set of the smaller graph G prime, obtained from G by plucking off the rightmost vertex. If, on the other hand, a max-weight independent set does include the rightmost vertex v sub n, then if you take out v sub n and look at the remaining part of the independent set, that has to be a max-weight independent set of the smaller graph G double prime, defined by plucking the two rightmost vertices off of the original graph G. Ergo, if we happened to know, if a little birdie told us, which of these two cases we were in, we could recursively compute the optimal solution of either G prime or G double prime as appropriate and be done. If we recurse on G prime, we just return the result. If we recurse on G double prime, we augment it by v sub n and return the result. Now, there is no little birdie; we don't know which of the two cases we're in.
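One way to summarize this case analysis is as a single recurrence. The notation here is my own, not used verbatim in the lecture: write W_i for the total weight of a max-weight independent set of the graph induced by the first i vertices, and w_i for the weight of vertex v sub i.

```latex
W_0 = 0, \qquad W_1 = w_1, \qquad
W_i = \max\{\, W_{i-1},\; W_{i-2} + w_i \,\} \quad \text{for } i \ge 2.
```

The first argument of the max is the "exclude the rightmost vertex" case (recurse on G prime); the second is the "include the rightmost vertex" case (recurse on G double prime).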
So we concluded the previous video by proposing to just try both cases. Let's write down what that proposed algorithm would look like before we take a step back and critique it.

So, the good news about this algorithm is that it is correct: it's guaranteed to return the maximum-weight independent set. How would you prove this formally? Well, it would be by induction, in exactly the same way we proceeded with divide-and-conquer algorithms, and for the same reason I'm not going to talk about the details here. If you're interested, I'll leave it as an optional exercise to write this out formally. But intuitively, the inductive hypothesis guarantees the correctness of our recursive calls, that is, of computing the maximum-weight solution in either G prime or G double prime. And then the whole point of the previous video, in effect, was arguing the correctness of the inductive step: given the correct solutions to the sub-problems, we argued what has to be the optimal way to extend them to a solution to the original graph G.

The bad news, on the other hand, is that this algorithm takes exponential time. It's essentially no better than brute-force search. So while the correctness of this kind of recursive algorithm follows the template of divide and conquer pretty much exactly, the running time has blown up to exponential.
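The proposed algorithm itself appears on a slide rather than in the audio. A minimal sketch of the try-both-cases recursion, with hypothetical names of my own choosing and the path graph represented simply by its list of vertex weights, might look like this:

```python
def naive_mwis_value(weights):
    """Total weight of a max-weight independent set of a path graph,
    given as a list of (non-negative) vertex weights.
    Correct, but exponential-time."""
    if not weights:                 # empty graph: empty set, weight 0
        return 0
    if len(weights) == 1:           # single vertex: take it, since weights >= 0
        return weights[0]
    # Case 1: the optimal solution excludes the rightmost vertex -> recurse on G'
    case1 = naive_mwis_value(weights[:-1])
    # Case 2: the optimal solution includes the rightmost vertex -> recurse on G''
    case2 = naive_mwis_value(weights[:-2]) + weights[-1]
    # No little birdie: try both cases and keep the better one
    return max(case1, case2)
```

For example, on the four-vertex path with weights 1, 4, 5, 4, the best independent set takes the second and fourth vertices, for total weight 8, and `naive_mwis_value([1, 4, 5, 4])` returns 8.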
And the reason for that difference is that in our divide-and-conquer algorithms, think of MergeSort as a canonical example, we made a ton of progress before we recursed. Right? We threw out half of the array, 50% of the stuff, before we bothered with any recursion. How much progress are we making in this algorithm? Well, very little. It's positive progress, but very small: we throw out either one or two vertices of a graph with, say, a million vertices before recursing. So we're branching by a factor of two and making very little progress before each branch. That's why we get this exponential running time, rather than something more in the neighborhood of n log n.

So this brings us to the following question. This is an important question; I want you to think about it carefully before you respond. We have this exponential-time algorithm; it makes an exponential number of recursive calls, and each recursive call is handed some graph for which it's responsible for computing the maximum-weight independent set. The question is this: across these exponentially many recursive calls, each of which is passed some graph as a sub-problem, how many distinct, how many fundamentally different, sub-problems are ever considered?
So the answer to this question, and the key that unlocks the crazy efficiency of our dynamic programming implementation, is B. Despite the fact that there's an exponential number of recursive calls, we only ever solve a linear number of distinct sub-problems. In fact, we can explicitly say what the different sub-problems are that could get solved throughout the recursion. What happens before you recurse? Well, you pluck vertices off the right end of the graph you were given. Maybe you pluck off one vertex, that's in the first recursive call; or, in the second recursive call, you pluck off two vertices, but again from the right. So an invariant maintained throughout the recursion is that the sub-problem you're handed was obtained by successively plucking off vertices from the right end of the graph. So however you got to where you are, whatever the sequence of recursive calls led to where you are now, you are guaranteed to be handed a prefix of the graph: the graph induced by the first i vertices, where i is some number between zero and n. Therefore, there's only a linear number of sub-problems you ever have to worry about: the prefixes of the original input graph.
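To make this concrete, here is a hypothetical instrumentation of the recursion, again with names of my own. Since every sub-problem is a prefix, it can be identified by its length i, and the recursion on i calls itself on i-1 and i-2. Counting total calls versus distinct sub-problem sizes exhibits exponentially many calls but only n+1 distinct sub-problems:

```python
def count_calls(n):
    """For a path with n vertices, count the total number of recursive
    calls made by the naive algorithm, and the number of distinct
    sub-problems seen.  A sub-problem is a prefix, identified by its length i."""
    total = 0
    seen = set()

    def recurse(i):
        nonlocal total
        total += 1
        seen.add(i)
        if i >= 2:              # base cases i = 0, 1 make no further calls
            recurse(i - 1)      # exclude the rightmost vertex: recurse on G'
            recurse(i - 2)      # include the rightmost vertex: recurse on G''

    recurse(n)
    return total, len(seen)
```

For instance, `count_calls(20)` reports 21,891 recursive calls (the call count satisfies the Fibonacci-like recurrence T(i) = 1 + T(i-1) + T(i-2)) but only 21 distinct sub-problems, the prefixes of lengths 0 through 20.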
From this, we conclude that the exponential running time of the previous algorithm arises solely from the spectacular redundancy of solving exactly the same sub-problems from scratch, over and over and over again. So this observation offers up the promise of a linear-time implementation of this algorithm. After all, there's no need to solve a sub-problem more than once; once you've solved it, you know the answer. So an obvious way to speed up this algorithm, to speed it up dramatically, is to simply cache the result of a sub-problem the first time you solve it. Then you can look it up in some array, in constant time, at any point later in the algorithm. There is a word for this, I won't really use it in this class, but just so that you know what it is: it's called memoization.

So in case this is a little vague, what I mean is you would have some global array, indexed one to n, or maybe zero to n, with the semantics that the ith entry of this array is the value of the solution to the ith sub-problem. Now, when you do a recursive call and you're handed the first i vertices of the graph, and remember, again, we know that the sub-problem has to look like the first i vertices of the graph for some i, you check the array. If it's already been filled in, if you already know the answer, great.
You just return it in constant time; you don't bother to re-solve it. If this is the first time you've ever seen this sub-problem, then you recurse and solve it, as we suggested on the previous slide.

So with this simple memoization fix, this algorithm is in fact linear time. The easiest way to see that, and actually a better implementation, is to move away from this recursive, top-down paradigm and instead implement the solution in a bottom-up way: solving sub-problems in a principled way from the smallest one up to the original one, the biggest. So, a little more precisely, let me use the notation G sub i to denote the subgraph of the original graph consisting of the first i vertices. We're again going to have an array with the same semantics as in the recursive version: the ith entry denotes the value of the solution to the ith sub-problem, that is, of the max-weight independent set of G sub i, and the plan is to populate it bottom-up, that is, from left to right. We'll handle the edge cases of the first two entries of this array specially. G sub zero is the empty graph, so there's no independent set; let's define the max weight of an independent set of the empty graph to be zero.
And in the graph G sub one, where the only vertex is v sub one, the max-weight independent set is obviously that one vertex; remember, weights are non-negative. So our main for loop just builds up solutions to all of the sub-problems in a systematic way, going from the smallest graph, G sub two, up to the biggest graph, the original one, G sub n. And when we consider the graph G sub i, how do we figure out what the max-weight independent set of that graph is? Well, now we use, completely directly, the work that we put in in the previous video. The previous video told us what an optimal solution has to look like, and it has to have one of two forms. We know, we proved, that either the max-weight independent set of G sub i excludes the last vertex v sub i, and is then merely an optimal solution of the graph G sub i minus one; or, if it's not that, there is nothing else it can be other than including the last vertex v sub i and, beyond that, being a max-weight independent set of the graph G sub i minus two. We know it's one of those two things; we just don't know which. So we do brute-force search among the two possibilities, and that gives us the optimal solution to the ith sub-problem. Crucially, when we need to do this brute-force search for the ith sub-problem, we already know,
we've already computed, the optimal solutions to the smaller sub-problems that are relevant. Those can be looked up in constant time, and that's what makes each iteration of this for loop run in constant time.

So we've done a fair amount of work to get to this point. But after seeing that the greedy algorithm design paradigm failed us, the divide-and-conquer algorithm design paradigm was inadequate, and brute-force search is too slow, with this, as we'll see, dynamic programming algorithm, we now have effectively a one-line solution to the max-weight independent set problem in path graphs. Pretty cool. What's the running time? Well, this is probably the easiest running-time analysis we've ever had to do. Obviously, it's linear time: constant time per iteration of the for loop. Why is the algorithm correct? Well, it's the same as our recursive algorithm; it makes exactly the same decisions. The only difference is that it doesn't bother with the spectacular redundancy of re-solving sub-problems it's already solved. Again, if you wanted to prove it from scratch, it would just be a straightforward induction: as in our divide-and-conquer algorithms, the recursive calls, here the smaller sub-problems, are correct by the inductive hypothesis.
The inductive step is justified by the case analysis of the previous video.
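The bottom-up algorithm described above can be sketched as follows. This is a minimal sketch with names of my own, with the array A playing the role of the global array of sub-problem values:

```python
def mwis_value(weights):
    """Total weight of a max-weight independent set of a path graph,
    computed bottom-up in linear time.  weights[i-1] is the
    (non-negative) weight of vertex v_i."""
    n = len(weights)
    A = [0] * (n + 1)               # A[i] = value of the MWIS of G_i (first i vertices)
    # Edge cases: A[0] = 0 (empty graph); A[1] = weight of the single vertex v_1
    if n >= 1:
        A[1] = weights[0]
    for i in range(2, n + 1):
        # Case 1: v_i excluded -> optimal solution of G_{i-1}
        # Case 2: v_i included -> w_i plus optimal solution of G_{i-2}
        A[i] = max(A[i - 1], A[i - 2] + weights[i - 1])
    return A[n]
```

Each iteration of the for loop does one comparison and two constant-time array lookups, which is where the linear running time comes from. On the weights 1, 4, 5, 4 it returns 8, agreeing with the exponential-time recursion, since both make exactly the same case decisions.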