Today, we're going to embark on the discussion of a new algorithm design paradigm, namely that of designing and analyzing greedy algorithms. To put this study of greedy algorithms in a little bit of context, let's zoom out: let's both review some of the algorithm design paradigms that we've already seen, and look forward to some that we're going to learn later in this course.

It's sort of a sad fact of life that in algorithm design there's no one silver bullet, no magic potion that's the cure for all your computational problems. So instead, the best we can do, and the focus of these courses, is to discuss general techniques that apply to lots of different problems arising in lots of different domains. That's what I mean by algorithm design paradigms: high-level problem-solving strategies that cut across multiple applications.

So let's look at some examples. Back in Part One, we started with the divide-and-conquer algorithm design paradigm, a canonical example of that paradigm being the merge sort algorithm. Remember, in divide and conquer what you do is take your problem, break it into smaller subproblems, solve the subproblems recursively, and then combine the results into a solution for the original problem, like how in merge sort you recursively sort two subarrays and then merge the results to get a sorted version of the original input array.
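To make that split-recurse-combine pattern concrete, here is a minimal sketch of merge sort in Python. The code is not from the lecture itself; it's just one straightforward way to write down the structure described above.

```python
def merge_sort(a):
    """Divide and conquer: split the array, sort each half recursively,
    then combine the two sorted halves by merging."""
    if len(a) <= 1:
        return a  # base case: an array of size 0 or 1 is already sorted
    mid = len(a) // 2
    left = merge_sort(a[:mid])    # solve the two subproblems recursively
    right = merge_sort(a[mid:])
    merged, i, j = [], 0, 0       # combine step: merge the sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```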
Another paradigm that we touched on in Part One, although we didn't discuss it anywhere near as thoroughly, is that of randomized algorithms: the idea that you could have code flip coins, that is, make random choices, inside the code itself. Often this leads to simpler, more practical, or more elegant algorithms. A canonical application here is the quicksort algorithm using a random pivot element, but we also saw applications, for example, to the design of hash functions.

So the next major paradigm we're going to discuss is that of greedy algorithms. These are algorithms that iteratively make myopic decisions. In fact, we've already seen an example of a greedy algorithm in Part One, namely Dijkstra's shortest-path algorithm. And then the final paradigm we're going to discuss in this class is that of dynamic programming, a very powerful paradigm which solves, in particular, two of the motivating questions we saw earlier, namely sequence alignment and distributed shortest paths.

So what is a greedy algorithm, anyway? Well, to be honest, I'm not going to offer you a formal definition; in fact, much blood and ink has been spilled over what precisely counts as a greedy algorithm. But I'll give you a sort of informal description, a rule of thumb for what greedy algorithms usually look like. Generally speaking, what a greedy algorithm does is make a sequence of decisions, with each decision made myopically. That is, it seems like a good idea at the time, and then you hope that everything works out in the end.

The best way to get a feel for greedy algorithms is to see examples, and the upcoming lectures will give you a number of them. But I want to point out that we've actually already seen an example of a greedy algorithm in Part One of this course, namely Dijkstra's shortest-path algorithm.

So in what sense is Dijkstra's algorithm a greedy algorithm? Well, if you recall the pseudocode for Dijkstra's algorithm, you'll recall there's one main while loop, and the algorithm processes exactly one new destination vertex in each iteration of this while loop, so there are exactly n - 1 iterations overall, where n is the number of vertices. So the algorithm only gets one shot to compute the shortest path to a given destination; it never goes back and revisits the decision. In that sense the decisions are myopic and irrevocable, and that's the sense in which Dijkstra's algorithm is greedy.
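Here is a sketch of that structure in Python, a plain quadratic-time version rather than the heap-based one from Part One. The dictionary-based graph representation is my own choice for illustration, and every vertex is assumed to appear as a key in it.

```python
def dijkstra(graph, source):
    """graph: dict mapping each vertex to a list of (neighbor, length) pairs.
    Returns the shortest-path distances the algorithm commits to."""
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    processed = set()
    # One main while loop; each iteration processes exactly one new
    # destination vertex, so n iterations in all (n - 1 after the source,
    # which Part One's pseudocode processes before the loop starts).
    while len(processed) < len(graph):
        # Myopic step: commit to the unprocessed vertex that looks
        # closest given only the information available right now.
        v = min((u for u in graph if u not in processed),
                key=lambda u: dist[u])
        processed.add(v)  # irrevocable: dist[v] is never revisited
        for w, length in graph[v]:
            if w not in processed:  # processed distances stay frozen
                dist[w] = min(dist[w], dist[v] + length)
    return dist
```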
So let me pause for a moment to discuss the greedy algorithm design paradigm generally. Probably this discussion will seem a little abstract, so I recommend you revisit it after we've seen a few examples; at that point I think it will really hit home. Let me proceed by comparing it and contrasting it to the paradigm we've already studied in depth, that of divide-and-conquer algorithms. You'll recall that in a divide-and-conquer algorithm, what you do is break the problem into subproblems (maybe you take an input array and split it into two subarrays), then you solve the smaller subproblems recursively, and then you combine the results of the subproblems into a solution to the original input.

So the greedy paradigm is quite different in several respects. First, both a strength and a weakness of the greedy algorithm design paradigm is just how easy it is to apply. It's often quite easy to come up with plausible greedy algorithms for a problem, even multiple different plausible greedy algorithms. I think that's a point of contrast with divide-and-conquer algorithms: often it's tricky to come up with a plausible divide-and-conquer algorithm, and usually you have this eureka moment where you finally figure out how to decompose the problem in the right way. Once you have the eureka moment, you're good to go.

Secondly, I'm happy to report that analyzing the running time of greedy algorithms will generally be much easier than it was with divide-and-conquer algorithms. For divide-and-conquer algorithms, it was really unclear whether they were fast or slow, because we had to understand the running time over multiple levels of recursion: on the one hand, problem sizes were getting smaller, but on the other hand, the number of subproblems was proliferating. So we had to work hard; we developed powerful tools like the master method, and some other techniques, for figuring out just how fast an algorithm like merge sort runs, or just how fast an algorithm like Strassen's fast matrix multiplication algorithm runs. In contrast, with greedy algorithms it will often be a one-liner. Often it will be clear that the work is dominated by, say, a sorting subroutine, and of course we all know that sorting takes n log n time if you use a sensible algorithm for it.
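As a concrete illustration of that running-time pattern (the problem here, selecting non-overlapping intervals, is a classic greedy example and not one from this lecture), consider a greedy algorithm that sorts once and then makes a single linear pass, so the sort dominates the total cost:

```python
def pick_nonoverlapping(intervals):
    """Greedily select pairwise-disjoint (start, finish) intervals.
    One sort plus one linear scan: O(n log n) overall."""
    chosen = []
    last_finish = float('-inf')
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:   # myopic check against the last choice
            chosen.append((start, finish))
            last_finish = finish
    return chosen

print(pick_nonoverlapping([(1, 4), (3, 5), (0, 6), (5, 7), (3, 8)]))
# [(1, 4), (5, 7)]
```

Whether a greedy rule like this one is actually correct is a separate question entirely, and that brings us to the third point of comparison.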
Now the catch, and this is the third point of comparison, is that we're generally going to have to work much harder to understand the correctness of greedy algorithms. For divide-and-conquer algorithms, we didn't talk much about correctness; it was generally a pretty straightforward induction proof. You can review the lectures on quicksort if you want an example of one of those canonical inductive correctness proofs. But the game totally changes with greedy algorithms. In fact, given a greedy algorithm, we often won't even have very good intuition for whether or not it is correct, let alone how to prove it correct. So even with a correct algorithm, it's often hard to figure out why it's correct.

And in fact, if you remember only one thing from all of this greedy algorithm discussion many years from now, I hope one key thing you remember is that greedy algorithms are often not correct. Especially if it's one you proposed yourself, in whose favor you're very biased, you will think the greedy algorithm must be correct because it's so natural. But many of them are not, so keep that in mind.

So to give you some immediate practice with the ubiquitous incorrectness of natural greedy algorithms, let's review a point that we already covered in Part One of this class concerning Dijkstra's algorithm. Now, in Part One we made a big deal of what a justly famous algorithm Dijkstra's shortest-path algorithm is: it runs blazingly fast and it computes all the shortest paths. What else do you want? Well, remember there was an assumption when we proved that Dijkstra's algorithm is correct: we assumed that every edge of the given network has a nonnegative length. We did not allow negative edge lengths. And as we discussed in Part One, for many applications you only care about nonnegative edge lengths, but there are applications where you do want negative edge lengths.

So let's review on this quiz why Dijkstra's algorithm is actually incorrect, despite being so natural, when edges can have negative lengths. I've drawn in green a very simple shortest-path network with three edges, and I've annotated the edges with their lengths. You'll notice one of those edges has a negative length: the edge from V to W, with length -2. So consider the source vertex S and the destination vertex W. The question is: what is the shortest-path distance computed by Dijkstra's algorithm (you may have to go review the pseudocode from Part One, or on the web, to answer that part of the question), and what is in fact the actual shortest-path distance from S to W, where as usual the length of a path is just the sum of the lengths of the edges in the path?
All right, so the correct answer is D. Let's start with the second part of the question: what is the actual length of a shortest path from S to W? Well, there are only two paths in the graph: the one straight from S to W, which has length 2, and the one that goes via the intermediate vertex V, which has length 3 + (-2) = 1, and that is shorter. So S, V, W is the shortest path, and it has length 1.

Why is Dijkstra's algorithm incorrect? Well, if you go back to the pseudocode, you'll see that in the very first iteration it will greedily find the closest vertex to S; in this case that's W, since W is closer than V. It will greedily compute the shortest-path distance to W knowing only the information it has right now, and all it knows is the one-hop path from S to W, so it will irrevocably commit to the shortest-path distance from S to W being 2, never reconsidering that decision later. So Dijkstra's algorithm will terminate with the incorrect output that the shortest-path distance from S to W is 2. This doesn't contradict anything we proved in Part One, because we established correctness of Dijkstra's algorithm only under the assumption that all edge lengths are nonnegative, an assumption which is violated in this particular example.

But again, the takeaway point here is that it's easy to write down a greedy algorithm, especially if you came up with it yourself, and to believe deep in your heart that it's got to be correct all the time. But more often than not, your greedy heuristic is nothing more than a heuristic, and there will be instances on which it does the wrong thing. So keep that in mind in greedy algorithm design.
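Here is a small runnable version of that quiz, reusing the same plain Dijkstra sketch from earlier (repeated so this snippet stands alone). On this graph it commits to 2 for W, while the true shortest-path distance is 1.

```python
def dijkstra(graph, source):
    # Same quadratic-time sketch as before; processed vertices are frozen.
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    processed = set()
    while len(processed) < len(graph):
        v = min((u for u in graph if u not in processed),
                key=lambda u: dist[u])
        processed.add(v)  # irrevocable commitment to dist[v]
        for w, length in graph[v]:
            if w not in processed:
                dist[w] = min(dist[w], dist[v] + length)
    return dist

# The quiz network: S->V has length 3, S->W has length 2, V->W has length -2.
graph = {'S': [('V', 3), ('W', 2)], 'V': [('W', -2)], 'W': []}
print(dijkstra(graph, 'S')['W'])  # 2: W is processed before V, irrevocably
print(min(2, 3 + (-2)))           # 1: the actual shortest-path distance
```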
So now that my conscience is clear, having warned you about the perils of greedy algorithm design, let's turn to proofs of correctness. That is, if you have a greedy algorithm that is correct, and we will see some notable examples in the coming lectures, how would you actually establish that fact? Or if you have a greedy algorithm and you don't know whether or not it is correct, how would you approach trying to understand which it is?

So let me level with you: proving a greedy algorithm correct is, frankly, sort of more art than science. Unlike the divide-and-conquer paradigm, where everything was somewhat formulaic (we had black-box ways of evaluating recurrences, we had a sort of template for proving algorithms correct), really, proving the correctness of greedy algorithms takes a lot of creativity, and it has a bit of an ad hoc flavor. That said, as usual, to the extent that there are recurring themes, that is what I will spend our time together emphasizing. So let me tell you, again at a very high level, how you might go about this. You might want to revisit this content after you've seen some examples, when I think it will make a lot more sense.

So method one is our old friend, or perhaps nemesis depending on your disposition, namely proof by induction. Now, remember what greedy algorithms do: they sequentially make a bunch of irrevocable decisions, so here the induction is going to be on the decisions made by the algorithm. And if you go back to our proof of correctness of Dijkstra's algorithm, that is in fact exactly how we proved it correct: by induction on the number of iterations. Each iteration of the main while loop computed the shortest path to one new destination, and we proved that, assuming all of our previous computations were correct (that's the inductive hypothesis), so is the computation in the current iteration. And so then, by induction, everything the algorithm ever does is correct. So that's a proof by induction that a greedy algorithm is correct, and we might see some more examples of those, for other algorithms, in the lectures to come. Some of the textbooks call this method of proof "greedy stays ahead," meaning you always prove that greedy is doing the right thing iteration by iteration.
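Stated a bit more formally, the inductive invariant for Dijkstra's algorithm looks something like the following sketch, where the notation d(S, v) for the true shortest-path distance is mine and not necessarily the lecture's:

\[
\text{After } k \text{ iterations: } \quad \mathrm{dist}(v) = d(S, v) \ \text{ for every processed vertex } v.
\]

The base case (only S processed, with dist(S) = 0) is immediate, and the inductive step argues that the vertex chosen greedily in iteration k + 1 also satisfies the invariant, which is exactly where the assumption of nonnegative edge lengths gets used.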
So a second approach to proving the correctness of greedy algorithms, which works in a lot of cases, is what's called an exchange argument. You haven't yet seen any examples of exchange arguments in this class, so I can't really tell you what they are, but that's what we're going to do next: I'm going to argue, via exchange arguments, that a couple of different famous greedy algorithms are in fact correct. Exchange arguments come in a couple of different flavors. One flavor is to proceed by contradiction: you assume for contradiction that the greedy algorithm is incorrect, and then you show that you can take an optimal solution, exchange two elements of that optimal solution, and get something even better, which of course contradicts the assumption that you started with an optimal solution.

A different flavor is to gradually exchange an optimal solution into the one output by the greedy algorithm, without making the solution any worse. That would show that the output of the greedy algorithm is in fact optimal. And formally, that's done by an induction on the number of exchanges required to transform an optimal solution into yours.

And finally, I've already said it once, but let me say it again: there's not a whole lot of formula behind proving greedy algorithms correct. You often have to be quite creative; you might have to stitch together aspects of method one and method two, or you might have to do something completely different. Really, any rigorous proof is fair game.
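To preview the shape of that second flavor in symbols (a generic sketch in my own notation, not one of the upcoming lectures' specific proofs): let \(\sigma^G\) be greedy's output and \(\sigma^*\) an optimal solution. Suppose that whenever \(\sigma^* \neq \sigma^G\), you can exhibit one exchange producing a solution \(\sigma'\) that agrees with \(\sigma^G\) in at least one more position, with

\[
\mathrm{cost}(\sigma') \le \mathrm{cost}(\sigma^*).
\]

Then by induction on the number of positions of disagreement, \(\sigma^*\) can be transformed into \(\sigma^G\) without ever increasing the cost, so \(\mathrm{cost}(\sigma^G) \le \mathrm{cost}(\sigma^*)\) and greedy's output is optimal as well.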