So now we come to one of my favorite sequences of lectures, where we're going to discuss the famous QuickSort algorithm. If you ask professional computer scientists and professional programmers to draw up a list of their top five or top ten favorite algorithms, I'll bet you'd see QuickSort on many of those people's lists. So why is that? After all, we've already discussed sorting, and we already have a quite good and practical sorting algorithm, namely the MergeSort algorithm. Well, QuickSort, in addition to being very practical, is competitive with, and often superior to, MergeSort. So, beyond being very practical and used all the time in the real world and in programming libraries, it's an extremely elegant algorithm. When you see the code, it's so succinct, so elegant, that you just sort of wish you had come up with it yourself. Moreover, the mathematical analysis which explains why QuickSort runs so fast, an analysis we'll cover in detail, is very slick; it's something I can cover in just about half an hour or so.

More precisely, what we'll prove about the QuickSort algorithm is that a suitable randomized implementation runs in O(n log n) time on average. I'll tell you exactly what I mean by "on average" later in this sequence of lectures. Moreover, the constants hidden in the big-Oh notation are extremely small, and that'll be evident from the analysis that we do. Finally, and this is one thing that differentiates QuickSort from the MergeSort algorithm, it operates in place. That is, it needs very little additional storage beyond what's given in the input array in order to accomplish the goal of sorting. Essentially, what QuickSort does is just repeated swaps within the space of the input array, until it finally concludes with a sorted version of the given array.

The final thing I want to mention on this first slide is that, unlike most of the videos, this set of videos will have an accompanying set of lecture notes, which I've posted in PDF on the course website. Those are largely redundant and entirely optional, but if you want another treatment of what I'm going to discuss, a written treatment, I encourage you to look at the lecture notes on the course website.
So, for the rest of this video, I'm going to give you an overview of the ingredients of QuickSort and what we have to discuss in more detail; the rest of the lectures will give the details of the implementation, as well as the mathematical analysis.

Let's begin by recalling the sorting problem. This is exactly the same problem we discussed back when we covered MergeSort. We're given as input an array of n numbers in arbitrary order; for example, perhaps the input looks like this array here. And then what do we have to do? We just have to output a version of these same numbers, but in increasing order. As when we discussed MergeSort, I'm going to make a simplifying assumption just to keep the lectures as simple as possible: namely, I'm going to assume that the input array has no duplicates, that is, that all of the entries are distinct. And as with MergeSort, I encourage you to think about how you would alter the implementation of QuickSort so that it deals correctly with ties, with duplicate entries.

To discuss how QuickSort works at a high level, I need to introduce you to the key subroutine, and this is really the key great idea in QuickSort: a subroutine which partitions the array around a pivot element. So what does this mean? Well, the first thing you have to do is pick one element of your array to act as the pivot element. Eventually we'll worry quite a bit about exactly how to choose this magical pivot element, but for now you can just think of it as plucking out the very first element of the array to act as the pivot. So, for example, in the input array that I mentioned on the previous slide, we could just use 3 as the pivot element. After you've chosen a pivot element, you then rearrange the array so that all the elements which come before the pivot element are less than the pivot, and all the elements which come after the pivot element are greater than the pivot. So, for example, given this input array, one legitimate way to rearrange it so that this holds is the following: perhaps in the first two entries we have the 2 and the 1, then comes the pivot element, and then come the elements 4 through 8 in some perhaps jumbled order. Notice that the elements to the left of the pivot, the 2 and the 1, are indeed less than the pivot, which is 3, and the five elements to the right of the pivot, to the right of the 3, are indeed all greater than 3.
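To make the partition condition concrete, here is a small Python sketch (not part of the lecture; the name is_partitioned and the specific arrangements below are just for illustration) that checks whether an array is correctly partitioned around the element at a given index:

```python
def is_partitioned(arr, pivot_index):
    """Return True if everything before pivot_index is less than the pivot
    and everything after it is greater (distinct entries assumed)."""
    pivot = arr[pivot_index]
    return (all(x < pivot for x in arr[:pivot_index]) and
            all(x > pivot for x in arr[pivot_index + 1:]))

# One legitimate rearrangement from the example: the 2 and the 1, then the
# pivot 3, then the elements 4 through 8 in some jumbled order.
print(is_partitioned([2, 1, 3, 6, 8, 4, 7, 5], 2))  # True
# Swapping the 2 and the 1 is equally legitimate; order within a bucket is free.
print(is_partitioned([1, 2, 3, 6, 8, 4, 7, 5], 2))  # True
```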
Notice that in the Partition subroutine, we do not insist on getting the relative order correct among the elements less than the pivot, or among the elements bigger than the pivot. So, in some sense, we're doing a kind of partial sorting: we're just bucketing the elements of the array into one bucket, those less than the pivot, and a second bucket, those bigger than the pivot, and we don't care about getting the order right within each of those two buckets.

So partitioning is certainly a more modest goal than sorting, but it does make progress toward sorting. In particular, the pivot element itself winds up in its rightful position; that is, the pivot element winds up where it should be in the final sorted version of the array. You'll notice that in the example we chose as the pivot the third smallest element, and it does indeed wind up in the third position of the array. More generally, where should the pivot be in the final sorted version? Well, it should be to the right of everything less than it, and to the left of everything bigger than it. And that's exactly what partitioning does, by definition.

So why is it such a good idea to have a partitioning subroutine? After all, we don't really care about partitioning; what we want to do is sort. Well, the point is that partitioning can be done quickly, in linear time, and it's a way of making progress toward having a sorted version of an array. And it's going to enable a divide-and-conquer approach toward sorting the input array. So, in a little more detail, let me tell you two cool facts about the Partition subroutine. I'm not going to give you the code for partitioning here; I'm going to give it to you in the next video. But here are the two salient properties of the Partition subroutine, discussed in detail in the next video. The first cool fact is that it can be implemented in linear, that is, O(n), time, where n is the size of the input array. Moreover, it's not just linear time but linear time with essentially no extra overhead: we're going to get a linear-time implementation where all you do is repeated swaps, and you do not allocate any additional memory. That's key to the practical performance of the QuickSort algorithm. Secondly, it cuts down the problem size, so it enables the divide-and-conquer approach. Namely, after we've partitioned an array around some pivot element, all we have to do is recursively sort the elements that lie to the left of the pivot,
and recursively sort the elements that lie to the right of the pivot. And then we'll be done. So that leads us to the high-level description of the QuickSort algorithm.

Before I give the high-level description, I should mention that this algorithm was discovered by Tony Hoare around 1961. This was at the very beginning of Hoare's career; he was just about 26 or 27 years old. He went on to make many other contributions, and eventually wound up winning the highest honor in computer science, the ACM Turing Award, in 1980. And when you see this code, I'll bet you'll feel like you wish you had come up with it yourself. It's hard not to be envious of the inventor of this very elegant QuickSort algorithm.

So, just like MergeSort, this is going to be a divide-and-conquer algorithm. It takes an array of some length n, and if it's an array of length 1, it's already sorted; that's the base case, and we can return. Otherwise, we're going to have two recursive calls. The big difference from MergeSort is that, whereas in MergeSort we first split the array into pieces, recurse, and then combine the results, here the recursive calls come last. So the first thing we're going to do is choose a pivot element, then partition the array around that pivot element, and then make two recursive calls. And then we'll be done; there will be no combine step, no merge step.

So in the general case, the first thing you do is choose a pivot element. For the moment, I'm going to leave the ChoosePivot subroutine unimplemented; there's going to be an interesting discussion about exactly how you should do this. For now, you just do it in some way such that you come up with one pivot element. For example, a naive way would be to just choose the first element. Then you invoke the Partition subroutine that we discussed on the last couple of slides. Recall that this results in a version of the array in which the pivot element p is in its rightful position: everything to the left of p is less than p, and everything to the right of the pivot is bigger than the pivot. Then all you have to do to finish up is recurse on both sides. So let's call the elements less than p the first part of the partitioned array, and the elements greater than p the second part of the partitioned array. And now we just call QuickSort again to recursively sort the first part, and then to recursively sort the second part. And that is it.
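Since the code itself lives on the slides, here is a minimal Python sketch of this high-level description. It's a sketch under assumptions, not the lecture's literal code: ChoosePivot is the naive first-element choice mentioned above, and the Partition body is one standard swap-based scheme (the lecture's own implementation is the subject of the next video).

```python
def quicksort(a, l=0, r=None):
    """Sort a[l..r] in place; arrays of length 0 or 1 are already sorted."""
    if r is None:
        r = len(a) - 1
    if l >= r:                       # base case: at most one element
        return
    p = choose_pivot(a, l, r)        # pick a pivot position
    a[l], a[p] = a[p], a[l]          # move the pivot to the front
    split = partition(a, l, r)       # pivot lands in its rightful position
    quicksort(a, l, split - 1)       # recursively sort the first part
    quicksort(a, split + 1, r)       # recursively sort the second part
                                     # and that is it: no combine/merge step

def choose_pivot(a, l, r):
    """Naive choice: just use the first element of the subarray."""
    return l

def partition(a, l, r):
    """Partition a[l..r] around the pivot a[l] using only repeated swaps;
    return the pivot's final index."""
    pivot = a[l]
    i = l + 1                        # invariant: a[l+1..i-1] < pivot
    for j in range(l + 1, r + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[l], a[i - 1] = a[i - 1], a[l]  # place the pivot between the two parts
    return i - 1

nums = [3, 8, 2, 5, 1, 4, 7, 6]
quicksort(nums)
print(nums)                          # [1, 2, 3, 4, 5, 6, 7, 8]
```

Notice that, exactly as described, all the work happens before the recursive calls; once the two sides are sorted, there is nothing left to do.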
That is the entire QuickSort algorithm at the high level. This is one of the relatively rare recursive divide-and-conquer algorithms you're going to see where you literally do no work after solving the subproblems: there is no combine step, no merge step. Once you've partitioned, you just sort the two sides and you're done. So that's the high-level description of the QuickSort algorithm.

Let me give you a quick tour of what the rest of the videos are going to be about. First of all, I owe you the details of this Partition subroutine. I promised you it can be implemented in linear time with no additional memory, so I'll show you an implementation in the next video. We'll also have a short video that formally proves the correctness of the QuickSort algorithm. I think most of you will see intuitively why it's correct, so that's a video you can skip if you want; but if you do want to see what a formal proof of correctness for a divide-and-conquer algorithm looks like, you might want to check it out. Then we'll discuss exactly how the pivot is chosen. It turns out that the running time of QuickSort depends on which pivot you choose, so we're going to have to think carefully about that. Then we'll introduce randomized QuickSort, in which you choose a pivot element uniformly at random from the given array, hoping that a random pivot will be pretty good sufficiently often (see the small sketch at the end of this overview).

And then we'll give the mathematical analysis in three parts, proving that the QuickSort algorithm runs in O(n log n) time, with small constants, on average, for a randomly chosen pivot. In the first analysis video, I'll introduce a general decomposition principle: how you take a complicated random variable, break it into indicator random variables, and use linearity of expectation to get a relatively simple analysis. That's something we'll use a couple more times in the course, for example, when we study hashing. Then we'll discuss the key insight behind the QuickSort analysis, which is understanding the probability that a given pair of elements gets compared at some point during the algorithm; that'll be the second part. And then there will be some mathematical computations to tie everything together, which will give us the bound on the QuickSort running time. Another video that's available is a review of some basic probability concepts, for those of you who are rusty, that we'll be using in the analysis of QuickSort.
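As a preview of that randomized variant, the only change needed in the sketch above is the ChoosePivot subroutine; a minimal version, assuming Python's standard random module, might look like this:

```python
import random

def choose_pivot(a, l, r):
    """Randomized QuickSort: choose the pivot position uniformly at
    random from the subarray a[l..r] (both endpoints included)."""
    return random.randint(l, r)
```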
So that's it for the overview. Let's move on to the details.