So let's review the story so far. We've been discussing the QuickSort algorithm; here again is its high-level description. QuickSort first calls two subroutines and then makes two recursive calls. The first subroutine, ChoosePivot, we haven't discussed at all yet; it will be one of the main topics of this video. Its job is to somehow select one of the n elements of the input array to act as the pivot element. Now, what does it mean to be a pivot? That comes into play in the second subroutine, Partition, which we discussed quite a bit in a previous video. What Partition does is rearrange the elements of the input array so that the pivot p winds up in its rightful position: to the right of all of the elements less than it, and to the left of all of the elements bigger than it. The stuff less than the pivot sits to its left in some jumbled order; the stuff bigger than it sits to its right in some jumbled order. That's what's labeled here as the first part and the second part of the partitioned array.

Once you've done this partitioning, you're good to go. You recursively sort the first part, then you call QuickSort again and recursively sort the second part, and, bingo, the entire array is sorted. You don't need a combine step; you don't need a merge step. Moreover, recall from a previous video that Partition can be implemented in linear time and, moreover, it works in place, with essentially no additional storage. We also, in an optional video, formally proved the correctness of QuickSort, and remember, that correctness is independent of how you implement the ChoosePivot subroutine.

What we're going to do now is discuss the running time of the QuickSort algorithm, and this is where the choice of the pivot is very important. What everybody should be wondering about at this point is: is QuickSort a good algorithm? Does it run fast? The bar is pretty high; we already have MergeSort, which is an excellent, practical n log n algorithm. The key point to realize at this juncture is that we are not currently in a position to discuss the running time of QuickSort, because we don't have enough information: the running time depends crucially on how you choose the pivot, on the quality of the pivot chosen. You'd be right to wonder what I mean by a pivot's quality. Basically, a pivot is good if it splits the partitioned array into two roughly equal-sized subproblems, and it's bad, of low quality, if it gives us very unbalanced subproblems.

To understand both what I mean and the ramifications of having good-quality and bad-quality pivots, let's walk through a couple of quiz questions. This first quiz question is meant to explore a sort of worst-case execution of the QuickSort algorithm: what happens when you choose pivots that are very poorly suited to the particular input array? Let me be more specific. Suppose we use the most naive ChoosePivot implementation, like the one we discussed in the partition video: we just pluck out the first element of the array and use that as the pivot. Suppose that's how we implement the ChoosePivot subroutine, and moreover, suppose that the input array to QuickSort is an array that's already in sorted order.
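For concreteness, here is a minimal Python sketch of this setup, assuming the in-place partition routine from the earlier video and the naive first-element pivot rule; the function and variable names are mine, not from the lecture.

```python
# A hedged sketch of high-level QuickSort with the naive pivot rule
# (always use the first element of the subarray as the pivot).

def choose_pivot(a, left, right):
    # Naive rule: the first element of the subarray a[left..right] is the pivot.
    return left

def partition(a, left, right):
    # In-place partition around a[left], in the style of the earlier video:
    # i marks the boundary between elements < pivot and elements > pivot.
    pivot = a[left]
    i = left + 1
    for j in range(left + 1, right + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[left], a[i - 1] = a[i - 1], a[left]   # put the pivot in its rightful position
    return i - 1                            # final index of the pivot

def quicksort(a, left=0, right=None):
    if right is None:
        right = len(a) - 1
    if left >= right:                       # subarrays of size 0 or 1: nothing to do
        return
    p = choose_pivot(a, left, right)
    a[left], a[p] = a[p], a[left]           # move the chosen pivot to the front
    split = partition(a, left, right)
    quicksort(a, left, split - 1)           # recursively sort the part < pivot
    quicksort(a, split + 1, right)          # recursively sort the part > pivot
```

With distinct elements this sorts correctly however choose_pivot is implemented; only the running time changes, which is the subject of the quizzes below.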
So, for example, if we just have the numbers one through eight, the input would be 1, 2, 3, 4, 5, 6, 7, 8, in order. My question for you: what is the running time of this recursive QuickSort algorithm on an already sorted array, if we always use the first element of a subarray as the pivot?

This is a slightly tricky but actually very important question, and the answer is the fourth option. It turns out that if you pass QuickSort an already sorted array and you're using first elements as pivots, it runs in quadratic time. And remember, for a sorting algorithm, quadratic is bad, in the sense that we can do better: MergeSort runs in time n log n, which is much better than n squared, and if we were happy with an n-squared running time, we wouldn't need to resort to these relatively exotic sorting algorithms at all; we could just use insertion sort and get that same quadratic running time.

So now I owe you an explanation: why does QuickSort run in quadratic time in this unlucky case of an already sorted input array? To understand, let's think about which pivot gets chosen, what the ramifications of that choice are for how the array gets partitioned, and then what the recursion looks like. Think of the array as just the numbers 1 through n in sorted order. What is going to be our pivot? Well, by definition we're choosing the first element as the pivot, so the pivot is just going to be 1. Now we invoke the Partition subroutine, and if you go back to its pseudocode, you'll notice that if you pass in an already sorted array, it does essentially nothing: it just advances the indices i and j until they fall off the end of the array, and it returns the same array it was passed as input. So Partition, given an already sorted array, returns an already sorted array. We have the pivot 1 in the first position and then the numbers 2 through n, in order, in the remaining positions.

So if we draw our usual picture of a partitioned array, with everything less than the pivot to the left and everything bigger than the pivot to the right: since nothing is less than the pivot, the left part is empty, it doesn't exist, and the part to the right of the pivot has length n - 1 and, moreover, is still sorted. Once Partition completes, we return to the outer call of QuickSort, which then calls itself recursively twice. In this case, one of the recursive calls is vacuous: it gets an empty array and there's nothing to do. So really there's only one recursive call, and it happens on a problem of size only one less. This is about the most unbalanced split we could possibly see: one side has zero elements, the other side has n - 1 elements. It doesn't get any worse than that. And this is going to keep happening over and over again. We recurse on the numbers 2 through n, we choose the first element, 2, as the pivot, we feed it to Partition, and we get back exactly the subarray we handed it: the numbers 2 through n in sorted order. We exclude the pivot 2 and recurse on the numbers 3 through n, a subarray of length n - 2.
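To see this concretely, here is a small check, using the hedged sketch above, that Partition leaves an already sorted array unchanged and reports the pivot's final position as index 0, so the left part is empty and the right part has n - 1 elements.

```python
a = [1, 2, 3, 4, 5, 6, 7, 8]
split = partition(a, 0, len(a) - 1)   # pivot is the first element, 1
print(a)       # [1, 2, 3, 4, 5, 6, 7, 8] -- the array comes back exactly as it went in
print(split)   # 0 -- nothing lies left of the pivot; n - 1 elements lie to its right
```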
At the next level of the recursion, we recurse on an array of length n - 3, then n - 4, then n - 5, and so on, until finally, after a recursion depth of roughly n, we get down to just the last element, the base case kicks in, and QuickSort completes. So that's how QuickSort executes on this particular input with these particular pivot choices. What running time does that give us? The first observation is that in each recursive call we do have to invoke the Partition subroutine, and Partition looks at every element of the array it's passed as input; so if we pass Partition an array of length k, it does at least k operations, because it looks at each element at least once. The running time is therefore bounded below by the work we do in the outermost call, which is on an array of length n, plus the work at the second level of recursion, on a subarray of length n - 1, plus n - 2, and so on, all the way down to plus 1 for the very last level of recursion. This lower bound, n + (n - 1) + (n - 2) + ... + 1, is already Theta of n squared. One easy way to see why this sum gives a bound of n squared is to focus on just the first half of the terms: the first n/2 terms each have magnitude at least n/2, so the sum is at least n squared over 4. It's also evident that the sum is at most n squared. So, overall, the running time of QuickSort on this bad input is quadratic.

Now that we understand the worst-case performance of the QuickSort algorithm, let's move on to discuss its best-case running time. We don't generally care about the best-case performance of algorithms for its own sake; the reason we want to think about QuickSort's best case is, first of all, that it will give us better intuition for how the algorithm works, and second of all, that it draws a line in the sand: the average-case running time certainly can't be better than the best case, so this gives us a target to shoot for in our subsequent mathematical analysis. So what is the best case, the highest-quality pivot we could hope for? Again, we think of the quality of a pivot as the amount of balance it provides between the two subproblems. Ideally, we'd choose a pivot that gives us two subproblems, each of size n/2 or less, and there's a name for the element that gives us that perfectly balanced split: the median element of the array, the element where exactly half of the elements are less than it and half of the elements are bigger than it. That would give us an essentially perfect 50/50 split of the input array.

So here's the question. Suppose we ran QuickSort on some input and everything magically worked out in our favor in the best possible way. That is, in every single recursive invocation of QuickSort, on any subarray of the original input array, suppose we happened to get the median element as our pivot; in every single recursive call, we wind up with a perfect 50/50 split of the input array before we recurse. This question asks you to analyze the running time of the algorithm in this magical best-case scenario. The answer to this question is the third option: it runs in n log n time. Why is that?
Well, the reason is that the recurrence which then governs the running time of QuickSort exactly matches the recurrence that governs MergeSort's running time, which we already know is n log n. That is, the running time T(n) that QuickSort requires in this magical special case, on an array of length n, satisfies the usual kind of recurrence with two parts: the work done by the recursive calls, and the work done right now. By assumption we wind up picking the median as the pivot, so there are two recursive calls, each on an input of size at most n/2, and we can write 2 T(n/2) for that part precisely because the pivot equals the median; this is not true for QuickSort in general, only in this magical case. That's what gets done by the two recursive calls. How much work do we do outside of them? Well, we have to run the ChoosePivot subroutine, and strictly speaking I haven't said how that's implemented, but let's assume it does only a linear amount of work; and, as we've seen, the Partition subroutine also does only a linear amount of work. So let's say O(n) for the work outside the recursive calls, giving T(n) <= 2 T(n/2) + O(n). And what do we know? This recurrence, whether by the master method or just by the exact same argument we used for MergeSort, gives us a running time bound of O(n log n).

And, something I haven't really been emphasizing but which is true: we can actually write Theta of n log n. That's because, in the recurrence, the work done outside the recursive calls is in fact exactly Theta of n: Partition really takes linear time, not just O(n) time, because it does look at every entry of the array it's passed. We didn't discuss this much in the master method lectures, but as I mentioned in passing, if you have a recurrence that is tight in this sense, then the results of the master method can be strengthened from big-O to Theta. But those are just extra details. The upshot of this quiz is that even in the best case, even if we magically get perfect pivots throughout the entire trajectory of QuickSort, the best we can hope for is an n log n bound; it's not going to get any better than that. So the question is: is there a principled way of choosing pivots so that we get this best case, or something like this best-case n log n running time? That's the problem we have to solve next.

The last couple of quizzes have identified a super important question for the implementation of QuickSort, which is how we are going to choose these pivots. We now know they have a big influence on the running time of the algorithm: it could be as bad as n squared or as good as n log n, and we really want to be on the n log n side. So, key question: how to choose pivots? QuickSort is the first killer application we're going to see of the idea of randomized algorithms, that is, allowing your algorithms to flip coins in their code so that you get good performance on average. The big idea is random pivots: every time we recursively call QuickSort and are passed some subarray of length k, we choose the pivot uniformly at random from among the k elements of that subarray.
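As a hedged sketch (the names and signature are mine, matching the earlier skeleton rather than anything in the lecture), that rule can be dropped into the Python sketch above by replacing the naive choose_pivot:

```python
import random

def choose_pivot(a, left, right):
    # Random pivot rule: each of the k = right - left + 1 elements of the
    # subarray a[left..right] is chosen as the pivot with probability 1/k,
    # and a fresh random choice is made in every recursive call.
    return random.randint(left, right)
```

The quicksort sketch above already swaps the chosen pivot to the front before partitioning, so nothing else needs to change.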
In other words, among the k candidate pivot elements in the subarray, we choose each one with equal probability, probability 1/k, and we make a new random choice every time there's a recursive call; then we just see how the algorithm does. This is our first example of a randomized algorithm: an algorithm which, if you feed it exactly the same input, will actually run differently on different executions, because there is randomness internal to the code of the algorithm.

Now, it's not necessarily intuitive that randomization should have any purpose in computation, in software design, in algorithm design. But in fact, one of the real breakthroughs in algorithm design, mostly in the 1970s, was realizing how important this is: the use of randomization can make algorithms more elegant, simpler, easier to code, faster, or simply let you solve problems that you could not solve, or at least not solve as easily, without it. So randomization is really something that should be in your toolbox as an algorithm designer. QuickSort will be the first killer application, but we'll see a couple more later in the course.

Now, by the end of this sequence of videos I'll have given you a complete, rigorous argument about why this works, why with random pivots QuickSort always runs very quickly on average. But before we say anything too formal, let's develop a little bit of intuition, or at least a kind of daydream, about why on earth this could possibly work, why it could possibly be a good idea to have randomness internal to our QuickSort implementation. First, at a very high level, what would be the hope, the dream? Random pivots are not going to be perfect; you're not going to just guess the median, since you only have a one-in-n chance of picking it. But the hope is that most choices of pivot will be good enough.

That's pretty fuzzy, so let's drill down and develop this intuition further, in two steps. The first claim is this. In the last quiz we said: suppose we get lucky and always pick the median in every single recursive call; we observed we'd do great, we'd get an n log n running time. Now let's observe that, to get the n log n running time, it's actually not important that we magically get the median in every single recursive call. If we get any kind of reasonable pivot, a pivot that gives us some kind of approximately balanced split of the subproblems, we're again going to be fine. The last quiz really wasn't particular to getting the exact median; near-medians are also fine. To be concrete, suppose we always pick a pivot that guarantees us a split of 25/75 or better, that is, both recursive calls are on arrays of size at most 75 percent of the one we started with. If we always get a 25/75 split or better in every recursive call, then I claim the running time of QuickSort is O(n log n), just as it was in the last quiz, where we were assuming something much stronger, namely that we were getting the exact median. Now, this is not so obvious, the fact that 25/75 splits guarantee n log n running time. For those of you who are feeling keen, you might want to try to prove it; you can prove it using a recursion tree argument.
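For readers who want to attempt it, here is one hedged way the recursion-tree argument can go; the constant c is mine, and the sketch assumes every recursive call really does get a 25/75 split or better.

```latex
% Proof sketch (assuming every split is 25/75 or better):
% 1. Each recursive call shrinks its subarray by a factor of at least 4/3,
%    so the recursion depth is at most \log_{4/3} n = O(\log n).
% 2. At any one level of the recursion tree, the subarrays are disjoint
%    pieces of the original array, and ChoosePivot + Partition do linear
%    work, so the total work at that level is at most c\,n for a constant c.
\[
  T(n) \;\le\; \underbrace{c\,n}_{\text{work per level}} \cdot
               \underbrace{\log_{4/3} n}_{\text{number of levels}}
       \;=\; O(n \log n).
\]
```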
Note that because you don't have balanced subproblems, you have to work a little bit harder than in the cases covered by the master method. So that's the first part of the intuition, and this is what we mean by a pivot being good enough: if we get a 25/75 split or better, we're good to go, we get our target n log n running time. The second part of the intuition is to realize that we actually don't have to get all that lucky to get a 25/75 split. That's a pretty modest goal, and even this modest goal is enough for the n log n running time. Suppose our array contains the integers between 1 and 100, so it's an array of length 100. Think for a second: which of those elements will give us a split that's 25/75 or better? The answer is that any element between 26 and 75, inclusive, will be totally fine. If we pick something that's at least 26, then the left subproblem contains at least the elements 1 through 25, so it has at least 25 percent of the elements; if we pick something that's at most 75, then the right subproblem contains all of the elements 76 through 100 after we partition, so it also has at least 25 percent of the elements. So anything between 26 and 75 gives us a 25/75 split or better, and that's a full half of the elements. It's as good as flipping one fair coin and hoping for heads: with 50 percent probability, we get a split good enough for this n log n bound. And so, again, the high-level hope is that often enough, say half of the time, we get these good-enough splits, 25/75 or better, which suggests that an n log n average running time is a legitimate hope.

So that's the high-level intuition, but if I were you I would certainly not be content with this somewhat hand-wavy explanation. What I've told you is the hope, the dream, why there's at least a chance this might work. But the question remains, and I would encourage such skepticism: does this really work? To answer that, we're going to have to do some actual mathematical analysis, and that's what I'm going to show you: a completely rigorous analysis of the QuickSort algorithm with random pivots, showing that yes, in fact, it really does work. This highlights what's going to be a recurring theme in this course, and a recurring theme in the study and understanding of algorithms generally: quite often there's some fundamental problem you're trying to solve and you come up with a novel idea. It might be brilliant, and it might suck, and you have no idea which. Obviously you can code up the idea, run it on some concrete instances, and get a feel for whether it seems like a good idea or not, but if you really want to know, fundamentally, what makes the idea good or what makes it bad, you need to turn to mathematical analysis for a complete explanation. That's exactly what we're going to do with QuickSort, and then we'll understand in a very deep way why it works so well.

Specifically, in the next sequence of three videos, I'm going to show you an analysis, a proof of the following theorem about QuickSort: with no assumptions about the data, that is, for every input array of a given length n, the average running time of QuickSort implemented with random pivots is O(n log n).
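Stated a bit more formally (this particular phrasing is mine, just a transcription of the theorem as described):

```latex
% Theorem (randomized QuickSort). For every input array A of length n,
\[
  \mathbb{E}\bigl[\text{running time of QuickSort on } A\bigr] \;=\; O(n \log n),
\]
% where the expectation is only over the random pivot choices made internally
% by the algorithm; no randomness is assumed in the input itself.
```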
And in fact the bound is Theta of n log n, but we'll just focus on the O(n log n) part. This is a very, very cool theorem about the randomized QuickSort algorithm. One thing I want to be clear about, so that you don't undersell this guarantee in your own mind: this is a worst-case guarantee with respect to the input. Notice how the theorem begins: for every input array of length n. We make absolutely no assumptions about the data, so this is a totally general-purpose sorting subroutine that you can use whenever you want, even if you have no idea where the data is coming from, and these guarantees still hold. This, of course, is something I held forth about at some length back in our Guiding Principles video, when I argued that, if you can get away with it, what you really want are general-purpose algorithms that make no data assumptions, so they can be used over and over again in all kinds of different contexts and still come with guarantees; QuickSort is one of those. So, basically, if you have a dataset and it fits in the main memory of your machine, sorting is a for-free subroutine, and in particular this QuickSort implementation is for free: it runs so blazingly fast that it doesn't matter what the array is. Maybe you don't even know why you want to sort it, but go ahead, why not? Maybe it'll make your life easier, like it did, for example, in the closest pair algorithm, for those of you who watched those two optional videos.

Now, the word "average" does appear in this theorem, and, as I've been harping on, the average is not over any assumptions on the data. We're certainly not assuming that the input array is random in any sense; the input array can be anything. So where is the averaging coming from? It comes only from the randomness internal to our algorithm, randomness that we put in the code ourselves and are responsible for. Remember, randomized algorithms have the interesting property that even if you run them on the same input over and over again, you get different executions, so the running time of a randomized algorithm can vary as you run it on the same input over and over again. The quizzes have taught us that the running time of QuickSort on a given input fluctuates between the best case of n log n and the worst case of n squared. What this theorem is telling us is that, for every possible input array, while the running time does indeed fluctuate between an upper bound of n squared and a lower bound of n log n, the best case dominates: on average, it's n log n, almost as good as the best case. That's what's so amazing about QuickSort: the n squared that can pop up once in a while just doesn't matter. You're essentially never going to see it; you're always going to see this n-log-n-like behavior from randomized QuickSort. So, for some of you, I'll see you next in an optional video on probability review; for the rest of you, I'll see you in the analysis of this theorem.