So we're almost at the finish line of our analysis of QuickSort. Let me remind you what we're proving. We're proving that for the randomized implementation of QuickSort, where we always choose the pivot element to partition around uniformly at random, for every input array of length n, the average running time of QuickSort over the random choices of pivots is big O of n log n. We've done a lot of work in the last couple of videos, so let me just remind you of the story so far. In the first video, we identified the relevant random variable that we care about: capital C, the number of comparisons that QuickSort makes among the pairs of elements in the input array. Then we applied the decomposition approach: we expressed capital C, the overall number of comparisons, as a sum of indicator, or 0-1, random variables. Each of those variables X_ij just counts the number of comparisons involving the i-th smallest and j-th smallest entries in the array, and that's going to be either zero or one. Then we applied linearity of expectation to realize that all we really needed to understand was the comparison probability for each pair of elements. In the second video, we nailed down what that comparison probability is: specifically, for the i-th smallest and the j-th smallest elements in the array, the probability that QuickSort compares them, when you always make random pivot choices, is exactly 2 divided by the quantity j minus i plus 1. Putting that all together yields the following expression governing the average number of comparisons made by QuickSort. One thing I want you to appreciate is that in the last couple of videos, we've been amazingly exact, as algorithmic analysis goes. Specifically, we've done nothing sloppy whatsoever, no estimates. The number of comparisons that QuickSort makes on average is exactly this double sum.
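The double sum just described can be evaluated directly. Here's a small sketch (the function name is mine, not from the lecture) that computes the exact expectation E[C] = sum over i from 1 to n-1, sum over j from i+1 to n, of 2/(j - i + 1):

```python
def expected_comparisons(n):
    """Exact expected number of comparisons made by randomized
    QuickSort on an array of n distinct elements:
    E[C] = sum over pairs i < j of 2 / (j - i + 1)."""
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))

# Sanity checks: with n = 2 there is exactly one comparison in
# expectation; with n = 3 the sum is 1 + 2/3 + 1 = 8/3.
print(expected_comparisons(2), expected_comparisons(3))
```

For larger n, this exact value grows noticeably more slowly than n squared, which is the behavior the rest of the lecture pins down.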
Now, sure, we'll do some inequalities to make our lives a little bit easier, but up to this point everything has been completely exact. And we'll actually see why there are small constants in QuickSort: it's basically going to be this factor of two. Now, the next question to ask is: what are we shooting for? Remember, the theorem we want to prove is that the expected number of comparisons, really the expected running time, is O(n log n). So are we already done? Well, not quite; we're going to have to be a little bit clever. If we look at this double sum and ask how big the summands are and how many terms there are: well, the biggest summands we're ever going to see are when i and j are right next to each other, when j is one bigger than i, and in that case this fraction is going to be one half. So the terms can be as big as one half. How many terms are there? Well, there's a quadratic number of terms. So it would be very easy to derive an upper bound that's quadratic in n, but that's not what we want; we want one that's n log n. To derive that, we're going to have to be a little bit more clever about how we evaluate this sum. So the idea is to think about a fixed value of i in the outermost sum, and then ask: how big could the inner sum be? So let's fix some value of i, the index in the outer sum, and let's look at the inner sum, where j ranges from i plus 1 up to n, and the summand is 1 over the quantity j minus i plus 1. So how big can this be? Well, let's first understand what the terms actually are. J starts at i plus 1 and then ascends to n, and as j gets bigger, the denominator gets bigger, so the summands get smaller. So the biggest summand is going to be the very first one, when j is as small as possible, namely i plus 1. When j equals i plus 1, the summand is one half. Then j gets incremented in the sum.
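As a quick sanity check on the shape of the inner sum, here's a sketch (helper name is mine, not from the lecture) that fixes i and lists the summands 1/(j - i + 1) as j runs upward:

```python
def inner_summands(i, n):
    # For a fixed i, the inner sum's terms as j runs from i+1 to n.
    return [1.0 / (j - i + 1) for j in range(i + 1, n + 1)]

# With i = 1 and n = 6 the terms are 1/2, 1/3, 1/4, 1/5, 1/6:
# the first (largest) term is one half, and they shrink as j grows.
print(inner_summands(1, 6))
```

Notice the terms depend only on the gap j - i, not on i itself, which is what makes the upcoming change of variable work.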
And so we're going to pick up a one-third term, followed by a one-fourth term, and so on. So every inner sum is going to have this form: one half plus one third plus one fourth, and so on, and it's going to run out at some point, when j equals n. And the biggest term we're ever going to see is a 1/n, in the case where i equals 1. So let's make our lives easier by taking the expression we started with, star, and instead of having a double sum, let's just upper bound it with a single sum. What are the ingredients of the single sum? Well, there's this two; can't forget the two. Then there are n choices for i. Actually, there are n minus 1 choices for i, but let's just be sloppy and say n, so that gives us a factor of n. And then how big can an inner sum be? Well, an inner sum is just a bunch of these terms, one half plus one third and so on. The biggest of those inner sums is the one occurring when i equals 1, at which point the last term is 1/n. So we're going to do a change of variable and express the upper bound on each inner sum as the sum from k equal 2 to n of 1/k. So that's looking more manageable, just a single sum involving this index k, and life's going to get really good when we prove the next claim, which is that this sum cannot be very big: it's only logarithmic in n. Even though there's a linear number of summands, the overall value of the sum is only logarithmic. That, of course, is going to complete the proof, because it'll give us an overall bound of 2 times n times the natural log of n. So it's an n log n bound with really quite reasonable constants. So why is this true? Why is this sum only logarithmically large? Well, let's do a proof by picture. I'm going to write this sum in a geometric fashion. On the x-axis, let me mark off points corresponding to the positive integers.
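The move from the exact double sum to the single-sum upper bound can be checked numerically. This is a small sketch (function names are mine, not from the lecture) comparing the two quantities:

```python
def exact_double_sum(n):
    # E[C]: the exact expected comparison count, summed over pairs i < j.
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))

def single_sum_bound(n):
    # The lecture's upper bound: n choices of i, each inner sum replaced
    # by the largest one (the i = 1 case), i.e. sum_{k=2}^{n} 1/k.
    return 2 * n * sum(1.0 / k for k in range(2, n + 1))

for n in (10, 100, 1000):
    assert exact_double_sum(n) <= single_sum_bound(n)
```

The bound is loose (each inner sum is replaced by the biggest one), but it only costs a constant factor, which is all the n log n claim needs.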
And on the y-axis, let me mark off points corresponding to fractions of the form 1/k. What I'm going to do is draw a bunch of rectangles of decreasing area; specifically, they all have width one, and the heights are going to be 1/k. So the area of this first one is one, the area of the next is one half, the area of the next is one third, and so on. And now I'm going to overlay on this picture the graph of the continuous function f(x) = 1/x. Notice that it's going to go through these points; it's going to kiss all of these rectangles on their upper-right corners. Now, what is it we're trying to prove? The claim we're trying to prove is that this sum, one half plus one third and so on, is upper bounded by something. The sum can be thought of as the areas of these rectangles, the one half, the one third, and so on, and we're going to upper bound it by the area under the blue curve. If you notice, the area under the blue curve is at least as big as the sum of the areas of the rectangles, because the curve hits each of these rectangles at its northeast corner. So putting that into mathematics: the sum from k equal 2 to n of 1/k is bounded above by the integral, starting at one and going all the way up to n, of the function 1/x dx; that's the area under the curve. And if you remember a little bit of calculus, the integral of 1/x is the natural log of x. So this equals the natural log of x evaluated from 1 to n, also known as ln(n) minus ln(1). And of course ln(1) is zero, so that gives us our ln(n). So that completes the proof of the claim: indeed, the sum of these 1/k's is bounded above by the natural log of n, and that in fact completes the proof of the theorem. The expected number of comparisons is at most 2n times this sum, which is at most ln(n).
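The proof-by-picture inequality can also be confirmed numerically. This sketch (helper name is mine, not from the lecture) checks that the sum from k = 2 to n of 1/k never exceeds ln(n):

```python
import math

def harmonic_tail(n):
    # sum_{k=2}^{n} 1/k -- the single-sum quantity from the proof.
    return sum(1.0 / k for k in range(2, n + 1))

for n in (2, 10, 100, 10_000):
    # Each rectangle of height 1/k and width 1 fits under the curve
    # f(x) = 1/x on the interval [k-1, k], so the total rectangle area
    # is at most the integral of 1/x from 1 to n, which is ln(n).
    assert harmonic_tail(n) <= math.log(n)
```

The gap between the sum and ln(n) stays bounded by a constant, so the logarithmic bound is essentially tight.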
And altogether, we find that the expected number of comparisons that QuickSort makes on an arbitrary input of length n is at most 2 times n times the natural log of n. So that's big O of n log n, with quite reasonable constants. Now, this is just the number of comparisons, but as we observed earlier, the running time of QuickSort on average is not much more than that; the running time is dominated by the number of comparisons that it makes. Moreover, as we discussed when we were talking about the details of the implementation, it works in place: essentially no extra storage is necessary. So that is a complete and mathematically rigorous explanation of just why QuickSort is so quick.
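The whole theorem can be checked empirically by running randomized QuickSort many times and counting comparisons. This is a simplified recursive sketch (the lecture's implementation is in-place; this version is not), where each partitioning step charges len(a) - 1 comparisons against the pivot, matching the counting in the analysis:

```python
import math
import random

def quicksort_comparisons(a, rng):
    """Count comparisons made by QuickSort on distinct elements,
    choosing the pivot uniformly at random at every recursive call."""
    if len(a) <= 1:
        return 0
    pivot = a[rng.randrange(len(a))]
    left = [x for x in a if x < pivot]
    right = [x for x in a if x > pivot]
    # Each of the len(a) - 1 non-pivot elements is compared to the
    # pivot once during partitioning.
    return (len(a) - 1) + quicksort_comparisons(left, rng) \
                        + quicksort_comparisons(right, rng)

rng = random.Random(0)  # fixed seed so the experiment is repeatable
n, trials = 100, 2000
mean = sum(quicksort_comparisons(list(range(n)), rng)
           for _ in range(trials)) / trials
print(mean, 2 * n * math.log(n))  # observed mean sits below 2 n ln n
```

The observed average lands comfortably under the 2n ln(n) bound, consistent with the bound being an overestimate by only a constant factor.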