Let's complete the proof of the master method. Let me remind you of the story so far. The first thing we did was analyze the work done by a recursive algorithm using a recursion tree. We zoomed in on a given level J, identified the total amount of work done at level J, and then summed over all of the levels, resulting in this rather intimidating expression star: C times N to the D, times a sum over the levels J from zero to log base B of N of the quantity A over B to the D, raised to the J. Having derived this expression star, we then spent some time interpreting it, attaching to it some semantics. And we realized that the role of this ratio, A over B to the D, is to distinguish between three fundamentally different types of recursion trees: those in which A equals B to the D, and the amount of work is the same at every level; those in which A is less than B to the D, and therefore the amount of work is going down with the level; and those where A is bigger than B to the D, in which case the amount of work is growing with the level. This gave us intuition about the three cases of the master method, and even gave us predictions for the running times we might see. So what remains is to turn this hopeful intuition into a rigorous proof. We need to verify that the simplest possible scenarios outlined in the previous video actually occur. In addition, we need to demystify the third case and understand what its running-time expression has to do with the number of leaves of the recursion tree. Let's begin with the simplest case, case one, where we assume that A equals B to the D. This is the case where we have a perfect equilibrium between the forces of good and evil, where the rate of subproblem proliferation exactly cancels out with the rate at which we do less work per subproblem. And now, examining the expression star, we can see how easy our lives get when A equals B to the D. In that case, this ratio is equal to one.
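This isn't from the lecture itself, but a short Python sketch can sanity-check the star expression: summing the work level by level directly should agree with the factored closed form. The function names and the example parameters (a merge-sort-like recurrence with a = 2, b = 2, d = 1, c = 1) are illustrative choices, not part of the lecture.

```python
import math

# Work at level j: a^j subproblems, each of size n / b^j, each costing c * (n / b^j)^d.
def work_per_level(n, a, b, c, d):
    levels = round(math.log(n, b))  # log base b of n; assumes n is a power of b
    return [a**j * c * (n / b**j)**d for j in range(levels + 1)]

# The star expression: c * n^d * sum over j of (a / b^d)^j.
def star(n, a, b, c, d):
    levels = round(math.log(n, b))
    return c * n**d * sum((a / b**d)**j for j in range(levels + 1))

# The level-by-level total and the factored expression star should agree exactly.
# For a = 2, b = 2, d = 1, c = 1, n = 1024: 11 levels of 1024 work each.
print(sum(work_per_level(1024, 2, 2, 1, 1)))  # 11264.0
print(star(1024, 2, 2, 1, 1))                 # 11264.0
```

Seeing the two totals match makes the algebraic regrouping behind star concrete.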
So naturally this ratio raised to the J is also equal to one for all J, and then the sum evaluates to something very simple: namely, one summed with itself log base B of N plus one times. So the sum simply equals log base B of N plus one, and that gets multiplied by the C N to the D term, which is independent of the sum. Summarizing: when A equals B to the D, we find that star equals C N to the D times quantity log base B of N plus one. Writing this in big-O notation, we would write big O of N to the D log N. And again, I'm going to suppress the base of the logarithm: since all logarithms differ only by a constant factor, we don't have to specify the base; that's just absorbed by the constant hidden in the big-O notation. So that's it for case one. Like I said, this is the easy case. So what do we do when A is not equal to B to the D? Remember, A could be either less than or bigger than B to the D. To answer that question, let's take a short detour into geometric series. For this single-slide detour, we're going to think about a single constant number R. What you want to think about is R representing that ratio A over B to the D from the previous slide, but for this slide only, let's just call it R. This is a constant; it's bigger than zero, and it's not equal to one. Now, suppose we sum up powers of R, stopping, let's say, at the Kth power of R. I claim that this sum has a nice closed-form formula: specifically, it is exactly R to the K plus one, minus one, divided by R minus one. Whenever you see a general formula like this, it's useful to keep in mind a couple of canonical values of the parameters that you can plug in to develop intuition. For this expression, you might want to think canonically about the cases R equals two and R equals one half. When R equals two, we're summing up powers of two: one plus two plus four plus eight plus sixteen, and so on. When R equals one half, we have one, plus a half, plus a quarter, plus an eighth, and so on.
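As a quick sanity check on the closed-form formula (again, a sketch, not part of the lecture), we can compare a brute-force sum of powers against (R to the K plus one, minus one) over (R minus one) for the two canonical values R = 2 and R = 1/2:

```python
# Brute-force geometric sum: 1 + r + r^2 + ... + r^k.
def geometric_sum(r, k):
    return sum(r**j for j in range(k + 1))

# The claimed closed form: (r^(k+1) - 1) / (r - 1), valid for r != 1.
def closed_form(r, k):
    return (r**(k + 1) - 1) / (r - 1)

print(geometric_sum(2, 7), closed_form(2, 7))      # 255 and 255.0
print(geometric_sum(0.5, 7), closed_form(0.5, 7))  # both 1.9921875
```

The induction proof the lecture leaves as an exercise is exactly the argument that these two computations always agree.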
Now, I'm not going to prove this for you; I'd like you to prove it yourself, if you don't already know this fact. The way to prove it is simply by induction, and I will leave that as an exercise. What I want to focus on instead is what this fact can do for us. The way we use this fact is to formalize the idea that, in recursion trees where the amount of work is increasing with the levels, the leaves dominate the overall running time, and in recursion trees where the amount of work is decreasing with the levels, the root dominates the running time, in the sense that we can ignore all of the other levels of the recursion tree. So from this slide, we have two upshots. First of all, for the case when R is less than one: the expression on the right-hand side, R to the K plus one minus one over R minus one, can be upper bounded by one over one minus R. Again, you might want to have a canonical value of R in mind here, namely one half. What we're claiming is that the right-hand side is no more than two for the case of R equals one half, and that's easy to see if you think about one plus one half plus one fourth plus one eighth and so on: that sum converges to two as K grows large. So in general, for R a constant less than one, the sum is at most one over one minus R. Now, we're not actually going to care about the exact formula one over one minus R. The point for us is just that this is a constant, and by constant, I mean independent of K, independent of how many terms we sum up. Obviously it depends on the ratio R, but it does not depend on how many things we sum up, on K. So the way to think about this is: when we sum up a bunch of terms where R is less than one, the very first term dominates. The first term is the one, and no matter how many terms we sum up, we never grow bigger than some constant. A similar situation holds for the case where R is a constant bigger than one.
When R is bigger than one, a tiny bit of algebra shows that we can upper bound the right-hand side by R to the K times something which is a constant, independent of K. So again, let's interpret this second upshot in terms of a canonical value of R, namely R equals two. Then our sum is one plus two plus four plus eight plus sixteen, and so on. What this is saying is that no matter how many terms we sum up, the overall sum is never going to be more than twice the largest and final term. So if we sum up to, say, 128, the sum, you'll notice, will be 255, which is at most twice that largest term, 128. And the same is true for any K: the entire sum is no more than twice the largest term. In this sense, the largest term in the series dominates the whole thing. To summarize this slide in just one sentence: when we sum up powers of a constant R, if R is bigger than one, the largest power of that constant dominates the sum; if R is smaller than one, then the sum is just a constant. Let's now apply this to prove case two of the master method. In case two, we assume that A is less than B to the D. That is, the rate at which subproblems are proliferating is drowned out by the rate at which we do less work per subproblem. This is the case where the amount of work is decreasing with each level of the recursion tree, and our intuition said that, in the simplest possible scenario, we might hope that all of the work, up to a constant factor, is being done at the root. So let's make that intuition precise by using the basic sums fact on the previous slide. Since A is less than B to the D, this ratio is less than one; let's call this ratio R. So R, you'll notice, does depend on the three parameters A, B, and D, but R is a constant: it does not depend on N. So what is this sum? We're just summing up powers of this constant R, where R is less than one. What did we just learn?
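The two upshots can be checked numerically as well. In this sketch (not from the lecture), the bound 1/(1 - r) for r < 1, and the bound r^k times r/(r - 1) for r > 1, both hold no matter how many terms we sum:

```python
def geometric_sum(r, k):
    return sum(r**j for j in range(k + 1))

# Upshot 1: for r < 1, the sum never exceeds the constant 1/(1 - r),
# regardless of k. For r = 1/2 that constant is 2.
for k in (1, 10, 100):
    assert geometric_sum(0.5, k) <= 1 / (1 - 0.5)

# Upshot 2: for r > 1, the sum is at most a constant times the last
# term r^k; the constant is r/(r - 1), which is 2 when r = 2.
for k in (1, 10, 100):
    assert geometric_sum(2, k) <= 2**k * 2 / (2 - 1)

print("both upshots hold for every k tested")
```

The key point, as the lecture stresses, is that neither bounding constant depends on k.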
We just learned that any such sum is bounded above by a constant, independent of the number of terms that you sum up. So what does the expression star evaluate to? It evaluates to C, which is a constant, times N to the D, times another constant. Suppressing the product of these two constants in big-O notation, we can say that the expression star is upper bounded by big O of N to the D. And this makes precise our intuition that indeed the overall running time of the algorithm, in this type of recursion tree with decreasing work per level, is dominated by the root: the overall amount of work is only a constant factor larger than the work done merely at level zero of the tree. Let's move on to the final and most challenging part of the proof, the final case. In case three, we assume that A is bigger than B to the D. In conceptual terms, we're assuming the rate at which subproblems proliferate exceeds the rate at which we do less work per subproblem. These are recursion trees where the amount of work is increasing with each level, with the most work being done at the leaves. Once again, using the basic sums fact, we can make precise the hope that, in fact, we only have to worry about the leaves; we can throw away the rest of the work, losing only a constant factor. To see that, we will again denote the ratio between A and B to the D by R, and in this case R is bigger than one. So this sum is a sum of a bunch of powers of R, where R is bigger than one. What did we learn about that two slides ago, in the basic sums fact? We learned that such sums are dominated by the largest and last term of the sum. That is, we can bound the sum by a constant factor times its largest term. Therefore, we can simplify the expression star to the following. I'm going to write it in terms of big-O notation, and, as on the last slide, I'll use it to suppress two different constants.
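To see case two concretely, here is a sketch (again, not from the lecture) using the illustrative parameters a = 1, b = 2, d = 1, c = 1, so that r = a / b^d = 1/2 and the basic sums fact gives star at most c times n^d times 1/(1 - r) = 2n:

```python
import math

# The star expression: c * n^d * sum over j of (a / b^d)^j.
def star(n, a, b, c, d):
    levels = round(math.log(n, b))  # assumes n is a power of b
    return c * n**d * sum((a / b**d)**j for j in range(levels + 1))

# Case 2 (a < b^d): star stays within a constant factor of n^d,
# no matter how large n gets. Here the bound is 2n.
for n in (64, 1024, 2**20):
    assert star(n, 1, 2, 1, 1) <= 2 * n

print("star is O(n^d): root dominates")
```

No matter how many levels the tree has, the total never exceeds twice the root's work, which is exactly the "root dominates" claim.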
On the one hand, I'm going to suppress the constant C, which we inherited way back when from the original recurrence. On the other hand, I'm going to use it to also suppress the constant that comes from the basic sums fact. Ignoring those two constants, what do we have left? We have N to the D times the largest term of the sum. So what is the largest term of the sum? Well, it's the last one, so we plug in the biggest value of J that we're ever going to see. And what's the biggest value of J that we're ever going to see? It's just log base B of N. So we get the ratio A over B to the D, raised to the log base B of N power. Now, don't despair at how messy this looks; we can do some remarkable simplifications. What I want to do next is focus just on the one over B to the D, raised to the log base B of N, term. We can write that as B to the minus D log base B of N, and if I factor this exponent into two successive parts, I can write it as B raised to the log base B of N power, and only then raised to the minus D. And now, of course, what happens is that taking the logarithm of N base B, followed by raising B to that power, are inverse operations that cancel, leaving us just with N. So this term results in an N to the minus D. And now, remarkably, this N to the minus D is just going to cancel out with the N to the D, leaving us with merely A raised to the log base B of N. And thus, out of this crazy sea of letters, rises a formula we can actually understand. A to the log base B of N, if we step back and think for a minute, is actually a very natural quantity: it describes something about the recursion tree that we already knew was supposed to pop up in the analysis. I'll let you think through exactly what that is in the following quiz. So the correct answer to this quiz is the fourth response: A raised to the log base B of N is precisely the number of leaves of the recursion tree.
And remember, in our intuition for case three, recursion trees where the amount of work is increasing per level, we thought that perhaps the work would be dominated by the work done at the leaves, which is proportional to the number of leaves. So why is this the answer? Well, just remember what recursion trees look like. At level zero, we have a single node, and then with each level we have A times as many nodes as before. That is, with each level of the recursion tree, the number of nodes goes up by a factor of A. How long does this process go on? Well, it goes on until we reach the leaves. Recall that the input size starts at N up at the root, it gets divided by a factor of B each time, and the recursion terminates once we get down to one. So the leaves reside at precisely level log base B of N. Therefore, the number of leaves is just the branching factor A raised to the number of times we actually multiply by A, which is just the number of levels, log base B of N. Each time we go down a level, we increase the number of nodes by a factor of A, and we go down a level log base B of N times, leaving us with a number of leaves equal to A raised to the log base B of N. So what we've done is mathematically confirm, in a very cool way, our intuition about what case three should look like in the master method. We've proven that in case three, when A is bigger than B to the D, the running time is big O of the number of leaves in the recursion tree, just as the intuition predicted. But this leaves us with one final mystery. If you go back to the statement of the master method, we didn't say A to the log base B of N; in case three, it says the running time is N to the log base B of A. And not only that, we've used this case-three formula over and over again: to evaluate Gauss's recursive algorithm for integer multiplication, to evaluate Strassen's matrix multiplication algorithm, and so on. So what's the story?
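The leaf-counting argument can also be checked by direct simulation. This little sketch (the helper name is mine, not the lecture's) walks the recursion tree and counts leaves, assuming n is a power of b:

```python
# Count the leaves of a recursion tree with branching factor a,
# where the input size n shrinks by a factor of b at each level.
def count_leaves(n, a, b):
    if n <= 1:
        return 1                            # base case: this node is a leaf
    return a * count_leaves(n // b, a, b)   # each level multiplies the count by a

# With a = 4, b = 2, n = 1024: the leaves sit at level log_2(1024) = 10,
# so there are 4^10 of them.
print(count_leaves(1024, 4, 2))  # 1048576, i.e. 4 ** 10
```

Each recursive call multiplies the count by a, and the recursion goes log base b of n levels deep, which is exactly the a^(log_b n) formula from the quiz.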
How come we're not getting the same thing as in the statement of the master method? Well, there's a very simple explanation, which is that, believe it or not, A to the log base B of N and N to the log base B of A are exactly the same thing. This looks like the kind of mistake you'd make in freshman algebra, but actually, if you think about it, these are simply the same quantity. If you don't believe me, just take the logarithm base B of both sides, and you'll get the same thing on both sides. Now, you might well be wondering why I didn't just state in the master method that the running time in case three is the very sensible and meaningful expression A raised to the log base B of N, i.e., the number of leaves in the recursion tree. Well, it turns out that while the expression on the left-hand side is the more meaningful one conceptually, the right-hand side, N to the log base B of A, is the easier one to apply. Recall that when we worked through a bunch of examples of the master method, this right-hand side was super convenient when we evaluated the running times of algorithms, when we plugged in the numbers for A and B. In any case, whether you want to think about the running time in case three as proportional to the number of leaves in the tree, or as proportional to N to the log base B of A, we're done. We've proved it. That's case three, and that was the last one, so we're done with the master method. QED. So that was a lot of hard work proving the master method, and I would never expect someone to be able to regurgitate all of the details of this proof at, you know, something like a cocktail party (well, maybe except the nerdiest of all cocktail parties), but I do think there are a few high-level conceptual points of this proof that are worth remembering in the long term. We started by just writing down a recursion tree for the recursive algorithm, in a generic way, and, going level by level, we counted up the work done by the algorithm.
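If you'd rather check the identity numerically than take logs of both sides, here is a quick sketch (the particular values a = 3, b = 2, n = 1024 are just an illustration):

```python
import math

# Claim: a^(log_b n) == n^(log_b a).
# Taking log base b of both sides gives (log_b n)(log_b a) either way.
a, b, n = 3, 2, 1024
lhs = a ** math.log(n, b)   # 3^10, approximately 59049
rhs = n ** math.log(a, b)   # 1024^(log_2 3), the same quantity

assert math.isclose(lhs, rhs, rel_tol=1e-9)
print(lhs, rhs)
```

Both sides come out to roughly 59049, up to floating-point rounding, confirming the "freshman algebra" identity.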
And this part of the proof had nothing to do with how A and B relate to each other. Then we recognized that there are three fundamentally different types of recursion trees: those with the same amount of work per level, those where it increases with the level, and those where it decreases with the level. If you can remember that, you can even remember what the running times of the three cases should be. In the case where you do the same amount of work at each level, we know there's a logarithmic number of levels, and we know we do N to the D work at the root, so that gives us the running time in case one of N to the D log N. When the amount of work is decreasing with the levels, we now know that the root dominates: up to a constant, we can throw out the rest of the levels, and we know N to the D work gets done at the root, so that's the overall running time. And in the third case, where the work is increasing with the levels, the leaves dominate. The number of leaves is A raised to the log base B of N, and that's the same as N to the log base B of A, and that's proportional to the running time in case three of the master method.
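The three-case summary above can be packaged into a small helper. This is a hypothetical function of my own, not something from the lecture, that takes the parameters a, b, d of a recurrence T(n) = a T(n/b) + O(n^d) and reports which case applies along with the resulting bound:

```python
import math

# Hypothetical helper: dispatch on the three cases of the master method
# for a recurrence T(n) = a * T(n / b) + O(n^d).
def master_method(a, b, d):
    if a == b**d:
        return f"O(n^{d} log n)"             # case 1: same work at every level
    elif a < b**d:
        return f"O(n^{d})"                   # case 2: root dominates
    else:
        return f"O(n^{math.log(a, b):.3f})"  # case 3: leaves dominate, n^(log_b a)

print(master_method(2, 2, 1))  # merge sort: O(n^1 log n)
print(master_method(1, 2, 0))  # binary search: O(n^0 log n)
print(master_method(7, 2, 2))  # Strassen: O(n^2.807)
```

The comparison of a against b^d is precisely the "forces of good and evil" test the lecture keeps returning to.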