The Hopcroft-Ullman guarantee states that if you implement a union-find data structure using lazy unions, union by rank, and path compression, then you get a pretty amazing guarantee. Take any sequence you want of m union and find operations. The total amount of work done by this data structure is bounded above by O(m * log* n), where n is the number of objects in the data structure. So, on the one hand, I hope you're suitably impressed by this result. The proof we gave in the last video is, frankly, totally brilliant. And the guarantee itself is excellent, with log* n bounded above by 5 for any imaginable value of n. On the other hand, I do hope there's a little voice inside you wondering whether this log* bound could really be optimal. Can we do better, maybe by a sharper analysis of the data structure we've been discussing, or perhaps by adding further ingenuity, further optimizations, to the data structure? For all we know, you might be able to get away with a linear amount of work, that is, O(m) work to process an arbitrary sequence of finds and unions.

It turns out you can do better than the Hopcroft-Ullman guarantee. There is a sharper analysis of the exact same data structure. That analysis was given by Tarjan, and here is the statement of his guarantee. The improved bound states that for an arbitrary sequence of m union and find operations, the total work done by this data structure, with union by rank and path compression, is O(m * alpha(n)). Now what, you may ask, is alpha(n)? That's a function known as the inverse Ackermann function. So what, then, is the inverse Ackermann function? Well, the short answer is that it's a function that, incredibly, grows still more slowly, in fact much, much, much more slowly, than the log* function we discussed in the Hopcroft-Ullman bound. The precise definition is nontrivial, and that is the subject of this video. In the next video, we'll prove this theorem and explain how the inverse Ackermann function arises in an optimal analysis of union by rank with path compression.

So, I'll first define the Ackermann function, and then I'll describe its inverse. One thing I should warn you about: you will see slight variations on these two definitions out there in the literature. Different authors define them in ways convenient for their own purposes. That's also what I'm going to do here; I'm going to take one particular definition, convenient for the union-find data structure analysis. All of the variants that you'll see out there in the literature exhibit roughly the same ridiculous growth, so the details aren't really important for the asymptotic analysis of data structures and algorithms. You can think of the Ackermann function as having two arguments, which I'll denote by k and r. Both are integers; k should be at least 0 and r should be at least 1. The Ackermann function is defined recursively. The base case is when k = 0: for all r, we define A0(r) as the successor function, that is, it takes its input r and outputs r + 1. In general, when k is strictly positive, the argument r tells you how many times to apply the function A(k-1), starting from the input r. That is, Ak(r) is the r-fold composition of A(k-1), applied to the argument r. So in some sense, describing the Ackermann function isn't that hard; I didn't really need that much of a slide to actually write down its description.
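Since the definition is so short, here is a minimal Python sketch of this particular variant of the Ackermann function, just transcribing the recursive definition above (the function name A and the loop structure are my choices, not the lecture's):

```python
def A(k, r):
    """The lecture's Ackermann variant: A0(r) = r + 1, and for k >= 1,
    Ak(r) is the r-fold composition of A(k-1), applied to r."""
    if k == 0:
        return r + 1              # base case: the successor function
    result = r
    for _ in range(r):            # apply A(k-1) exactly r times, starting from r
        result = A(k - 1, result)
    return result
```

As we'll see, this only finishes in any reasonable amount of time for very small values of k.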
But getting a feel for what on earth this function is takes some work. The first thing to note, just as a sanity check, is that it is a well-defined function. You're all programmers, so you could easily write a program, like the sketch above, which takes as input k and r and, at least in principle, given enough time, computes this number. It would be a very simple recursive algorithm, with pseudocode just following the mathematical definition. So it is some function: given k and r, there is some number that's the result of this definition. Now, in the next sequence of quizzes, let's get a feel for exactly how this function behaves.

Let's start on the first quiz by fixing k to be 1 and viewing the Ackermann function, with k fixed at 1, as a function purely of r. So, what function does A1(r) correspond to? All right, so the correct answer is B. A1, at least, is quite simple to understand: all it does is double its argument. If you give it r, it's going to spit out 2r. Why is that? Well, A1(r), by definition, is just the function A0 applied to r, r times. A0, by definition, is the successor function; it adds 1. So if you apply the successor function r times, starting from r, you get 2r as a result.

Let's now move from A1 to A2 and think about exactly the same question: fixing k at 2 and regarding the function as a function of r only, what function is it? The answer here is C: for every positive integer r, A2(r) = r * 2^r. The reasoning is the same as before. Remember, A2(r) just means you apply A1 r times. In the last quiz, we discovered that A1 is doubling, so if you double r times, it has the effect of multiplying by a 2^r factor.

So let's take it up a notch yet again. Let's bump up k to 3, but let's not yet think about what A3 is as a function of r. Let's keep things simple: what is A3 evaluated at r = 2? So k = 3, r = 2; that's a number. What number is it? Okay, so the correct answer is the third one: it's 2048, also known as 2^11. Let's see why. Well, A3(2) just means we apply A2 twice, starting from 2. Now, in the last quiz, we evaluated not just A2(2) but A2 for all integers r; we discovered it was r * 2^r. So it's a simple matter to evaluate this double application of A2. First, A2(2) is just 2 * 2^2, also known as 8. Then A2(8) is going to be 8 * 2^8, also known as 2^11, or 2048.

So what about in general? How does the function A3 behave as a function of a general parameter r? When we answer this question, we're going to see, for the first time, the ridiculously explosive growth that the Ackermann function exhibits, and this is just the tip of the iceberg: this is just k = 3. By definition, A3(r) says you start from r and apply the function A2 r times. Let's remind ourselves what the function A2 is; we've computed that exactly, it's r * 2^r. To make our lives simple, let's just think about the 2^r part; in the real function A2, you also multiply by r. So imagine you applied the function 2^r over and over and over again. What would you get? You'd get a tower of 2's, where the height of the tower is the number of times that you apply the function. If you apply it once, you get 2^r. If you apply it twice, you get 2^2^r. Three times, 2^2^2^r, and so on. So r applications of A2 give you something that's even bigger than a tower of r 2's.
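If you want to confirm the quiz answers mechanically, here's a quick sanity check using the A(k, r) sketch from above (again, my code, not the lecture's):

```python
# Sanity checks for the quiz answers, using the A(k, r) sketch above.
assert all(A(1, r) == 2 * r for r in range(1, 10))       # A1 doubles its argument
assert all(A(2, r) == r * 2 ** r for r in range(1, 10))  # A2(r) = r * 2^r
assert A(3, 2) == 2048                                   # A3(2) = 2^11
# Don't try A(4, 2): the answer has vastly more digits than there are
# atoms in the known universe.
```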
So let's move on to A4, and this is the point at which, personally, my brain begins to hurt. But let's push a little bit further, so that we appreciate the ridiculous growth of the Ackermann function. As with A3, let's punt for the moment on understanding the dependence on general r and just figure out what A4(2) is. So k = 4, r = 2. Well, by definition, you just take 2 and apply A3 twice. We computed A3(2) in the previous quiz; we discovered that was 2,048. So that's the result of the first application of A3. And now we find ourselves applying A3 to 2,048; in effect, r is now 2,048. And remember, we concluded the last slide by saying that A3, whatever it is, is at least as big as a tower of r 2's. Here, r is 2,048. So this, of course, is the land of truly ridiculous numbers. I encourage you to take a calculator or computer and see how tall a tower of 2's you can even compute; it's not going to be anywhere close to 2,048, I can promise you that. And hey, that's just A4(2).

What about the function A4 for a general value of r? Well, in general, A4(r) is going to be r applications of A3, starting at r. Now, on the last slide, we argued that A3 is bounded below by a tower function: A3 takes in some input, some integer, and it spits out some tower with height equal to that integer. So what happens when you apply that tower function over and over again? You get what's called an iterated tower function. When you apply the function A3 for the first time to r, it's going to spit out a number which is bounded below by a tower of height r; a tower of 2's, r of them. When you apply A3 a second time, its output is going to be a tower of 2's whose height is lower bounded by a tower of 2's of height r, and so on. Every time you apply A3, you just iterate this tower function. You will sometimes see the iterated tower function referred to as a "wowzer" function. You probably think I'm pulling your leg, but I'm not kidding; you can look it up.

So that is the Ackermann function, and a bunch of examples to appreciate just how ridiculously quickly it is growing. Now let's move on to define its inverse. The Ackermann function has two parameters, k and r. To define an inverse, I'm going to fix one of those; specifically, I'm going to fix r = 2. The inverse Ackermann function is going to be denoted by alpha, and for simplicity I'll define it only for values of n that are at least 4. It's defined, for a given value of n, as the smallest k such that, if you apply Ak to r = 2, the result is at least n. Again, it's simple enough to write down a definition like this, but you really have to plug in some concrete numbers to get a feel for what's going on. So let's just write out what are the values of n that give you alpha(n) = 1, alpha(n) = 2, alpha(n) = 3, and so on. For starters, alpha(4) is going to be equal to 1. Right, so if you apply A0 to 2, that's the successor function, so you only get 3, which is not at least as big as 4. On the other hand, if you apply A1 to 2, that's the doubling function, so you get 4. So A1 does give you a number which is at least as big as n = 4. Next, I claim that the values of n for which alpha(n) = 2 are the values 5, 6, 7, and 8. Why is that true? We'll reason it out in a moment; first, here's a mechanical check.
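This is a minimal Python sketch of the definition of alpha, reusing the A(k, r) sketch from earlier (the name alpha and the cap at n <= 2048 are my choices; A4(2) is far too large to ever evaluate directly, so alpha values above 3 are out of reach this way):

```python
def alpha(n):
    """Inverse Ackermann, per the lecture's definition: the smallest k
    such that Ak(2) >= n.  Capped at n <= 2048 (i.e., alpha(n) <= 3),
    since computing A(4, 2) is hopeless."""
    assert 4 <= n <= 2048
    k = 0
    while A(k, 2) < n:   # A(0,2) = 3, A(1,2) = 4, A(2,2) = 8, A(3,2) = 2048
        k += 1
    return k

assert alpha(4) == 1
assert all(alpha(n) == 2 for n in range(5, 9))    # n = 5, 6, 7, 8
assert alpha(9) == 3 and alpha(2048) == 3
```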
Well, if you start from 2 and you apply A1, the doubling function, you only get 4, so A1 is not sufficient to achieve these values of n. On the other hand, consider applying A2 to 2. Remember, that's the function which, given r, outputs r * 2^r. So when you apply it to 2, you get 2 * 2^2, also known as 8. So A2 is sufficient to map 2 to a number at least as big as 8. By the same reasoning, recall that we computed A3(2) to be 2,048. Therefore, for all of the bigger values of n, starting at 9 and going all the way up to 2,048, the alpha value is going to be equal to 3, because 3 is the smallest value of k such that Ak(2) is at least those values of n. And then things start getting a little bit ridiculous. Remember, we also lower bounded A4(2); we said that's at least a tower of 2's of height 2,048. So for any n up to that value, up to a tower of 2's of height 2,048, alpha(n) is going to be equal to 4. This implies that the inverse Ackermann function value of any imaginable number is at most 4, and I don't know about you, but my brain is already hurting too much to think about which are the values of n such that alpha equals 5.

As a point of contrast, let's do the same experiment for the log* function, so that we get some appreciation of how, as glacially growing as log* may be, the inverse Ackermann function grows still qualitatively slower. The log* function, remember, is the iterated logarithm: starting from n, it's how many times you need to hit log on the calculator, let's say base 2, before the result drops to 1 or below. So when is log*(n) equal to 1? Well, that's when n is equal to 2: you hit log once, you drop from 2 to 1, and you're done. If n = 3 or 4, then one application of the logarithm doesn't get you down to 1, but two applications do, so log*(n) is going to be 2 for n = 3 and 4. Similarly, for n anywhere between 5 and 16, log*(n) is going to be equal to 3. Right? Take log* of 16, that's 2^4: apply log once, you get 4; a second time, you get 2; a third time, you get 1. By analogy, the values of n for which log*(n) = 4 start at 17 and go up to 2^16, also known as 65,536. Let's go just one more step: for which n is log*(n) = 5? Well, we start at 65,537, and we end at 2 raised to the largest number of the previous block, so 2^65536.

So these numbers on the right-hand side are impressive enough; there is no imaginable number of importance for which log* is going to be bigger than 5. Looking at the left- and right-hand sides, though, we see that there really is a qualitative difference between the rates of growth of these two functions, with the inverse Ackermann function in fact growing much, much slower than the log* function. Perhaps the easiest way to see that is to look at the right-hand side and, on each line, look at the largest value of n such that log* is a given number. For example, on the third line, the largest n such that log*(n) = 3 is a tower of 2's of height 3, that is, 2^2^2, also known as 16. Similarly, on the fourth and fifth lines, the largest n giving log* values of 4 and 5 are a tower of 2's of height 4 and a tower of 2's of height 5. On the other hand, look at the fourth line on the left-hand side.
So already when we ask about the largest values of n that have inverse Ackermann value 4, we're talking about towers of 2's, in fact of height 2,048. That is, for the n which give us alpha(n) = 4, we would have to write down 2,048 lines on the right-hand side before we had values of n that were equally big. This indicates how the log* function, while growing glacially indeed, is nevertheless moving at light speed compared to the inverse Ackermann function.
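Finally, if you want to play with this comparison yourself, here's the same kind of sketch for log*, using floating-point log2, which is fine away from exact boundary values (the name log_star is my choice):

```python
import math

def log_star(n):
    """Iterated logarithm, base 2: the number of times you must apply
    log2, starting from n, before the result drops to 1 or below."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

assert log_star(2) == 1
assert log_star(3) == 2 and log_star(4) == 2
assert log_star(5) == 3 and log_star(16) == 3
assert log_star(17) == 4 and log_star(2 ** 16) == 4
assert log_star(2 ** 65536) == 5   # unlike A(4, 2), this is actually computable
```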