1 00:00:00,012 --> 00:00:04,460 The Hopcroft-Ullman guarantees states that if you implement a union-find data 2 00:00:04,460 --> 00:00:08,972 structure, using lazy unions, union by rank, and path compression, then you get 3 00:00:08,972 --> 00:00:12,763 a pretty amazing guarantee. Do any sequence you want, of m union and 4 00:00:12,763 --> 00:00:15,783 find operations. The total amount of work by this data 5 00:00:15,783 --> 00:00:19,172 structure is bounded above by big O(m) * log* of n. 6 00:00:19,172 --> 00:00:22,222 Where n is the number of objects in the data structure. 7 00:00:22,222 --> 00:00:26,339 So, on the one hand, I hope you're suitably impressed by these results. 8 00:00:26,339 --> 00:00:30,972 So, the proof of it that we gave on the last video is frankly, totally brilliant. 9 00:00:30,972 --> 00:00:33,843 And secondly, the guarantee itself is excellent. 10 00:00:33,843 --> 00:00:38,482 What, with log star of n being bounded above by 5 for any imaginable value of n. 11 00:00:38,482 --> 00:00:43,379 On the other hand, I do hope there's sort of, a little voice inside you, wondering 12 00:00:43,379 --> 00:00:45,980 if this log* bound could really be optimal. 13 00:00:45,980 --> 00:00:50,381 Can we do better, maybe by a sharper analsysis of the data structure we've 14 00:00:50,381 --> 00:00:55,312 been discussing, or perhaps by adding further injinuity, further optimizations 15 00:00:55,312 --> 00:00:59,027 To the data structure. For all we know, you might be able to get 16 00:00:59,027 --> 00:01:03,342 away with a linear amount of work. So O(m) work to process an arbitrary 17 00:01:03,342 --> 00:01:07,357 sequence of fines and unions. It turns out you can do better than the 18 00:01:07,357 --> 00:01:11,362 Hopcroft-Ullman guarantee. There is a sharper analysis of the exact 19 00:01:11,362 --> 00:01:14,822 same data structure. That analysis was given by Tarjan and 20 00:01:14,822 --> 00:01:20,223 here is the statement of his guarantee. So, the improved bound states that for an 21 00:01:20,223 --> 00:01:25,733 arbitrary sequence of m union and find operations, the total work done by this 22 00:01:25,733 --> 00:01:31,006 data structure union by rank and path compression is O(m) * alpha(n). 23 00:01:31,006 --> 00:01:37,120 Now, what you may ask is alpha(n), that's a function known of the inverse Ackermann 24 00:01:37,120 --> 00:01:40,209 function. So what, then, is the inverse Ackermann 25 00:01:40,209 --> 00:01:45,190 function? Well, the short answer is that it's a function that, incredibly, grows 26 00:01:45,190 --> 00:01:50,380 still more slowly, in fact, much, much, much more slowly, than the log* function 27 00:01:50,380 --> 00:01:52,992 we discussed in the Hopcroft-Ullman bound. 28 00:01:52,992 --> 00:01:57,655 The precise definition is nontrivial, and that is the subject of this video. 29 00:01:57,655 --> 00:02:00,632 In the next video, we'll prove this theorem. 30 00:02:00,632 --> 00:02:04,389 And explain how the Inverse Ackerman function arises in an optimal analysis of 31 00:02:04,389 --> 00:02:07,911 union by rank with path compression. So, I'll first define the Ackermann 32 00:02:07,911 --> 00:02:10,164 function and then I'll describe its inverse. 33 00:02:10,164 --> 00:02:13,942 one thing I should warn you, is you will see slight variations on these two 34 00:02:13,942 --> 00:02:17,570 definitions out there in literature. Different authors define them in ways 35 00:02:17,570 --> 00:02:20,988 convenient for their own purposes. That's also what I'm going to do here, 36 00:02:20,988 --> 00:02:24,119 I'm going to take one particular particular definition. 37 00:02:24,119 --> 00:02:27,258 Convenient for the union find data structure analysis. 38 00:02:27,258 --> 00:02:31,787 All of the variance that you'll see out there in the literature, exhibit roughly 39 00:02:31,787 --> 00:02:35,596 the same ridiculous growth. So the details aren't really important 40 00:02:35,596 --> 00:02:39,194 for the asymptotic analysis of data structures and algorithms. 41 00:02:39,194 --> 00:02:43,202 So you can think of the Ackermann function as having two arguments, which 42 00:02:43,202 --> 00:02:46,975 I'm going to denote by k and r. Both of these are integers, k should be 43 00:02:46,975 --> 00:02:52,157 at least 0, r should be be at least 1. The Ackermann function is defined 44 00:02:52,157 --> 00:02:55,237 recursively. The base case is when k = 0. 45 00:02:55,237 --> 00:02:59,887 For all r, we define A0(r) as the successor function. 46 00:02:59,887 --> 00:03:03,802 That is it takes its input r and it outputs r + 1. 47 00:03:03,802 --> 00:03:11,223 In general, when k is strictly positive, the argument r tells you how many times 48 00:03:11,223 --> 00:03:18,482 to apply the operator, the function, ak - 1 starting from the input r. 49 00:03:18,482 --> 00:03:26,109 That is, it's the r fold composition, of Ak - 1 and again applied to the argument 50 00:03:26,109 --> 00:03:29,317 r. So in some sense describing the Ackermann 51 00:03:29,317 --> 00:03:32,716 function isn't that hard. Right, I didn't really need, you know 52 00:03:32,716 --> 00:03:35,872 that much of a slide to actually write down its description. 53 00:03:35,872 --> 00:03:39,439 But getting a feel for what on earth this function is takes some work. 54 00:03:39,439 --> 00:03:43,327 So the first thing maybe just as a sanity check is note that it is a well defined 55 00:03:43,327 --> 00:03:45,777 function. So, you know, you're all programmers so 56 00:03:45,777 --> 00:03:49,302 you could easily imagine writing a program which took as input, k and r and 57 00:03:49,302 --> 00:03:52,402 at least, in principle, given enough time, computed this number. 58 00:03:52,402 --> 00:03:55,577 It would be a very simple recursive algorithm with a pseudo code just 59 00:03:55,577 --> 00:04:00,452 following the mathematic definition. So it is some function, given k and r 60 00:04:00,452 --> 00:04:04,696 there is some number that's the result of this definition. 61 00:04:04,696 --> 00:04:09,956 Now in the next sequence of quizzes, let's get a feel for exactly how this 62 00:04:09,956 --> 00:04:14,225 function behaves. And so let's start on the first quiz just 63 00:04:14,225 --> 00:04:19,641 by fixing k to be 1 and now viewing the Ackerman function with k fixed at 1 is a 64 00:04:19,641 --> 00:04:24,517 function purely of r. So, what function does A1 of r correspond 65 00:04:24,517 --> 00:04:28,191 to? Alright, so the correct answer is B. 66 00:04:28,191 --> 00:04:32,021 At least A1 is quite simple to understand. 67 00:04:32,021 --> 00:04:38,445 All it does is double the arguments. So if you give it r, it's going to spit 68 00:04:38,445 --> 00:04:42,387 out 2r. So why is that? Well a1(r) by definition, 69 00:04:42,387 --> 00:04:46,072 is just the function a0 applied to r, r times. 70 00:04:46,072 --> 00:04:51,923 A(0), by definition, is the successor function, it adds 1. So, if you apply the 71 00:04:51,923 --> 00:04:57,417 successor function r times, starting from r, you get 2r as a result. 72 00:04:57,417 --> 00:05:02,465 So let's now move from A1 to A2. So let's think about exactly the same 73 00:05:02,465 --> 00:05:05,728 question. Fixing k at 2 and regarding it as a 74 00:05:05,728 --> 00:05:08,722 function of r only. What function is it? 75 00:05:08,722 --> 00:05:23,929 The answer here is C, for every positive integer r, A2(r) = r * 2^r. 76 00:05:25,542 --> 00:05:31,042 And the reasoning is the same as before. So remember A2(r) just means you apply A1 77 00:05:31,042 --> 00:05:34,267 r times. In the last quiz, we discovered that A1 78 00:05:34,267 --> 00:05:39,592 was doubling so if you double r times, it's, it has the effect of multiplying by 79 00:05:39,592 --> 00:05:43,403 2^r factor. So let's take it up a notch yet again. 80 00:05:43,403 --> 00:05:49,240 Let's bump up k to 3, and let's think about A3 but let's not yet think about 81 00:05:49,240 --> 00:05:53,624 what A3 is as a function of r. Let's keep things simple. 82 00:05:53,624 --> 00:05:56,580 What is just a sub 3 evaluated when r = 2. 83 00:05:56,580 --> 00:06:04,564 So k = 3, r = 2, that's a number, what number is it? Okay so the correct answer 84 00:06:04,564 --> 00:06:09,043 is the third one. It's 2048 also known as 2^11. 85 00:06:10,891 --> 00:06:17,256 Let's see why, well A3(2) that just means we just apply A2(2). 86 00:06:18,906 --> 00:06:25,162 [SOUND] Now in the last quiz we evaluated not only A2(2) but A sub 2 of all 87 00:06:25,162 --> 00:06:28,492 integers of r. We discovered it was r * 2^r. 88 00:06:28,492 --> 00:06:34,725 So that means it's a simple matter to evaluate this application of A2 twice. 89 00:06:34,725 --> 00:06:38,921 First A2(2) that's just 2 * 2^2 also known as 8. 90 00:06:38,921 --> 00:06:43,271 Then A2(8) is going to be 8 * 2^8 also known as 2^11 or 2048. 91 00:06:43,271 --> 00:06:48,738 So what about in general? How does the function A3 behave as a function of a 92 00:06:48,738 --> 00:06:53,949 general parameter r? When we answer this question, we're going to see for the 93 00:06:53,949 --> 00:06:58,904 first time, the ridiculously explosive growth that the Ackerman function 94 00:06:58,904 --> 00:07:04,632 exhibits and this will just be the tip of the iceberg, right? This is just when. 95 00:07:04,632 --> 00:07:08,491 k = 3. So by definition, A3(r) you start from r. 96 00:07:08,491 --> 00:07:13,308 You apply the function A2 r times. So let's remind ourselves what the 97 00:07:13,308 --> 00:07:14,610 function A2. is. 98 00:07:14,610 --> 00:07:18,175 We've computed that exactly. That's r * 2^r. 99 00:07:18,175 --> 00:07:22,732 So to make our lives simple, let's just. Think about the 2 to the r part. 100 00:07:22,732 --> 00:07:25,942 So in the real function A2 you also multiply by r. 101 00:07:25,942 --> 00:07:30,676 Lets just think about the 2^r. So imagine you applied that function over 102 00:07:30,676 --> 00:07:35,019 and over and over again, what would you get? Well now you get a tower of 2's, 103 00:07:35,019 --> 00:07:39,048 where the height of the tower is the number of times that you apply that 104 00:07:39,048 --> 00:07:41,918 function. But if you apply a once, you get 2 to the 105 00:07:41,918 --> 00:07:44,121 r. If you apply it twice, you're going to 106 00:07:44,121 --> 00:07:49,152 get 2^2^r. 3 * 2^2^r and so on. 107 00:07:49,152 --> 00:07:54,984 So, our application of A2 gives you something that's even bigger than a tower 108 00:07:54,984 --> 00:07:58,220 of r twos. So let's move on to A4 and this is the 109 00:07:58,220 --> 00:08:03,949 point at which personally my brain begins to hurt, but let's just push a little bit 110 00:08:03,949 --> 00:08:08,302 further, so that we appreciate the ridiculous growth of the Ackerman 111 00:08:08,302 --> 00:08:12,119 function. And as with A3 let's punt for the moment 112 00:08:12,119 --> 00:08:15,799 on understanding the dependence for general r. 113 00:08:15,799 --> 00:08:20,177 Let's just figure out what A4(2) is. So, k = 4, r = 2. 114 00:08:20,177 --> 00:08:25,732 Well, so this by definition you just take 2 and you apply A3. 115 00:08:25,732 --> 00:08:31,179 We computed A3(2) in the, previous quiz. We discovered that was 2,048. 116 00:08:31,179 --> 00:08:34,962 So that's the result of the first application of a 3. 117 00:08:34,962 --> 00:08:38,396 And now we find ourselves applying A3 to 2,048. 118 00:08:38,396 --> 00:08:43,625 So in effect, r now, is 2,048. And remember, we concluded the last slide 119 00:08:43,625 --> 00:08:49,434 by saying, well A3 whatever it is, it's at least as big, as a tower of r 2's. 120 00:08:49,434 --> 00:08:53,647 And here r is 2,000 2,048. So this of course is now the land of 121 00:08:53,647 --> 00:08:56,857 truly ridiculous numbers that we're dealing with. 122 00:08:56,857 --> 00:09:01,692 I encourage you to take a calculator or computer and see how big a tower of 2's 123 00:09:01,692 --> 00:09:05,517 you can even compute. And it's not going to be anywhere close 124 00:09:05,517 --> 00:09:09,492 to 2,000, I can promise you that. And hey, that's just A4(2). 125 00:09:09,492 --> 00:09:14,762 What about the function A4 for a general value of r? Well, of course in general 126 00:09:14,762 --> 00:09:18,762 A4(r) is going to be r applications of A3 starting at r. 127 00:09:18,762 --> 00:09:24,437 Now in the last slide we argued that A3. This function is bounded below by a tower 128 00:09:24,437 --> 00:09:28,087 function. So A3 takes in some input, some integer, 129 00:09:28,087 --> 00:09:32,237 and it spits out some tower with height equal to that integer. 130 00:09:32,237 --> 00:09:37,864 So what happens when you apply that tower function A3 over and over again? Now you 131 00:09:37,864 --> 00:09:40,872 get what's called an iterated tower function. 132 00:09:40,872 --> 00:09:46,149 So when you apply the function A3 for the first time to r. It's going to spit out a 133 00:09:46,149 --> 00:09:49,642 number which is bounded below, by a tower of height r. 134 00:09:49,642 --> 00:09:53,789 Tower of 2s, r of them. Now when you apply a sub 3 a second time, 135 00:09:53,789 --> 00:09:58,547 it's output is going to be a tower of 2 whose height, is lower bounded by, a 136 00:09:58,547 --> 00:10:03,127 tower of 2s of height R, and so on. Every time you apply A3, you're just 137 00:10:03,127 --> 00:10:07,922 going to iterate, this tower function. You will sometimes see the iterated tower 138 00:10:07,922 --> 00:10:10,462 function referred to as a wowzer function. 139 00:10:10,462 --> 00:10:14,532 You probably think I'm pulling your leg, but I'm not kidding you. 140 00:10:14,532 --> 00:10:17,931 You can look it up. So that is the Ackermann function, and a 141 00:10:17,931 --> 00:10:22,351 bunch of examples to appreciate just how ridiculously quickly it is growing. 142 00:10:22,351 --> 00:10:24,957 So now let's move on, to define it's inverse. 143 00:10:24,957 --> 00:10:28,155 Now the Ackermann function has two parameters, k and n, 144 00:10:28,155 --> 00:10:31,239 k and r, excuse me. So to define an inverse, I'm going to 145 00:10:31,239 --> 00:10:35,162 fix, one of those. Specifically, I'm going to fix = 2. 146 00:10:35,162 --> 00:10:40,210 So the inverse Ackermann function is going to be denoted by alpha, for 147 00:10:40,210 --> 00:10:44,959 simplicity I'll define it only for values of n that are at least 4. 148 00:10:44,959 --> 00:10:50,774 And it's defined, for a given value of n, as the smallest k, so that, if you apply 149 00:10:50,774 --> 00:10:54,502 Ak to 2, to r = 2, then the result is at least, n. 150 00:10:55,607 --> 00:10:59,582 So again, it's simple enough to kind of write down a definition like this. 151 00:10:59,582 --> 00:11:03,702 But you really have to plug in some concrete numbers to get a feel for what's 152 00:11:03,702 --> 00:11:06,512 going on. So, let's just write out, you know wat 153 00:11:06,512 --> 00:11:10,587 are the values of n that will give you alpha(n) = 1, that will give you alpha(n) 154 00:11:10,587 --> 00:11:13,822 = 2, alpha(n) = 3, and so on. Okay, so for starters alpha(4) is going 155 00:11:13,822 --> 00:11:17,797 to be equal to 1. Right, so if you applied a 0 2 2, that's 156 00:11:17,797 --> 00:11:22,947 the successor function, so you only get 3, so that's not at least as big as 4. 157 00:11:22,947 --> 00:11:28,622 On the other hand, if you apply A1 to 2, that's the doubling function, so if you 158 00:11:28,622 --> 00:11:33,122 apply it to 2 you get 4. So, A1 does give you a number which is at 159 00:11:33,122 --> 00:11:37,477 least as big as n1 and is 4. So next, I claim that the values of n, 160 00:11:37,477 --> 00:11:41,392 for which alpha of n=2, are the values, 5, 6, 7, and 8. 161 00:11:41,392 --> 00:11:47,062 Why is that true? Well, if you start from 2, and you apply A1, the doubling 162 00:11:47,062 --> 00:11:53,425 function, you only get 4, so A1 is not sufficient to achieve these values of n. 163 00:11:53,425 --> 00:11:59,229 On the other hand, if you apply A2 to 2. Remember, that's the function which given 164 00:11:59,229 --> 00:12:02,664 an r, outputs r * 2^r. So, when you apply it to 2. 165 00:12:02,664 --> 00:12:07,803 You get 2 * 2^2 also known as 8. So A2 is sufficient to map 2 to a number 166 00:12:07,803 --> 00:12:11,620 at least as big as 8. By the same reasoning, recall that we 167 00:12:11,620 --> 00:12:14,987 computed A3(2) and we computed that to be 2,048. 168 00:12:14,987 --> 00:12:19,678 So therefore, for all of the bigger values of n, starting at 9, and going all 169 00:12:19,678 --> 00:12:24,258 the way up to 2,048, their alpha value is going to be equal to 3 because that is 170 00:12:24,258 --> 00:12:29,710 the smallest value of k such that a sub k of 2 is at least those values of n. 171 00:12:29,710 --> 00:12:34,077 And then, things start getting a little bit ridiculous. 172 00:12:34,077 --> 00:12:40,296 So, remember we also lower bounded A4(2). We said that's at least a, tower of 2's, 173 00:12:40,296 --> 00:12:44,207 of height 2048. So for any n up to that value, up to a 174 00:12:44,207 --> 00:12:49,932 tower of 2s of height 2048, alpha, of those n is going to be equal to 4. 175 00:12:49,932 --> 00:12:54,587 This equation implies that the inverse Ackermann Function value of any 176 00:12:54,587 --> 00:12:59,538 imaginable number is at most 4 and I don't know about, I don't know about you 177 00:12:59,538 --> 00:13:04,708 but my brain is already hurting too much to think about which are the values of n 178 00:13:04,708 --> 00:13:09,299 such that alpha is equal to 5. So as a point of contrast, let's do the 179 00:13:09,299 --> 00:13:13,551 same experiment for the log star function, so that we get some 180 00:13:13,551 --> 00:13:18,752 appreciation about how as glacially growing as log star may be, the inverse 181 00:13:18,752 --> 00:13:22,902 Ackermann function is growing still qualitatively slower. 182 00:13:22,902 --> 00:13:26,857 To the log* function, remember, it's the iterated logarithm. 183 00:13:26,857 --> 00:13:31,779 So, it's, starting from n, how many times you needed to hit log in the calculator, 184 00:13:31,779 --> 00:13:34,785 let's say base 2, before the result drops below 1. 185 00:13:34,785 --> 00:13:39,138 So, when this log* of n can be equal to 1, well that's when n is going to be 186 00:13:39,138 --> 00:13:42,194 equal to 2. Then, you hit log once and you drop from 187 00:13:42,194 --> 00:13:46,222 2 to 1 and you're done. If n = 3, or 4, then you need more than 188 00:13:46,222 --> 00:13:50,600 one application of logarithm to drop below 1, but you only need 2 189 00:13:50,600 --> 00:13:54,369 applications, so log* (n) is going to be 2, for n = 3 and 4. 190 00:13:54,369 --> 00:13:59,581 Similarly, for n anywhere between 5, and 16, log* (n) is going to be equal to 3, 191 00:13:59,581 --> 00:14:02,592 right? There would be star from 16, that's 2^4. 192 00:14:02,592 --> 00:14:07,601 So if you apply log once you get 4, a second time you get 2, a third time you 193 00:14:07,601 --> 00:14:12,146 get 1. And you get 1 by analogy the values of n, 194 00:14:12,146 --> 00:14:20,062 for which log*(n) = 4 are going to be starting at 17, and going up to 2^16 also 195 00:14:20,062 --> 00:14:25,604 known as 65,536. So lets just go one more step, for which 196 00:14:25,604 --> 00:14:30,314 n is log*(n) = 5. Well obviously we started at 65,537, and 197 00:14:30,314 --> 00:14:35,753 we end at 2 raised, to the largest number, of the previous block so 2^65536. 198 00:14:35,753 --> 00:14:40,212 So, these numbers are impressive enough on the right hand side. 199 00:14:40,212 --> 00:14:45,448 There is no imaginable number of importance, for which log* is going to be 200 00:14:45,448 --> 00:14:48,680 bigger than 5. Looking at the left and the right hand 201 00:14:48,680 --> 00:14:52,941 sides though, we see that there really is a qualitative difference, between the 202 00:14:52,941 --> 00:14:56,945 rate of growth of these 2 functions. With the inverse Ackermann function in 203 00:14:56,945 --> 00:15:00,054 fact, growing much, much slower, than the log* function. 204 00:15:00,054 --> 00:15:04,266 So perhaps the easiest way to see that, is to look at the right hand side, and on 205 00:15:04,266 --> 00:15:08,478 each of the lines in the right hand side, look at the largest value of n, so that 206 00:15:08,478 --> 00:15:11,778 log* is a given number. So for example, on the third line, the 207 00:15:11,778 --> 00:15:15,570 largest n, such that log* of n = 3 is a tower of 2 of height 3. 208 00:15:15,570 --> 00:15:20,548 2^2^2 also known as 16. Similarly on the fourth and fifth line, 209 00:15:20,548 --> 00:15:26,233 the largest n that gives value of log* = 4 and 5 are tower of 2 of height 4 and 210 00:15:26,233 --> 00:15:30,477 the tower of 2 of height 5. On the other hand, look at the fourth 211 00:15:30,477 --> 00:15:34,909 line on the left hand side. So already when we ask, about the largest 212 00:15:34,909 --> 00:15:39,964 values of n that have inverse Ackermann value 4, we're talking about towers of 213 00:15:39,964 --> 00:15:41,989 twos. In fact, of height 2048. 214 00:15:41,989 --> 00:15:47,773 So that is, the n which give us alpha(n) = 4, we would have to write down 2048 215 00:15:47,773 --> 00:15:53,255 lines on the right-hand side before we had values of n that were equally big. 216 00:15:53,255 --> 00:15:57,982 This indicates how the log* function, while growing glacially indeed, is 217 00:15:57,982 --> 00:16:01,757 nevertheless moving at light speed compared to the inverse Ackermann 218 00:16:01,757 --> 00:16:01,757 function.