1
00:00:00,012 --> 00:00:04,460
The Hopcroft-Ullman guarantees states 
that if you implement a union-find data 

2
00:00:04,460 --> 00:00:08,972
structure, using lazy unions, union by 
rank, and path compression, then you get 

3
00:00:08,972 --> 00:00:12,763
a pretty amazing guarantee. 
Do any sequence you want, of m union and 

4
00:00:12,763 --> 00:00:15,783
find operations. 
The total amount of work by this data 

5
00:00:15,783 --> 00:00:19,172
structure is bounded above by big O(m) * 
log* of n. 

6
00:00:19,172 --> 00:00:22,222
Where n is the number of objects in the 
data structure. 

7
00:00:22,222 --> 00:00:26,339
So, on the one hand, I hope you're 
suitably impressed by these results. 

8
00:00:26,339 --> 00:00:30,972
So, the proof of it that we gave on the 
last video is frankly, totally brilliant. 

9
00:00:30,972 --> 00:00:33,843
And secondly, the guarantee itself is 
excellent. 

10
00:00:33,843 --> 00:00:38,482
What, with log star of n being bounded 
above by 5 for any imaginable value of n. 

11
00:00:38,482 --> 00:00:43,379
On the other hand, I do hope there's sort 
of, a little voice inside you, wondering 

12
00:00:43,379 --> 00:00:45,980
if this log* bound could really be 
optimal. 

13
00:00:45,980 --> 00:00:50,381
Can we do better, maybe by a sharper 
analsysis of the data structure we've 

14
00:00:50,381 --> 00:00:55,312
been discussing, or perhaps by adding 
further injinuity, further optimizations 

15
00:00:55,312 --> 00:00:59,027
To the data structure. 
For all we know, you might be able to get 

16
00:00:59,027 --> 00:01:03,342
away with a linear amount of work. 
So O(m) work to process an arbitrary 

17
00:01:03,342 --> 00:01:07,357
sequence of fines and unions. 
It turns out you can do better than the 

18
00:01:07,357 --> 00:01:11,362
Hopcroft-Ullman guarantee. 
There is a sharper analysis of the exact 

19
00:01:11,362 --> 00:01:14,822
same data structure. 
That analysis was given by Tarjan and 

20
00:01:14,822 --> 00:01:20,223
here is the statement of his guarantee. 
So, the improved bound states that for an 

21
00:01:20,223 --> 00:01:25,733
arbitrary sequence of m union and find 
operations, the total work done by this 

22
00:01:25,733 --> 00:01:31,006
data structure union by rank and path 
compression is O(m) * alpha(n). 

23
00:01:31,006 --> 00:01:37,120
Now, what you may ask is alpha(n), that's 
a function known of the inverse Ackermann 

24
00:01:37,120 --> 00:01:40,209
function. 
So what, then, is the inverse Ackermann 

25
00:01:40,209 --> 00:01:45,190
function? Well, the short answer is that 
it's a function that, incredibly, grows 

26
00:01:45,190 --> 00:01:50,380
still more slowly, in fact, much, much, 
much more slowly, than the log* function 

27
00:01:50,380 --> 00:01:52,992
we discussed in the Hopcroft-Ullman 
bound. 

28
00:01:52,992 --> 00:01:57,655
The precise definition is nontrivial, and 
that is the subject of this video. 

29
00:01:57,655 --> 00:02:00,632
In the next video, we'll prove this 
theorem. 

30
00:02:00,632 --> 00:02:04,389
And explain how the Inverse Ackerman 
function arises in an optimal analysis of 

31
00:02:04,389 --> 00:02:07,911
union by rank with path compression. 
So, I'll first define the Ackermann 

32
00:02:07,911 --> 00:02:10,164
function and then I'll describe its 
inverse. 

33
00:02:10,164 --> 00:02:13,942
one thing I should warn you, is you will 
see slight variations on these two 

34
00:02:13,942 --> 00:02:17,570
definitions out there in literature. 
Different authors define them in ways 

35
00:02:17,570 --> 00:02:20,988
convenient for their own purposes. 
That's also what I'm going to do here, 

36
00:02:20,988 --> 00:02:24,119
I'm going to take one particular 
particular definition. 

37
00:02:24,119 --> 00:02:27,258
Convenient for the union find data 
structure analysis. 

38
00:02:27,258 --> 00:02:31,787
All of the variance that you'll see out 
there in the literature, exhibit roughly 

39
00:02:31,787 --> 00:02:35,596
the same ridiculous growth. 
So the details aren't really important 

40
00:02:35,596 --> 00:02:39,194
for the asymptotic analysis of data 
structures and algorithms. 

41
00:02:39,194 --> 00:02:43,202
So you can think of the Ackermann 
function as having two arguments, which 

42
00:02:43,202 --> 00:02:46,975
I'm going to denote by k and r. 
Both of these are integers, k should be 

43
00:02:46,975 --> 00:02:52,157
at least 0, r should be be at least 1. 
The Ackermann function is defined 

44
00:02:52,157 --> 00:02:55,237
recursively. 
The base case is when k = 0. 

45
00:02:55,237 --> 00:02:59,887
For all r, we define A0(r) as the 
successor function. 

46
00:02:59,887 --> 00:03:03,802
That is it takes its input r and it 
outputs r + 1. 

47
00:03:03,802 --> 00:03:11,223
In general, when k is strictly positive, 
the argument r tells you how many times 

48
00:03:11,223 --> 00:03:18,482
to apply the operator, the function, ak - 
1 starting from the input r. 

49
00:03:18,482 --> 00:03:26,109
That is, it's the r fold composition, of 
Ak - 1 and again applied to the argument 

50
00:03:26,109 --> 00:03:29,317
r. 
So in some sense describing the Ackermann 

51
00:03:29,317 --> 00:03:32,716
function isn't that hard. 
Right, I didn't really need, you know 

52
00:03:32,716 --> 00:03:35,872
that much of a slide to actually write 
down its description. 

53
00:03:35,872 --> 00:03:39,439
But getting a feel for what on earth this 
function is takes some work. 

54
00:03:39,439 --> 00:03:43,327
So the first thing maybe just as a sanity 
check is note that it is a well defined 

55
00:03:43,327 --> 00:03:45,777
function. 
So, you know, you're all programmers so 

56
00:03:45,777 --> 00:03:49,302
you could easily imagine writing a 
program which took as input, k and r and 

57
00:03:49,302 --> 00:03:52,402
at least, in principle, given enough 
time, computed this number. 

58
00:03:52,402 --> 00:03:55,577
It would be a very simple recursive 
algorithm with a pseudo code just 

59
00:03:55,577 --> 00:04:00,452
following the mathematic definition. 
So it is some function, given k and r 

60
00:04:00,452 --> 00:04:04,696
there is some number that's the result of 
this definition. 

61
00:04:04,696 --> 00:04:09,956
Now in the next sequence of quizzes, 
let's get a feel for exactly how this 

62
00:04:09,956 --> 00:04:14,225
function behaves. 
And so let's start on the first quiz just 

63
00:04:14,225 --> 00:04:19,641
by fixing k to be 1 and now viewing the 
Ackerman function with k fixed at 1 is a 

64
00:04:19,641 --> 00:04:24,517
function purely of r. 
So, what function does A1 of r correspond 

65
00:04:24,517 --> 00:04:28,191
to? 
Alright, so the correct answer is B. 

66
00:04:28,191 --> 00:04:32,021
At least A1 is quite simple to 
understand. 

67
00:04:32,021 --> 00:04:38,445
All it does is double the arguments. 
So if you give it r, it's going to spit 

68
00:04:38,445 --> 00:04:42,387
out 2r. 
So why is that? Well a1(r) by definition, 

69
00:04:42,387 --> 00:04:46,072
is just the function a0 applied to r, r 
times. 

70
00:04:46,072 --> 00:04:51,923
A(0), by definition, is the successor 
function, it adds 1. So, if you apply the 

71
00:04:51,923 --> 00:04:57,417
successor function r times, starting from 
r, you get 2r as a result. 

72
00:04:57,417 --> 00:05:02,465
So let's now move from A1 to A2. 
So let's think about exactly the same 

73
00:05:02,465 --> 00:05:05,728
question. 
Fixing k at 2 and regarding it as a 

74
00:05:05,728 --> 00:05:08,722
function of r only. 
What function is it? 

75
00:05:08,722 --> 00:05:23,929
The answer here is C, for every positive 
integer r, A2(r) = r * 2^r. 

76
00:05:25,542 --> 00:05:31,042
And the reasoning is the same as before. 
So remember A2(r) just means you apply A1 

77
00:05:31,042 --> 00:05:34,267
r times. 
In the last quiz, we discovered that A1 

78
00:05:34,267 --> 00:05:39,592
was doubling so if you double r times, 
it's, it has the effect of multiplying by 

79
00:05:39,592 --> 00:05:43,403
2^r factor. 
So let's take it up a notch yet again. 

80
00:05:43,403 --> 00:05:49,240
Let's bump up k to 3, and let's think 
about A3 but let's not yet think about 

81
00:05:49,240 --> 00:05:53,624
what A3 is as a function of r. 
Let's keep things simple. 

82
00:05:53,624 --> 00:05:56,580
What is just a sub 3 evaluated when r = 
2. 

83
00:05:56,580 --> 00:06:04,564
So k = 3, r = 2, that's a number, what 
number is it? Okay so the correct answer 

84
00:06:04,564 --> 00:06:09,043
is the third one. 
It's 2048 also known as 2^11. 

85
00:06:10,891 --> 00:06:17,256
Let's see why, well A3(2) that just means 
we just apply A2(2). 

86
00:06:18,906 --> 00:06:25,162
[SOUND] Now in the last quiz we evaluated 
not only A2(2) but A sub 2 of all 

87
00:06:25,162 --> 00:06:28,492
integers of r. 
We discovered it was r * 2^r. 

88
00:06:28,492 --> 00:06:34,725
So that means it's a simple matter to 
evaluate this application of A2 twice. 

89
00:06:34,725 --> 00:06:38,921
First A2(2) that's just 2 * 2^2 also 
known as 8. 

90
00:06:38,921 --> 00:06:43,271
Then A2(8) is going to be 8 * 2^8 also 
known as 2^11 or 2048. 

91
00:06:43,271 --> 00:06:48,738
So what about in general? How does the 
function A3 behave as a function of a 

92
00:06:48,738 --> 00:06:53,949
general parameter r? When we answer this 
question, we're going to see for the 

93
00:06:53,949 --> 00:06:58,904
first time, the ridiculously explosive 
growth that the Ackerman function 

94
00:06:58,904 --> 00:07:04,632
exhibits and this will just be the tip of 
the iceberg, right? This is just when. 

95
00:07:04,632 --> 00:07:08,491
k = 3. 
So by definition, A3(r) you start from r. 

96
00:07:08,491 --> 00:07:13,308
You apply the function A2 r times. 
So let's remind ourselves what the 

97
00:07:13,308 --> 00:07:14,610
function A2. 
is. 

98
00:07:14,610 --> 00:07:18,175
We've computed that exactly. 
That's r * 2^r. 

99
00:07:18,175 --> 00:07:22,732
So to make our lives simple, let's just. 
Think about the 2 to the r part. 

100
00:07:22,732 --> 00:07:25,942
So in the real function A2 you also 
multiply by r. 

101
00:07:25,942 --> 00:07:30,676
Lets just think about the 2^r. 
So imagine you applied that function over 

102
00:07:30,676 --> 00:07:35,019
and over and over again, what would you 
get? Well now you get a tower of 2's, 

103
00:07:35,019 --> 00:07:39,048
where the height of the tower is the 
number of times that you apply that 

104
00:07:39,048 --> 00:07:41,918
function. 
But if you apply a once, you get 2 to the 

105
00:07:41,918 --> 00:07:44,121
r. 
If you apply it twice, you're going to 

106
00:07:44,121 --> 00:07:49,152
get 2^2^r. 
3 * 2^2^r and so on. 

107
00:07:49,152 --> 00:07:54,984
So, our application of A2 gives you 
something that's even bigger than a tower 

108
00:07:54,984 --> 00:07:58,220
of r twos. 
So let's move on to A4 and this is the 

109
00:07:58,220 --> 00:08:03,949
point at which personally my brain begins 
to hurt, but let's just push a little bit 

110
00:08:03,949 --> 00:08:08,302
further, so that we appreciate the 
ridiculous growth of the Ackerman 

111
00:08:08,302 --> 00:08:12,119
function. 
And as with A3 let's punt for the moment 

112
00:08:12,119 --> 00:08:15,799
on understanding the dependence for 
general r. 

113
00:08:15,799 --> 00:08:20,177
Let's just figure out what A4(2) is. 
So, k = 4, r = 2. 

114
00:08:20,177 --> 00:08:25,732
Well, so this by definition you just take 
2 and you apply A3. 

115
00:08:25,732 --> 00:08:31,179
We computed A3(2) in the, previous quiz. 
We discovered that was 2,048. 

116
00:08:31,179 --> 00:08:34,962
So that's the result of the first 
application of a 3. 

117
00:08:34,962 --> 00:08:38,396
And now we find ourselves applying A3 to 
2,048. 

118
00:08:38,396 --> 00:08:43,625
So in effect, r now, is 2,048. 
And remember, we concluded the last slide 

119
00:08:43,625 --> 00:08:49,434
by saying, well A3 whatever it is, it's 
at least as big, as a tower of r 2's. 

120
00:08:49,434 --> 00:08:53,647
And here r is 2,000 2,048. 
So this of course is now the land of 

121
00:08:53,647 --> 00:08:56,857
truly ridiculous numbers that we're 
dealing with. 

122
00:08:56,857 --> 00:09:01,692
I encourage you to take a calculator or 
computer and see how big a tower of 2's 

123
00:09:01,692 --> 00:09:05,517
you can even compute. 
And it's not going to be anywhere close 

124
00:09:05,517 --> 00:09:09,492
to 2,000, I can promise you that. 
And hey, that's just A4(2). 

125
00:09:09,492 --> 00:09:14,762
What about the function A4 for a general 
value of r? Well, of course in general 

126
00:09:14,762 --> 00:09:18,762
A4(r) is going to be r applications of A3 
starting at r. 

127
00:09:18,762 --> 00:09:24,437
Now in the last slide we argued that A3. 
This function is bounded below by a tower 

128
00:09:24,437 --> 00:09:28,087
function. 
So A3 takes in some input, some integer, 

129
00:09:28,087 --> 00:09:32,237
and it spits out some tower with height 
equal to that integer. 

130
00:09:32,237 --> 00:09:37,864
So what happens when you apply that tower 
function A3 over and over again? Now you 

131
00:09:37,864 --> 00:09:40,872
get what's called an iterated tower 
function. 

132
00:09:40,872 --> 00:09:46,149
So when you apply the function A3 for the 
first time to r. It's going to spit out a 

133
00:09:46,149 --> 00:09:49,642
number which is bounded below, by a tower 
of height r. 

134
00:09:49,642 --> 00:09:53,789
Tower of 2s, r of them. 
Now when you apply a sub 3 a second time, 

135
00:09:53,789 --> 00:09:58,547
it's output is going to be a tower of 2 
whose height, is lower bounded by, a 

136
00:09:58,547 --> 00:10:03,127
tower of 2s of height R, and so on. 
Every time you apply A3, you're just 

137
00:10:03,127 --> 00:10:07,922
going to iterate, this tower function. 
You will sometimes see the iterated tower 

138
00:10:07,922 --> 00:10:10,462
function referred to as a wowzer 
function. 

139
00:10:10,462 --> 00:10:14,532
You probably think I'm pulling your leg, 
but I'm not kidding you. 

140
00:10:14,532 --> 00:10:17,931
You can look it up. 
So that is the Ackermann function, and a 

141
00:10:17,931 --> 00:10:22,351
bunch of examples to appreciate just how 
ridiculously quickly it is growing. 

142
00:10:22,351 --> 00:10:24,957
So now let's move on, to define it's 
inverse. 

143
00:10:24,957 --> 00:10:28,155
Now the Ackermann function has two 
parameters, k and n, 

144
00:10:28,155 --> 00:10:31,239
k and r, excuse me. 
So to define an inverse, I'm going to 

145
00:10:31,239 --> 00:10:35,162
fix, one of those. 
Specifically, I'm going to fix = 2. 

146
00:10:35,162 --> 00:10:40,210
So the inverse Ackermann function is 
going to be denoted by alpha, for 

147
00:10:40,210 --> 00:10:44,959
simplicity I'll define it only for values 
of n that are at least 4. 

148
00:10:44,959 --> 00:10:50,774
And it's defined, for a given value of n, 
as the smallest k, so that, if you apply 

149
00:10:50,774 --> 00:10:54,502
Ak to 2, to r = 2, then the result is at 
least, n. 

150
00:10:55,607 --> 00:10:59,582
So again, it's simple enough to kind of 
write down a definition like this. 

151
00:10:59,582 --> 00:11:03,702
But you really have to plug in some 
concrete numbers to get a feel for what's 

152
00:11:03,702 --> 00:11:06,512
going on. 
So, let's just write out, you know wat 

153
00:11:06,512 --> 00:11:10,587
are the values of n that will give you 
alpha(n) = 1, that will give you alpha(n) 

154
00:11:10,587 --> 00:11:13,822
= 2, alpha(n) = 3, and so on. 
Okay, so for starters alpha(4) is going 

155
00:11:13,822 --> 00:11:17,797
to be equal to 1. 
Right, so if you applied a 0 2 2, that's 

156
00:11:17,797 --> 00:11:22,947
the successor function, so you only get 
3, so that's not at least as big as 4. 

157
00:11:22,947 --> 00:11:28,622
On the other hand, if you apply A1 to 2, 
that's the doubling function, so if you 

158
00:11:28,622 --> 00:11:33,122
apply it to 2 you get 4. 
So, A1 does give you a number which is at 

159
00:11:33,122 --> 00:11:37,477
least as big as n1 and is 4. 
So next, I claim that the values of n, 

160
00:11:37,477 --> 00:11:41,392
for which alpha of n=2, are the values, 
5, 6, 7, and 8. 

161
00:11:41,392 --> 00:11:47,062
Why is that true? Well, if you start from 
2, and you apply A1, the doubling 

162
00:11:47,062 --> 00:11:53,425
function, you only get 4, so A1 is not 
sufficient to achieve these values of n. 

163
00:11:53,425 --> 00:11:59,229
On the other hand, if you apply A2 to 2. 
Remember, that's the function which given 

164
00:11:59,229 --> 00:12:02,664
an r, outputs r * 2^r. 
So, when you apply it to 2. 

165
00:12:02,664 --> 00:12:07,803
You get 2 * 2^2 also known as 8. 
So A2 is sufficient to map 2 to a number 

166
00:12:07,803 --> 00:12:11,620
at least as big as 8. 
By the same reasoning, recall that we 

167
00:12:11,620 --> 00:12:14,987
computed A3(2) and we computed that to be 
2,048. 

168
00:12:14,987 --> 00:12:19,678
So therefore, for all of the bigger 
values of n, starting at 9, and going all 

169
00:12:19,678 --> 00:12:24,258
the way up to 2,048, their alpha value is 
going to be equal to 3 because that is 

170
00:12:24,258 --> 00:12:29,710
the smallest value of k such that a sub k 
of 2 is at least those values of n. 

171
00:12:29,710 --> 00:12:34,077
And then, things start getting a little 
bit ridiculous. 

172
00:12:34,077 --> 00:12:40,296
So, remember we also lower bounded A4(2). 
We said that's at least a, tower of 2's, 

173
00:12:40,296 --> 00:12:44,207
of height 2048. 
So for any n up to that value, up to a 

174
00:12:44,207 --> 00:12:49,932
tower of 2s of height 2048, alpha, of 
those n is going to be equal to 4. 

175
00:12:49,932 --> 00:12:54,587
This equation implies that the inverse 
Ackermann Function value of any 

176
00:12:54,587 --> 00:12:59,538
imaginable number is at most 4 and I 
don't know about, I don't know about you 

177
00:12:59,538 --> 00:13:04,708
but my brain is already hurting too much 
to think about which are the values of n 

178
00:13:04,708 --> 00:13:09,299
such that alpha is equal to 5. 
So as a point of contrast, let's do the 

179
00:13:09,299 --> 00:13:13,551
same experiment for the log star 
function, so that we get some 

180
00:13:13,551 --> 00:13:18,752
appreciation about how as glacially 
growing as log star may be, the inverse 

181
00:13:18,752 --> 00:13:22,902
Ackermann function is growing still 
qualitatively slower. 

182
00:13:22,902 --> 00:13:26,857
To the log* function, remember, it's the 
iterated logarithm. 

183
00:13:26,857 --> 00:13:31,779
So, it's, starting from n, how many times 
you needed to hit log in the calculator, 

184
00:13:31,779 --> 00:13:34,785
let's say base 2, before the result drops 
below 1. 

185
00:13:34,785 --> 00:13:39,138
So, when this log* of n can be equal to 
1, well that's when n is going to be 

186
00:13:39,138 --> 00:13:42,194
equal to 2. 
Then, you hit log once and you drop from 

187
00:13:42,194 --> 00:13:46,222
2 to 1 and you're done. 
If n = 3, or 4, then you need more than 

188
00:13:46,222 --> 00:13:50,600
one application of logarithm to drop 
below 1, but you only need 2 

189
00:13:50,600 --> 00:13:54,369
applications, so log* (n) is going to be 
2, for n = 3 and 4. 

190
00:13:54,369 --> 00:13:59,581
Similarly, for n anywhere between 5, and 
16, log* (n) is going to be equal to 3, 

191
00:13:59,581 --> 00:14:02,592
right? There would be star from 16, 
that's 2^4. 

192
00:14:02,592 --> 00:14:07,601
So if you apply log once you get 4, a 
second time you get 2, a third time you 

193
00:14:07,601 --> 00:14:12,146
get 1. 
And you get 1 by analogy the values of n, 

194
00:14:12,146 --> 00:14:20,062
for which log*(n) = 4 are going to be 
starting at 17, and going up to 2^16 also 

195
00:14:20,062 --> 00:14:25,604
known as 65,536. 
So lets just go one more step, for which 

196
00:14:25,604 --> 00:14:30,314
n is log*(n) = 5. 
Well obviously we started at 65,537, and 

197
00:14:30,314 --> 00:14:35,753
we end at 2 raised, to the largest 
number, of the previous block so 2^65536. 

198
00:14:35,753 --> 00:14:40,212
So, these numbers are impressive enough 
on the right hand side. 

199
00:14:40,212 --> 00:14:45,448
There is no imaginable number of 
importance, for which log* is going to be 

200
00:14:45,448 --> 00:14:48,680
bigger than 5. 
Looking at the left and the right hand 

201
00:14:48,680 --> 00:14:52,941
sides though, we see that there really is 
a qualitative difference, between the 

202
00:14:52,941 --> 00:14:56,945
rate of growth of these 2 functions. 
With the inverse Ackermann function in 

203
00:14:56,945 --> 00:15:00,054
fact, growing much, much slower, than the 
log* function. 

204
00:15:00,054 --> 00:15:04,266
So perhaps the easiest way to see that, 
is to look at the right hand side, and on 

205
00:15:04,266 --> 00:15:08,478
each of the lines in the right hand side, 
look at the largest value of n, so that 

206
00:15:08,478 --> 00:15:11,778
log* is a given number. 
So for example, on the third line, the 

207
00:15:11,778 --> 00:15:15,570
largest n, such that log* of n = 3 is a 
tower of 2 of height 3. 

208
00:15:15,570 --> 00:15:20,548
2^2^2 also known as 16. 
Similarly on the fourth and fifth line, 

209
00:15:20,548 --> 00:15:26,233
the largest n that gives value of log* = 
4 and 5 are tower of 2 of height 4 and 

210
00:15:26,233 --> 00:15:30,477
the tower of 2 of height 5. 
On the other hand, look at the fourth 

211
00:15:30,477 --> 00:15:34,909
line on the left hand side. 
So already when we ask, about the largest 

212
00:15:34,909 --> 00:15:39,964
values of n that have inverse Ackermann 
value 4, we're talking about towers of 

213
00:15:39,964 --> 00:15:41,989
twos. 
In fact, of height 2048. 

214
00:15:41,989 --> 00:15:47,773
So that is, the n which give us alpha(n) 
= 4, we would have to write down 2048 

215
00:15:47,773 --> 00:15:53,255
lines on the right-hand side before we 
had values of n that were equally big. 

216
00:15:53,255 --> 00:15:57,982
This indicates how the log* function, 
while growing glacially indeed, is 

217
00:15:57,982 --> 00:16:01,757
nevertheless moving at light speed 
compared to the inverse Ackermann 

218
00:16:01,757 --> 00:16:01,757
function.