In this optional video, I'm going to give you a glimpse of the research frontier on the problem of computing minimum cost spanning trees. We've now seen two excellent algorithms for this problem, Prim's algorithm and Kruskal's algorithm, and with suitable data structures both run in near-linear time: O(m log n), where m is the number of edges and n is the number of vertices. Now, on the one hand, we should be pretty happy with those algorithms; we're only a log factor slower than what it takes to read the input. But again, the good algorithm designer should never be content, never be complacent, and should always ask: can we do better? Maybe we can do even better than O(m log n).

Well, believe it or not, you can do better than these implementations, at least in principle, at least in theory. You have to work pretty hard, and I'm definitely not going to discuss the algorithms or the analysis that prove this fact, but let me just mention a couple of references that give MST algorithms with asymptotic running times even better than O(m log n).

Quite shockingly, if you're happy with randomized algorithms, as we were with, say, the quicksort sorting algorithm, then the minimum cost spanning tree problem can be solved in linear time. That's obviously an optimal algorithm: you have to look at the whole graph to compute the minimum cost spanning tree, and this solves the problem in time only a constant factor larger than what it takes to read the input. This algorithm is much, much newer than Prim's and Kruskal's algorithms, as you might expect. It does date back to the 20th century, but definitely to the latter stages of the last century. The names of a couple of the inventors of this algorithm will already be familiar to those of you who paid close attention in part one: the Karger here is the same Karger who authored the randomized min cut algorithm you studied in part one, and Tarjan's name has come up a couple of times in these courses, for example when we discussed strongly connected component algorithms.

So what if you're not happy with a bound just on the expected running time, where the expectation is over the coin flips of the algorithm? What if you want a deterministic algorithm? If we think of this randomized algorithm as being analogous to quicksort for the sorting problem, what would be the analog of merge sort? Well, it turns out we do not know whether or not there is a linear-time deterministic algorithm for minimum spanning trees; that's an open question. You might think that at this late date, 2012, 2013, we'd know everything there is to know about minimum cost spanning trees, but we do not know whether a linear-time deterministic algorithm exists. We do know that there's a deterministic algorithm, due to Chazelle, with a running time absurdly close to linear. Specifically, there's an algorithm with running time O(m * alpha(n)): m, the number of edges, times alpha of n.

So what, you ask, is alpha of n? What is this function? Well, it's something called the inverse Ackermann function. Defining the Ackermann function and its inverse actually takes a little bit of work, so I'm not going to do it right now. For those of you who are curious, you should check out the optional material on the union-find data structure, which is yet another context where, sort of unbelievably, the inverse Ackermann function arises (I'll include a small union-find-based sketch of Kruskal's algorithm just below). But for the present context, what you should know is that this is a crazy slow-growing function. It's not constant: for any number c, I can exhibit a large enough value of n so that alpha(n) is at least c.
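Here is that sketch: a minimal Kruskal's algorithm built on a union-find with union by rank and path compression. The edge sort is what gives the O(m log n) bound quoted at the top of this video, and the union-find operations are exactly the ones whose amortized analysis is where alpha(n) shows up. The function names and edge-list representation are mine, just for illustration, not canonical.

```python
# A minimal sketch of Kruskal's algorithm on a union-find structure
# (union by rank + path compression). Sorting the m edges dominates,
# giving the O(m log n) running time mentioned above. Vertices are
# assumed to be labeled 0 .. num_vertices - 1.

def kruskal_mst(num_vertices, edges):
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):
        # Find the root of x's component, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        # Merge the components of x and y by rank.
        # Returns False if they were already in the same component.
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst = []
    for cost, u, v in sorted(edges):   # the O(m log n) bottleneck
        if union(u, v):                # keep the edge if it joins two components
            mst.append((cost, u, v))
    return mst

# Example on a 4-vertex graph; edges are (cost, u, v) triples.
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
print(kruskal_mst(4, edges))   # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
```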
So alpha of n is not bounded by any constant, but it's mind-boggling how slowly this function grows. Let me give you an incredibly slow-growing function that nonetheless grows much faster than the inverse Ackermann function, just to prove the point. Specifically, the inverse Ackermann function grows quite a bit slower than log star of n, and log star I can define for you easily. Recall our definition of log base two, from way back when we first demystified logarithms at the beginning of part one: if you type n into your calculator and keep dividing by two, the number of times you divide by two until the result drops below one is log base two of n. For log star, instead of dividing by two, you're going to hit the log button on your calculator, and you count how many times you hit log until the result drops below one. That number, the number of times you hit log, is defined to be log star of n. Equivalently, the log star function is the inverse of the tower-of-twos function, the function that takes as input an integer t and raises two to itself t times.

To appreciate just how slow-growing the log star function is, let alone the inverse Ackermann function, type the biggest number you can into your nearest calculator or computer; just keep typing in nines as long as you can. Then go ahead and evaluate log star: keep applying the log function until the result drops below one. Probably the result is going to be something in the neighborhood of five. So log star of the biggest number in your calculator or computer is going to be about five. That's how slow-growing this function is.

So the upshot is that the algorithms community has the time complexity of the MST problem almost nailed, but not quite, in the deterministic case. The right answer is somewhere between m and m times the inverse Ackermann function of n, and we don't know where.

But, you know what? It gets weirder. Check out the following result of Pettie and Ramachandran. They, in a sense, solved the deterministic minimum cost spanning tree problem by exhibiting an algorithm whose time complexity is optimal. They gave an algorithm, and they gave a proof, just in the spirit of what we did for sorting, that no algorithm can have a running time asymptotically better than theirs. But what they didn't do is explicitly evaluate the time complexity of their algorithm. So they managed to prove optimality without actually pinning down what their running time is. It's certainly somewhere between linear and linear times the inverse Ackermann function; we know that range contains the true time complexity of the problem, and we know an algorithm that achieves the optimal complexity. But to this day, we do not know what that optimal time complexity actually is, as a function of the graph size.

So those are some of the most advanced things that we do know about the minimum cost spanning tree problem. Let me mention a couple of things that we still, to this day, do not know, starting with randomized algorithms. Now, maybe your reaction is that there are no open questions in the randomized world: we know you need linear time to solve the problem, and I've told you that there is a randomized algorithm with expected running time that is linear. So what more could you want? Well, what I want is an algorithm that's not just linear time but also simple enough that I can teach it to other people. Ideally, it would be simple enough that I could explain it in a course like this one.
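If you'd rather not punch the log button by hand, here is a quick sketch of log star exactly as just defined, iterating log base two (the lecture's convention) and counting the iterations; the function name is mine.

```python
import math

# Log star as defined above: count how many times you apply log
# (base 2) before the value drops to one or below.
def log_star(n):
    count = 0
    while n > 1:
        n = math.log2(n)   # "hit the log button"
        count += 1
    return count

# A 99-digit number of all nines, roughly the biggest thing you
# could type into a calculator, still gives only 5.
print(log_star(10 ** 99 - 1))   # 5

# Even a tower of five 2s, i.e. 2^65536, a number with almost
# twenty thousand digits, gives exactly 5.
print(log_star(2 ** 65536))     # 5
```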
But I'd actually be very happy just to have a randomized algorithm that is linear time and simple enough to cover in a graduate course. The current linear-time algorithms do not have that property; they're more complicated than I can cover even in a graduate course. To accomplish this task, it turns out to be enough to solve a seemingly simpler task: namely, to get a simple, randomized, linear-time algorithm for the MST verification problem. So let me tell you what that means. In the MST problem, you're supposed to optimize over the exponentially many spanning trees; you've got to find me the one with the smallest sum of edge costs. In MST verification, the job seems simpler: I hand you a candidate MST, that is, I hand you a spanning tree that may or may not be the best one, and you just need to check whether or not it's the optimal one. Furthermore, in the event that it's not optimal, you should point out to me edges that are not in the minimum spanning tree, edges that are too expensive and that I should throw out (I'll include a brute-force sketch of this verification task at the end of this video). The reason it's enough to design a linear-time algorithm for this seemingly simpler problem is that the content of the Karger-Klein-Tarjan paper is a reduction, a randomized reduction, from optimizing over spanning trees to the seemingly simpler problem of MST verification. Moreover, that reduction, which is all of the novel content in the Karger-Klein-Tarjan paper, is linear time, randomized, and really simple; I teach the material in that paper in my graduate classes. But it needs an MST verification subroutine as a black box, and the only known linear-time implementations of MST verification are quite complicated. So find a simple way to do MST verification in linear time, and you're good to go with your simple, optimal MST algorithm.

For deterministic algorithms, the holy grail is obvious: we'd love to have a deterministic algorithm for MST that runs in linear time. Or, at the very least, we'd like to figure out what the best possible time complexity of any deterministic MST algorithm actually is. So one of the takeaways of this discussion is that, for all the amazing advances by computer scientists in the design and analysis of algorithms over the past 50 years or so, there are still totally fundamental things that we do not understand. There are still great ideas to come in the future.

If you've been intrigued by some of the things I've said in this video and you want to learn more about these advanced minimum cost spanning tree algorithms, for further reading I'd recommend a survey by Jason Eisner called "State-of-the-Art Algorithms for Minimum Spanning Trees." It's about fifteen years old now, but it's still an amazing resource for learning about this advanced material.
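As promised, here is a brute-force sketch of the MST verification task, based on the cycle property: with distinct edge costs, a spanning tree is minimum if and only if every non-tree edge costs more than the most expensive edge on the tree path between its endpoints, and those non-tree edges are exactly the ones that are safe to throw out. This quadratic-time sketch is nowhere near the linear-time verifiers discussed above; the names and graph representation are mine, not from the Karger-Klein-Tarjan paper.

```python
from collections import defaultdict

def max_edge_on_tree_path(tree_adj, u, v):
    # The tree path between u and v is unique, so a DFS from u that
    # tracks the most expensive edge seen so far finds the path's max edge.
    stack = [(u, None, 0)]                 # (vertex, parent, max cost so far)
    while stack:
        node, parent, best = stack.pop()
        if node == v:
            return best
        for nxt, cost in tree_adj[node]:
            if nxt != parent:
                stack.append((nxt, node, max(best, cost)))
    raise ValueError("graph is not a spanning tree")

def verify_mst(tree_edges, all_edges):
    # Returns the non-tree edges that are certifiably too expensive
    # (safe to throw out). With distinct edge costs, the tree is minimum
    # exactly when every non-tree edge is returned.
    tree_adj = defaultdict(list)
    for cost, u, v in tree_edges:
        tree_adj[u].append((v, cost))
        tree_adj[v].append((u, cost))
    in_tree = {frozenset((u, v)) for _, u, v in tree_edges}
    heavy = []
    for cost, u, v in all_edges:
        if frozenset((u, v)) in in_tree:
            continue
        if cost > max_edge_on_tree_path(tree_adj, u, v):
            heavy.append((cost, u, v))     # too expensive: discard it
    return heavy

# Example: a triangle where the tree {0-1, 1-2} is minimum, because the
# only non-tree edge 0-2 is heavier than everything on its tree path.
tree = [(1, 0, 1), (2, 1, 2)]
graph = tree + [(5, 0, 2)]
print(verify_mst(tree, graph))             # [(5, 0, 2)] -> tree verified
```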