So now we understand why Kruskal's algorithm is correct, why it always computes a minimum-cost spanning tree. In this video, we'll turn our attention to implementation issues. We'll begin with a straightforward implementation of Kruskal's algorithm. That will give us a polynomial running time bound, which is good, but we'd like to do better. So then we'll show how deploying a suitable data structure, something you haven't seen before, the Union-Find data structure, allows us to speed up Kruskal's algorithm to be competitive with Prim's algorithm. That is, we'll get a near-linear running time bound of O of M log N.

So let's just briefly review the very elegant pseudocode for Kruskal's algorithm. It's a greedy algorithm: it considers the cheapest edges first, all the way up to the most expensive. So we begin with a sorting pre-processing step to put the edges in sorted order. For notational convenience, let's just rename the edges, so that edge 1 is the cheapest edge, all the way up to edge M being the most expensive. We then have our single linear scan in this for loop, and we just grab edges whenever we can, okay? So we maintain this evolving set capital T, which at the end of the algorithm will be our spanning tree. Now, what forces us to exclude an edge from this set capital T? Well, if it creates a cycle with the edges already chosen, obviously that's a no-go. We can't have cycles in our final output. But as long as including an edge doesn't create a cycle, we go ahead and optimistically include it. And as we've seen, this is a correct algorithm; it always outputs the minimum-cost spanning tree.

So, what would be the running time of this algorithm, if we just straightforwardly implement the pseudocode on this slide? Well, let's just consider the algorithm step by step. In the first step, we sort the edges, and that's going to take M log N time. Now don't forget, whenever we're speaking about graphs, we have the convention that M denotes the number of edges and N denotes the number of vertices. So, you might justifiably wonder why I wrote M log N for the running time of the sorting step instead of M log M, since, after all, what we're sorting are the edges and there are M of them. Well, what I'm using here is that, in this context, we can switch log N and log M interchangeably with each other inside big-O notation. Why is that true? Well, recall that when we first discussed graphs in part one, we noticed that there can't be too many edges. The number of edges M is at most quadratic in the number of vertices; it's at most big-O of N squared. So if M is at most N squared, then log M is at most 2 log N, and the two is suppressed in the big-O notation. So log M and log N are interchangeable in this context. Notice that for the minimum-cost spanning tree problem, you may as well assume that there are no parallel edges, that the graph is simple. If you have a bunch of parallel edges between a given pair of vertices, you can just throw out all but the cheapest one; that's the only one you'll ever need.

So, moving on to the main loop, it's pretty obvious how many iterations we have there: we have M iterations. So all we need to figure out is how much work we have to do in each iteration. So what is it each iteration needs to accomplish? It needs to check whether or not adding the current edge to the edges we've already chosen creates a cycle. So I claim that can be done in time linear in the number of vertices. That is, it can be done in big-O of N time.
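To make this running-time discussion concrete, here is a minimal Python sketch of the straightforward implementation. It assumes the input arrives as (u, v, cost) triples; the helper name creates_cycle is hypothetical, standing in for the graph-search check explained next (a sketch of it appears a bit further below).

```python
def kruskal_naive(n, edges):
    """Straightforward Kruskal: sort the edges by cost, then greedily
    keep each edge that doesn't create a cycle with those chosen so far.
    n is the number of vertices; edges is a list of (u, v, cost) triples."""
    T = []                                                 # the evolving tree T
    for u, v, cost in sorted(edges, key=lambda e: e[2]):   # O(m log n) sort
        if not creates_cycle(T, u, v):                     # O(n) per check (sketched below)
            T.append((u, v, cost))
    return T
```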
So how do we accomplish this? Well, we need two quick observations. First of all, and this is something we've seen in arguments in the previous videos, checking whether or not this new edge, say with endpoints U and V, is going to create a cycle boils down to checking whether or not there's already a path between U and V in the edges capital T chosen so far. If there is already a U-V path, adding this edge will close the loop and create a cycle. If there currently is no U-V path, then adding this edge will not create a cycle. The second observation is: well, how do we check if there's a path from U to V in the edges we've already chosen? Well, we already know how to do that, just using graph search. You can use breadth-first search, you can use depth-first search, it doesn't matter. You just start at the vertex U and you see whether or not you reach V. If you reach it, there's a path; if you don't reach it, there's not a path. Breadth-first search, depth-first search, whatever: it takes time linear in the size of the graph that you're searching. And since we only need to search the edges that are in capital T, and there are going to be at most N minus 1 of them, linear time in this context means O of N, O of the number of vertices, because that also bounds the number of edges of capital T, the edges in which we're searching for a path.

So, adding up all of this work, what do we have? We have the sorting pre-processing step, which takes big-O of M log N time, and then we have these M iterations of the for loop, each taking O of N time. The latter term dominates, so the overall running time is big-O of M times N. This, coincidentally, is the same running time we got from the straightforward implementation of Prim's algorithm, and I'll make the same comments here. This is a reasonable running time; it's polynomial in the input size. It's way better than checking all of the exponentially many spanning trees that the graph might have. But we certainly would like to do better. We'd love to have an implementation of Kruskal's algorithm that gets us to a near-linear running time bound, and that's the plan.

How are we going to do it? Well, really, the work that we're doing here over and over again, which is kind of a bummer, is these cycle checks. In every single iteration, we're spending time linear in the number of vertices to check for a cycle. And the question is, can we speed that up? And the Union-Find data structure will actually, believe it or not, allow us to check for a cycle in constant time. So if we had a data structure that could implement constant-time cycle checks, then we'd have to spend only constant time in each iteration of this for loop. So the loop overall would take only time linear in the number of edges, O of M. If we got that then, believe it or not, the sorting pre-processing step would become the bottleneck in the running time of Kruskal's algorithm. Our running time would drop from M times N down to near-linear, down to O of M log N.

So let me now tell you a little bit about this magical data structure that's going to give us constant-time cycle checks. I'm just going to give you the high-level picture, and how it connects to Kruskal's algorithm, on this slide. We'll look at the details of the data structure in the next video. I also want to warn you that I'm not going to discuss, in this pair of videos, the state of the art for Union-Find data structures.
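Before going further with Union-Find, here is what the O of N cycle check from the earlier skeleton might look like, a sketch under the same (u, v, cost) edge-triple assumption: a breadth-first search over just the edges of T, which has at most N minus 1 edges, so both building the adjacency lists and searching cost O of N.

```python
from collections import defaultdict, deque

def creates_cycle(T, u, v):
    """True iff T already contains a u-v path, i.e. adding edge (u, v)
    would close a cycle. BFS over T's at-most-(n-1) edges: O(n) time."""
    adj = defaultdict(list)
    for a, b, _ in T:                 # adjacency lists for T's edges only
        adj[a].append(b)
        adj[b].append(a)
    seen, frontier = {u}, deque([u])
    while frontier:                   # standard BFS starting from u
        x = frontier.popleft()
        if x == v:
            return True               # reached v: a u-v path already exists
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return False                      # v unreachable from u: no cycle created
```

With a Union-Find data structure, this entire linear-time search collapses to a constant-time comparison of two find results, which is exactly where the promised speedup comes from.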
For Union-Find, I'm going to give you a fairly primitive version, but one that is nevertheless sufficient to give us our desired M log N running time for Kruskal's algorithm. If you're interested, there is some optional material about different implementations of Union-Find that use some super cool ideas, like union by rank and path compression, and that give you different, and in some senses better, operation times. But the quick-and-dirty version of Union-Find that I'm going to discuss here is sufficient for our present needs.

The raison d'ĂȘtre of a Union-Find data structure is to maintain a partition of a set of objects. So in this picture, the overall rectangle is meant to denote a set of objects, and C1, C2, C3, and C4 are disjoint subsets whose union is the entire set. That's what I mean by a partition of a group of objects. We're not going to ask too much of this data structure; we're only going to ask it to support two operations. No prizes for guessing what those two operations are called. In the find operation, we give it an object from this universe, and we ask the data structure to return to us the name of the group to which that object belongs. So for example, if we handed it an object in the middle of this rectangle, we'd expect it to return to us the name C3. The union operation, by contrast, takes as input the names of two groups, and what we want the data structure to do is to fuse those two groups together. That is, the objects in the first group and the objects in the second group should all coalesce and now be in one single group.

So why might such a data structure be useful for speeding up Kruskal's algorithm? To see the connection, think of Kruskal's algorithm as working conceptually in the following way. Initially, when the algorithm starts and the set capital T is empty, each vertex is on its own, in its own isolated connected component. And then, each time Kruskal's algorithm adds a new edge to the set capital T, it takes two current connected components and fuses them into a single connected component. So for example, toward the end of Kruskal's algorithm, maybe it's included enough edges that the tree capital T constructed so far has only four different connected components. And maybe it's about to add a new edge U comma V, where of course U and V must be in different connected components with respect to the edges chosen so far. So this new edge addition, at this iteration of Kruskal's algorithm, is going to fuse the connected components of U and V into a single one. That corresponds to taking the union of the groups to which U and V respectively belong.

So, to be a little more precise about it: what are going to be the objects contained in the Union-Find data structure in Kruskal's algorithm? They're going to correspond to vertices; it's the vertices coalescing that we want to keep track of. And what are going to be the groups in the partition that we maintain? They're just going to correspond to the connected components with respect to the edges that Kruskal's algorithm has already committed to. With these semantics, it's clear that every time Kruskal's algorithm adds a new edge to its set capital T, we have to invoke the union operation to fuse two connected components into one.
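To show these semantics in code, here is a minimal sketch of one simple Union-Find variant, eager leader pointers where each object stores the name of its group's leader directly. This is an illustrative stand-in, not necessarily the exact version detailed in the next video, and it assumes vertices are labeled 0 through n-1.

```python
class UnionFind:
    """Eager-leader Union-Find: each object points directly at its
    group's leader, so find is O(1); union relabels the smaller group."""
    def __init__(self, objects):
        self.leader = {x: x for x in objects}     # each object starts as its own group
        self.members = {x: [x] for x in objects}  # leader -> list of group members

    def find(self, x):
        return self.leader[x]                     # O(1): name of x's group

    def union(self, a, b):
        """Fuse the groups whose leaders are a and b, relabeling the smaller one."""
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a
        for x in self.members[b]:
            self.leader[x] = a
        self.members[a] += self.members.pop(b)

def kruskal(n, edges):
    """Kruskal with Union-Find: cycle checks are two O(1) find calls,
    so the O(m log n) sort dominates the running time."""
    uf = UnionFind(range(n))
    T = []
    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        ru, rv = uf.find(u), uf.find(v)
        if ru != rv:                              # different components: no cycle
            T.append((u, v, cost))
            uf.union(ru, rv)                      # fuse the two components
    return T
```

Notice how the cycle check is now just find(u) != find(v): the edge is safe to add exactly when its endpoints currently belong to different groups, matching the connected-component semantics described above.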