So in this set of lectures we'll be discussing the minimum cut problem in graphs and we'll be discussing the randomized contraction algorithm. A randomized algorithm is just so simple and elegant, it's almost impossible to believe that it could possibly work, but that's exactly what we'll be proving. So one way you can think about this set of lectures is as a segue of sorts between our discussion of randomization and our discussion of graphs. So we just finished talking about randomization in the context of sorting and searching. We'll pick it up again towards the end of the class when we discuss hashing, but while we're in the middle of randomization probability review, I want to give you another application of randomization in a totally different domain. In particular to the domain of graphs rather than to Sorting and searching. So that's one high-level goal of these lectures. A second one is we'll get our feet wet talking about graphs. And a lot of the next couple weeks, that's what we're gonna be talking about: fundamental graph primitives. So this will give us an excuse to start warming up with the vocabulary, some of the basic concepts of the graph, and what a graph algorithm looks like. Another perk, although it's not one of the main goals, but I wanna do, I do wanna point out this fact. Is that, at least, compared to most of the stuff that we're discussing in this class, this is a relatively recent algorithm, the contraction algorithm. By relatively recent, I mean, okay, it's twenty years old. Well at least that means most of us, I know not all of us, but most of us at least were born at the time that this algorithm was invented. And so just one quick digression. You know, in the intro course like this most of what we're gonna cover will be oldies but goodies, so from as much as 50 years ago. And while it's kind of amazing given how much the world and how much technology has changed over the past 50 years that idea in computer science from that long ago is still useful, they are. Okay. So it's just sort of an amazing thing about the stuff that the first generation of computer scientists figured out. Okay. It's still; it's still relevant to this day. That. Said, algorithm is still a vibrant field with lots of open questions and when I have an opportunity I'll try and give you glimpses of that fact. So I do wanna point out here that this is somewhat more reason algorithms that most of the other ones will see which dates back to the 90s. So let's talk about graphs. Fundamentally, what a graph does is represent pair wise relationships amongst a set of objects. So as such, a graph is going to have two ingredients. So first of all there's the objects that you're talking about and these have two very common names and you're just gonna have to know both of the names even though they're completely synonymous. The first name is vertices. [sound] So vertex is the singular, vertices is the plural. Also known interchangeably as nodes. I'll be using the notation, capital V for the set of, of vertices. So those are the objects. Now, we wanna represent pair wise relationships. So these pairs are gonna be called edges. It'll be noted by denuded by capital E and there's two flavors of graphs and both are really important, both come up all the time in application. So you should be aware of both kinds. So there is undirected graphs and directed graphs and that just depends on whether the edges themselves are undirected or directed. So edges can be undirected by which I mean this pair is unordered. There are just two verses of edge. The two n points say U and V and you don't distinguish one as the first and one as the second. Or edges can be directed, in which case you have a directed graph and here a pair is ordered. So you do have a notion of a first vertex or a first end point and the second vertex or second end point of an edge. Those are often called the tail and the head, respectively. And once in a while although I'll, I'll try to use, I'll not use this terminology, you hear directed edges called arcs. Now I think all of this is much clearer if I just draw some pictures indeed one used to call graphs dots and lines, the dots would refer to the vertices. So here's four dots or four vertices. And, the edges would be lines. So the way you denote one of these edges, is, you just draw a line between the two, end points of that edge, the two vertices that it corresponds to. So this is an undirected graph with four vertices and five edges. We could equally well have a directed version of this graph. So let's, still have four vertices and five edges. But to indicate that this is a directed graph, and that each edge has a first vertex and a second vertex, we're gonna add arrows to the lines. So the arrow points to the second vertex, or to the head of the, of the edge. So the first vertex is often called the [inaudible] of the edge. So, graphs are completely fundamental. They show up not just in computer science but in all kinds of different disciplines social sciences and, and biology being two prominent ones. So, let me just mention a couple of reasons you might use them just off the top of my head. But literally there's hundreds or thousands of others. So, a very literal example would be road networks. So, imagine you type in asking for driving directions from point A to point B and some web application or software or whatever computes a, a road, a route for you. What it's doing is it's manipulating some representation of a road network, which inevitably is going to be stored as a graph, where the vertices correspond to intersections, and the edges correspond to individual roads. The web Is often fruitfully thought of as a graph, a directed graph. So here the vertices are the individual web pages and edges correspond to hyperlinks. So the first vertex of an edge, the tail, is going to be the page that contains the hyperlink. The second vertex or the head of the edge is going to be what the hyperlink points to. So that's the web as a directed graph. Social networks are, quite naturally, represented as graphs. So here, the vertices correspond to the individuals. In the social network and the edges corresponds to relationships, say friendship links. I encourage to think about amongst the popular social networks these days, which ones are undirected graphs and which ones are directed graphs. We have some interesting examples of each of those. And often graphs are useful even when there isn't such an obvious network structure. So, just to mention one example let me just write down precedents constraints. So, to say what I mean you might think about you know, say you're a freshman in college and you're looking at your, your major is [inaudible] a computer science major. Now, you wanna know what courses to take in what order. And you could think the following graph, where each of the courses in your major corresponds to a vertex and you draw a directed edge from one, from course A to course B, if course A is a prerequisite for course B, that is, it has to be completed before you begin course B. 'Kay, so that's a way to represent dependencies. Sort of, a temporal ordering between pairs of, pairs of objects. Using a directed graph. So that's the basic language of graphs. Let me now talk about cuts in graphs, cuz again this set of lectures is gonna be about the so-called minimum cut problem. So the definition of a cut of a graph is very simple. It's just a grouping, a partition, of the vertices of the graph, into two groups, A and B. And both of those two groups should be non-empty. So to describe this in pictures, let me give you a cartoon of a cut in both the undirected and directed cases. So for an undirected graph, you can imagine drawing your two sets A and B. And once you've defined the two sets, A and B, the edges then fall into one of three categories. You've got edges with both of the end points in A. You've got edges with both of the end points in B. And then you've got edges with one end point in A, and one endpoint in B. So that's generically what the picture graph looks like viewed through the lens of a particular cut AB. The picture for directed graphs is similar, you would again have an A, and you'd again have a B, you have directed edges with both end points in A, directed edges with both end points in B. And now you have actually two further categories. So, you have edges who cross the cut from left to right, that is, whose tail vertex is in a, and whose head vertex in b. And you can also have edges which cross the cut in the opposite direction that is their tail is in b and their head is in a. Usually when we talk about cuts we're gonna be concerned with how many edges cross a given cut. And by that I mean the following the crossing edges of a cut a, b. Are those that satisfy the following property. So in the [inaudible] case it's exactly what you think it would be. One of the endpoints is in A, the other endpoint is in B, that's what it means to cross the cut. Now in the directed case there's a number of reasonable definitions you could propose about which edges cross the cut. Typically and in this course we're going to focus on the case were we only think about edges that cross the cut from the left to the right. And we ignore edges that cross from the right to the left. So that is the edges that cross the cut are those with tail in A and head in B. So referring to our two pictures, our two cartoons of cuts, for the undirected one, all three of these blue edges would be edges crossing the cut AB, cuz they're the ones that have one end point on the left side, and one end point on the right side. Now for the directed one, we only have two crossing edges. So the two that cross from left to right, with tail in A and head in B. The one that's crossing backward does not contribute. We don't count it as a crossing edge of the cut. So the next quiz is just a sanity check that you've absorbed the definition of a cut of a graph. Right, so the answer to this quiz is the third option. Recall what is the definition of a cut? It's just the way the group, the vertices into two sets A and B both should also be non-empty. So we have N vertices and essentially we have one binary degree of freedom for each. For each vertex we can decide whether or not it goes in set A or goes in set B. So two choices for each of the N vertices. That gives us a two to the N possible choices. Two to the N possible cuts overall. Now that's slightly incorrect because recall that a cut can't have a non-empty set A or a non-empty set B. So those are two of the two to the N options which were disallowed. So strictly speaking, the number is two to the end, minus two. But, two to the end is certainly the closest answer of the four provided. Now the minimum cut problem is exactly what you'd think it would be. I'd give you as input a graph, and amongst these exponentially many cuts, I want you to identify one for me with the fewest number of crossing edges. So a few quick comments so first of all the name for this cut is a min cut a min cut is one with the fewest number of crossing edges. Secondly to clarify I am going to allow in the input what is called parallel edges. There will be lots of applications where parallel edges are sort of pointless but for minimum cuts it's natural to allow parallel edges that means you have two edges that correspond Finally, the more seasoned programmers among you are probably wondering what I mean by, you're given a graph as inputs. You might be wondering about how, exactly, that's represented. So the next video's gonna discuss exactly that. The popular ways of representing graphs, and, and how we're gonna usually do it in this course, specifically via what's called an adjacency list. Okay so I want to make sure that everybody understands exactly what the minimum cut problem is asking, so let me draw for you a particular graph. With A vertices. And quite a few edges. And what I want you to answer is what is the min-cut value in this graph, that is how many edged cross the man on cut, the cut with the fewest number of crossing edges. Alright, so the correct answer is the second choice. The [inaudible] cut value is two, and the cut which shows that is just to break it, basically, in half, into the two halves. In this case, there are only two crossing edges. This one and this one. And I'll leave it for you to check that there's no other edge that has as few as two edges. Now in this case, we've got a very balanced split when we took the minimum cut. In general, that need not be true. Sometimes even a single vertex can define the minimum cut of a graph. And I encourage you to think about a concrete example that proves that. So why should you care about computing the minimum cut? Well this is one problem amongst a genre called graph partitioning where you're given a graph and you wanna break it into two or more pieces. And these kinds of graph partitioning problems come up all the time in a surprisingly diverse array of applications. So let me just mention a couple at a high level. So one very obvious one when you have a, when your graph is representing a physical network. What identifies something like a [inaudible] allows you to do is identify weaknesses in your network. Then perhaps it's your own network and you want to understand where you want to soup up the infrastructure. Because it's in some sense a hot spot of your network or a weak point. Or maybe there's someone else's network and you want to know where the weak spot is in their network. In fact there is some declassified documents about mm, about fifteen years ago or so which showed that the United States and Soviet Union Militaries, back during the Cold War, were actually quite interested in computing minimum cuts. Cuz they were looking for things like, for example, what's the most efficient way to disrupt the other Country's transportation network. Another application which is a big deal in social network analyst these days, is the idea of community detection. So the question is amongst a huge graph, like say the graph of everybody whose on Facebook or something like that, how can you identify small pockets of people that seem tightly knit. That seem closely related from which you like to infer that they're a community of some sort. You know maybe they all go to the same school, maybe they all have the same interest, maybe they're part of the same biological family Whatever. Now, it's, to some degree, still an open question, how to best define communities and social networks. But as a quick and dirty sort of first order heuristic, you can imagine looking for small regions, which, on the one hand, are highly interconnected amongst themselves. But quite weakly connected. To the rest of the graph, so subroutines like the minimum cut problem can be used for identifying these small densely interconnected, but then weakly connected to everybody else pockets of the graph. Finally cut problems are also used a lot in, in vision. So for example, one way you can use them is in what's called image segmentation. So here what's going on is your given as inputs two D array where each entry is a pixel from some image. And there's a graph which is very natural to define given a two D array of pixels, namely you have an edge between two pixels if their neighboring. So for two pixels that are immediately next to each other, left to right or top to bottom. You put an edge there. So that gives you, what's called a grid graph. And now unlike the basic minimum cut problem that we're talking about here. In image segmentation it's most natural to use edge weights, where the weight of an edge is basically how likely you expect those two pixels to be coming from a common object. Why not? You expect two neighboring pixels to come from the same object. Well, perhaps their color maps are almost exactly the same, and you just expect that they're part of the same thing. So once you've defined this grid graph with suitable edge weights, now you run a graph partitioning or mid-cut type sub-routine and the hope is that the cut that it identifies rips off one of the contiguous objects in the picture. And you do that a few times and you get the major objects in the given picture. So this list is by no means exhaustive of the applications of ming cut and graph partitioning sub routines, but I hope it serves as sufficient motivation to watch the rest of the lectures in this sequence.