So in this set of lectures we'll be
discussing the minimum cut problem in
graphs and we'll be discussing the
randomized contraction algorithm. A
randomized algorithm is just so simple and
elegant, it's almost impossible to believe
that it could possibly work, but that's
exactly what we'll be proving. So one way
you can think about this set of lectures
is as a segue of sorts between our
discussion of randomization and our
discussion of graphs. So we just finished
talking about randomization in the context
of sorting and searching. We'll pick it up
again towards the end of the class when we
discuss hashing, but while we're in the
middle of randomization probability
review, I want to give you another
application of randomization in a totally
different domain. In particular to the
domain of graphs rather than to Sorting
and searching. So that's one high-level
goal of these lectures. A second one is
we'll get our feet wet talking about
graphs. And a lot of the next couple
weeks, that's what we're gonna be talking
about: fundamental graph primitives. So
this will give us an excuse to start
warming up with the vocabulary, some of
the basic concepts of the graph, and what
a graph algorithm looks like. Another
perk, although it's not one of the main
goals, but I wanna do, I do wanna point
out this fact. Is that, at least, compared
to most of the stuff that we're discussing
in this class, this is a relatively recent
algorithm, the contraction algorithm. By
relatively recent, I mean, okay, it's
twenty years old. Well at least that means
most of us, I know not all of us, but most
of us at least were born at the time that
this algorithm was invented. And so just
one quick digression. You know, in the
intro course like this most of what we're
gonna cover will be oldies but goodies, so
from as much as 50 years ago. And while
it's kind of amazing given how much the
world and how much technology has changed
over the past 50 years that idea in
computer science from that long ago is
still useful, they are. Okay. So it's just
sort of an amazing thing about the stuff
that the first generation of computer
scientists figured out. Okay. It's still;
it's still relevant to this day. That.
Said, algorithm is still a vibrant field
with lots of open questions and when I
have an opportunity I'll try and give you
glimpses of that fact. So I do wanna point
out here that this is somewhat more reason
algorithms that most of the other ones
will see which dates back to the 90s. So
let's talk about graphs. Fundamentally,
what a graph does is represent pair wise
relationships amongst a set of objects. So
as such, a graph is going to have two
ingredients. So first of all there's the
objects that you're talking about and
these have two very common names and
you're just gonna have to know both of the
names even though they're completely
synonymous. The first name is vertices.
[sound] So vertex is the singular,
vertices is the plural. Also known
interchangeably as nodes. I'll be using
the notation, capital V for the set of, of
vertices. So those are the objects. Now,
we wanna represent pair wise
relationships. So these pairs are gonna be
called edges. It'll be noted by denuded by
capital E and there's two flavors of
graphs and both are really important, both
come up all the time in application. So
you should be aware of both kinds. So
there is undirected graphs and directed
graphs and that just depends on whether
the edges themselves are undirected or
directed. So edges can be undirected by
which I mean this pair is unordered. There
are just two verses of edge. The two n
points say U and V and you don't
distinguish one as the first and one as
the second. Or edges can be directed, in
which case you have a directed graph and
here a pair is ordered. So you do have a
notion of a first vertex or a first end
point and the second vertex or second end
point of an edge. Those are often called
the tail and the head, respectively. And
once in a while although I'll, I'll try to
use, I'll not use this terminology, you
hear directed edges called arcs. Now I
think all of this is much clearer if I
just draw some pictures indeed one used to
call graphs dots and lines, the dots would
refer to the vertices. So here's four dots
or four vertices. And, the edges would be
lines. So the way you denote one of these
edges, is, you just draw a line between
the two, end points of that edge, the two
vertices that it corresponds to. So this
is an undirected graph with four vertices
and five edges. We could equally well have
a directed version of this graph. So
let's, still have four vertices and five
edges. But to indicate that this is a
directed graph, and that each edge has a
first vertex and a second vertex, we're
gonna add arrows to the lines. So the
arrow points to the second vertex, or to
the head of the, of the edge. So the first
vertex is often called the [inaudible] of
the edge. So, graphs are completely
fundamental. They show up not just in
computer science but in all kinds of
different disciplines social sciences and,
and biology being two prominent ones. So,
let me just mention a couple of reasons
you might use them just off the top of my
head. But literally there's hundreds or
thousands of others. So, a very literal
example would be road networks. So,
imagine you type in asking for driving
directions from point A to point B and
some web application or software or
whatever computes a, a road, a route for
you. What it's doing is it's manipulating
some representation of a road network,
which inevitably is going to be stored as
a graph, where the vertices correspond to
intersections, and the edges correspond to
individual roads. The web Is often
fruitfully thought of as a graph, a
directed graph. So here the vertices are
the individual web pages and edges
correspond to hyperlinks. So the first
vertex of an edge, the tail, is going to
be the page that contains the hyperlink.
The second vertex or the head of the edge
is going to be what the hyperlink points
to. So that's the web as a directed graph.
Social networks are, quite naturally,
represented as graphs. So here, the
vertices correspond to the individuals. In
the social network and the edges
corresponds to relationships, say
friendship links. I encourage to think
about amongst the popular social networks
these days, which ones are undirected
graphs and which ones are directed graphs.
We have some interesting examples of each
of those. And often graphs are useful even
when there isn't such an obvious network
structure. So, just to mention one example
let me just write down precedents
constraints. So, to say what I mean you
might think about you know, say you're a
freshman in college and you're looking at
your, your major is [inaudible] a computer
science major. Now, you wanna know what
courses to take in what order. And you
could think the following graph, where
each of the courses in your major
corresponds to a vertex and you draw a
directed edge from one, from course A to
course B, if course A is a prerequisite
for course B, that is, it has to be
completed before you begin course B. 'Kay,
so that's a way to represent dependencies.
Sort of, a temporal ordering between pairs
of, pairs of objects. Using a directed
graph. So that's the basic language of
graphs. Let me now talk about cuts in
graphs, cuz again this set of lectures is
gonna be about the so-called minimum cut
problem. So the definition of a cut of a
graph is very simple. It's just a
grouping, a partition, of the vertices of
the graph, into two groups, A and B. And
both of those two groups should be
non-empty. So to describe this in
pictures, let me give you a cartoon of a
cut in both the undirected and directed
cases. So for an undirected graph, you can
imagine drawing your two sets A and B. And
once you've defined the two sets, A and B,
the edges then fall into one of three
categories. You've got edges with both of
the end points in A. You've got edges with
both of the end points in B. And then
you've got edges with one end point in A,
and one endpoint in B. So that's
generically what the picture graph looks
like viewed through the lens of a
particular cut AB. The picture for
directed graphs is similar, you would
again have an A, and you'd again have a B,
you have directed edges with both end
points in A, directed edges with both end
points in B. And now you have actually two
further categories. So, you have edges who
cross the cut from left to right, that is,
whose tail vertex is in a, and whose head
vertex in b. And you can also have edges
which cross the cut in the opposite
direction that is their tail is in b and
their head is in a. Usually when we talk
about cuts we're gonna be concerned with
how many edges cross a given cut. And by
that I mean the following the crossing
edges of a cut a, b. Are those that
satisfy the following property. So in the
[inaudible] case it's exactly what you
think it would be. One of the endpoints is
in A, the other endpoint is in B, that's
what it means to cross the cut. Now in the
directed case there's a number of
reasonable definitions you could propose
about which edges cross the cut. Typically
and in this course we're going to focus on
the case were we only think about edges
that cross the cut from the left to the
right. And we ignore edges that cross from
the right to the left. So that is the
edges that cross the cut are those with
tail in A and head in B. So referring to
our two pictures, our two cartoons of
cuts, for the undirected one, all three of
these blue edges would be edges crossing
the cut AB, cuz they're the ones that have
one end point on the left side, and one
end point on the right side. Now for the
directed one, we only have two crossing
edges. So the two that cross from left to
right, with tail in A and head in B. The
one that's crossing backward does not
contribute. We don't count it as a
crossing edge of the cut. So the next quiz
is just a sanity check that you've
absorbed the definition of a cut of a
graph. Right, so the answer to this quiz
is the third option. Recall what is the
definition of a cut? It's just the way the
group, the vertices into two sets A and B
both should also be non-empty. So we have
N vertices and essentially we have one
binary degree of freedom for each. For
each vertex we can decide whether or not
it goes in set A or goes in set B. So two
choices for each of the N vertices. That
gives us a two to the N possible choices.
Two to the N possible cuts overall. Now
that's slightly incorrect because recall
that a cut can't have a non-empty set A or
a non-empty set B. So those are two of the
two to the N options which were
disallowed. So strictly speaking, the
number is two to the end, minus two. But,
two to the end is certainly the closest
answer of the four provided. Now the
minimum cut problem is exactly what you'd
think it would be. I'd give you as input a
graph, and amongst these exponentially
many cuts, I want you to identify one for
me with the fewest number of crossing
edges. So a few quick comments so first of
all the name for this cut is a min cut a
min cut is one with the fewest number of
crossing edges. Secondly to clarify I am
going to allow in the input what is called
parallel edges. There will be lots of
applications where parallel edges are sort
of pointless but for minimum cuts it's
natural to allow parallel edges that means
you have two edges that correspond
Finally, the more seasoned programmers
among you are probably wondering what I
mean by, you're given a graph as inputs.
You might be wondering about how, exactly,
that's represented. So the next video's
gonna discuss exactly that. The popular
ways of representing graphs, and, and how
we're gonna usually do it in this course,
specifically via what's called an
adjacency list. Okay so I want to make
sure that everybody understands exactly
what the minimum cut problem is asking, so
let me draw for you a particular graph.
With A vertices. And quite a few edges.
And what I want you to answer is what is
the min-cut value in this graph, that is
how many edged cross the man on cut, the
cut with the fewest number of crossing
edges. Alright, so the correct answer is
the second choice. The [inaudible] cut
value is two, and the cut which shows that
is just to break it, basically, in half,
into the two halves. In this case, there
are only two crossing edges. This one and
this one. And I'll leave it for you to
check that there's no other edge that has
as few as two edges. Now in this case,
we've got a very balanced split when we
took the minimum cut. In general, that
need not be true. Sometimes even a single
vertex can define the minimum cut of a
graph. And I encourage you to think about
a concrete example that proves that. So
why should you care about computing the
minimum cut? Well this is one problem
amongst a genre called graph partitioning
where you're given a graph and you wanna
break it into two or more pieces. And
these kinds of graph partitioning problems
come up all the time in a surprisingly
diverse array of applications. So let me
just mention a couple at a high level. So
one very obvious one when you have a, when
your graph is representing a physical
network. What identifies something like a
[inaudible] allows you to do is identify
weaknesses in your network. Then perhaps
it's your own network and you want to
understand where you want to soup up the
infrastructure. Because it's in some sense
a hot spot of your network or a weak
point. Or maybe there's someone else's
network and you want to know where the
weak spot is in their network. In fact
there is some declassified documents about
mm, about fifteen years ago or so which
showed that the United States and Soviet
Union Militaries, back during the Cold
War, were actually quite interested in
computing minimum cuts. Cuz they were
looking for things like, for example,
what's the most efficient way to disrupt
the other Country's transportation
network. Another application which is a
big deal in social network analyst these
days, is the idea of community detection.
So the question is amongst a huge graph,
like say the graph of everybody whose on
Facebook or something like that, how can
you identify small pockets of people that
seem tightly knit. That seem closely
related from which you like to infer that
they're a community of some sort. You know
maybe they all go to the same school,
maybe they all have the same interest,
maybe they're part of the same biological
family Whatever. Now, it's, to some
degree, still an open question, how to
best define communities and social
networks. But as a quick and dirty sort of
first order heuristic, you can imagine
looking for small regions, which, on the
one hand, are highly interconnected
amongst themselves. But quite weakly
connected. To the rest of the graph, so
subroutines like the minimum cut problem
can be used for identifying these small
densely interconnected, but then weakly
connected to everybody else pockets of the
graph. Finally cut problems are also used
a lot in, in vision. So for example, one
way you can use them is in what's called
image segmentation. So here what's going
on is your given as inputs two D array
where each entry is a pixel from some
image. And there's a graph which is very
natural to define given a two D array of
pixels, namely you have an edge between
two pixels if their neighboring. So for
two pixels that are immediately next to
each other, left to right or top to
bottom. You put an edge there. So that
gives you, what's called a grid graph. And
now unlike the basic minimum cut problem
that we're talking about here. In image
segmentation it's most natural to use edge
weights, where the weight of an edge is
basically how likely you expect those two
pixels to be coming from a common object.
Why not? You expect two neighboring pixels
to come from the same object. Well,
perhaps their color maps are almost
exactly the same, and you just expect that
they're part of the same thing. So once
you've defined this grid graph with
suitable edge weights, now you run a graph
partitioning or mid-cut type sub-routine
and the hope is that the cut that it
identifies rips off one of the contiguous
objects in the picture. And you do that a
few times and you get the major objects in
the given picture. So this list is by no
means exhaustive of the applications of
ming cut and graph partitioning sub
routines, but I hope it serves as
sufficient motivation to watch the rest of
the lectures in this sequence.