Okay, so this video is not about any particular graph problem or any particular graph algorithm, just sort of the preliminaries we need to discuss algorithms on graphs. How do we measure their size, how do we represent them, and so on. Remember what a graph is? It really has two ingredients. First of all, there's the set of objects we're talking about. Those might be called vertices; synonymously, we might call them nodes. We represent pairwise relationships using edges. These can be undirected, in which case they're unordered pairs, or an edge can be directed from one vertex to another, in which case they're ordered pairs and we have a directed graph. Now, when we talk about, say, the size of a graph or the running time of an algorithm that operates on a graph, we need to think about what we mean by input size. In particular, for a graph, there are really two different parameters that control how big it is, unlike an array. For arrays, we just had a single number, the length. For graphs, we have the number of vertices and the number of edges. Usually, we'll use the notation n for the number of vertices and m for the number of edges. So the next quiz asks you to think about how the number of edges m can depend on the number of vertices n. In particular, for this quiz, I want you to think about an undirected graph. It has n vertices and no parallel edges, okay? So for a given pair of vertices, there's either zero or one edge between them. Moreover, let's assume that the graph is connected, okay? So I don't want you to think about graphs that have zero edges. Now, I haven't formally defined what it means for a graph to be connected yet, but I hope you get the idea: it means the graph is in one piece; you can't break it into two parts that have no edges crossing between them. So, for such a graph, no parallel edges, in one piece, n vertices.
Think about what is the minimum number of edges it could possibly have, and what is the maximum number of edges, as a function of n, that it could possibly have. Alright, so the correct option is the first one. The fewest number of edges that a connected undirected graph can have is n minus 1, and the maximum number of edges that an undirected graph with no parallel edges can have is n times (n minus 1) over 2, better known as n choose 2. So why does it need at least n minus 1 edges if it's going to be in one piece? Well, think about adding the edges of the graph one at a time. Initially you just have a graph with zero edges; the graph has n different pieces, n isolated vertices, and no edges at all. Now, each time you add one edge, the best you can do is take two of the existing pieces and fuse them into one. So the maximum decrease in the number of different pieces of the graph is one each time you add an edge. So, from a starting point of n different pieces, you've got to get down to one piece, and that requires the addition of n minus 1 edges. You can also convince yourself of this by drawing a few pictures and noticing that trees achieve this bound exactly. So for example, here is a four-vertex tree that has three edges; this is a case where m is, indeed, n minus 1. Now, for the upper bound, why can't you have more than n choose 2? Well, it's clear that the largest number of edges you can have is in the complete graph, where every single pair of vertices has an edge between them. Again, there are no parallel edges, and edges are unordered, so there are at most n choose 2 possibilities of where to put an edge. So again, if n equals 4, here would be an example with the maximum possible number of edges, six. So now that I've got you thinking about how the number of edges can vary with the number of vertices, let me talk about the distinction between sparse and dense graphs.
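The two bounds above can be checked with a quick sketch. The particular graphs here, a path and a complete graph on four vertices, are just illustrative examples, not from the lecture:

```python
from itertools import combinations

n = 4

# Minimum: a tree on n vertices (here, a path) is connected with n - 1 edges.
path_edges = [(i, i + 1) for i in range(n - 1)]
assert len(path_edges) == n - 1  # 3 edges when n = 4

# Maximum: the complete graph has one edge per unordered pair of vertices.
complete_edges = list(combinations(range(n), 2))
assert len(complete_edges) == n * (n - 1) // 2  # n choose 2 = 6 when n = 4
```

Adding any of the `complete_edges` not already in `path_edges` keeps the graph connected, which is why m can sit anywhere between these two extremes.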
It's important to distinguish between these two concepts because some data structures and algorithms are better suited for sparse graphs, while other data structures and algorithms are better suited for dense graphs. So to make this precise, let me just put down this very common notation: n is the number of vertices of the graph under discussion, and m is the number of edges. This is quite standard notation; please get used to it and use it yourself. If you reverse these, you will confuse a lot of people who have familiarity with graph algorithms and data structures. Now, one thing we learned from the previous quiz is the following: in most applications, not all applications, but most, m is at least linear in n. Remember, in the quiz you saw that you need at least n minus 1 edges if you want the graph to be connected, and m is also big O of n squared, under the assumption that there are no parallel edges. Now, there are cases where we want to allow parallel edges; in fact, we'll do that in the contraction algorithm for the minimum cut problem. There are also cases where we want to allow the number of edges to drop so low that the graph breaks into multiple pieces, for example, when we talk about connected components. But more often than not, we're thinking about a connected graph with no parallel edges, and then we can pin down the number of edges m to be somewhere between linear in n and quadratic in n. Now, I'm not going to give you a super formal definition of what a sparse or a dense graph is, and people are a little loose with this terminology in practice. But basically, sparse means you're closer to the lower bound, closer to linear, and dense means you're closer to the upper bound, closer to quadratic. Now, I know this leaves ambiguity when the number of edges is something like n to the three halves; usually, in that case, you'd think of it as a dense graph.
So usually, anything with more than n times some logarithmic terms you'd think of as a dense graph. But again, people are a little bit sloppy with this when they talk about graphs. Next, I want to discuss two representations of graphs. We're mostly going to be using the second one in this course, but the first one, the adjacency matrix, I do want to mention briefly, just on this slide. This is a super natural idea where you represent the edges in a graph using a matrix. Let me describe it first for undirected graphs. The matrix is going to be denoted by capital A, and it's a square n-by-n matrix, where n is the number of vertices of the graph. The semantics are that the ijth entry of the matrix is 1 if and only if there's an edge between the vertices i and j in the graph. I'm assuming here that the vertices are named 1, 2, 3, 4, and so on, all the way up to n. It's easy to add bells and whistles to the adjacency matrix to accommodate parallel edges, to accommodate edge weights, and to accommodate directed edges. If you wanted to have parallel edges, you could just have A_ij denote the number of edges between i and j. If edges have weights, you can just have A_ij be the weight of the ij edge. And for a directed graph, you can use plus 1's and minus 1's: if the arc is directed from i to j, you'd set A_ij to be plus 1; if the arc is directed from j to i, you'd set A_ij to be minus 1. There are many metrics by which you can evaluate a data structure or a representation; two important ones I want to discuss here. First of all, the number of resources it requires, and in this context, that's the amount of space the data structure needs. The second is, what operations does the data structure support? So let's just begin with space requirements. What are they for the adjacency matrix? Alright, so the answer, at least with the straightforward way of storing a matrix, is n squared, and this is independent of the number of edges. You could try to beat this down for sparse graphs using sparse matrix tricks, but with the basic idea of just actually representing an n-by-n matrix, you've got n squared entries, and you've got to store one bit in each, whether the edge is there or not. So that's going to give you n squared space. The constants are, of course, very small, because you're just storing one bit per entry, but nonetheless, this is quadratic in the number of vertices. Now, that's going to be fine if you have a dense graph; if the number of edges is as high as n squared, then you're not really wasting anything in this representation. But in a sparse graph, if m is much closer to linear, then this is a super wasteful representation. Let's talk about the adjacency list representation. This is the dominant one we'll be using in this class. This has several ingredients. First, you keep track of both the vertices and the edges as independent entities, so you're going to have an array or a list of each. Then we want these two arrays to cross-reference each other in the obvious way: given a vertex, you want to know which edges it's involved in; given an edge, you want to know what its endpoints are. So, most simply, each edge is going to have two pointers, one for each of its two endpoints. In a directed graph, of course, we'd keep track of which one is the head and which one is the tail. And each vertex is going to point to all of the edges of which it's a member. Now, in an undirected graph, it's clear what I mean by that. In a directed graph, you could do it in a couple of ways. Generally, you'd have a vertex keep track of all of the edges for which it is the tail, that is, all of the edges you can follow one hop out from that vertex. If you wanted to, at the expense of more storage, you could also have a second array where the vertex keeps track of the edges pointing to it, the edges for which it's the head.
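Here is a minimal sketch of those adjacency-list ingredients for a small undirected graph. The particular graph and the variable names are mine, just for illustration:

```python
# Four ingredients: a list of vertices, a list of edges, each edge storing
# its two endpoints, and each vertex storing pointers to its incident edges.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]   # each edge keeps its two endpoints

# Each vertex points to the indices of the edges it belongs to.
incident = {v: [] for v in vertices}
for i, (u, w) in enumerate(edges):
    incident[u].append(i)
    incident[w].append(i)

def neighbors(v):
    """Follow a vertex's edge pointers, then each edge's endpoint pointers."""
    result = []
    for i in incident[v]:
        u, w = edges[i]
        result.append(w if u == v else u)
    return result

# Every vertex-to-edge pointer pairs up with an edge-to-vertex pointer,
# so there are exactly 2m of them in total, nowhere near n squared.
assert sum(len(ptrs) for ptrs in incident.values()) == 2 * len(edges)
```

For example, `neighbors(2)` walks vertex 2's three edge pointers and returns its neighbors `[0, 1, 3]`.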
So let me ask you the same question I did with the adjacency matrix. What is the space required by an adjacency list, as a function of the number of edges m and the number of vertices n of the graph? So the correct answer to this question is the third option: theta of m plus n, which we're going to think of as linear space in the size of the graph. This quiz is a little tricky, so to explain the answer, let me return to the slide with the ingredients of adjacency lists, and let's compute the space for each of these four ingredients separately. Most of them are straightforward. For example, consider the first ingredient. This is just an array or a list of the n vertices, and we just need constant space per vertex to keep track of its existence, so this is going to be theta of n, linear in the number of vertices. Similarly, for the m edges, we just need linear space in the number of edges to remember their existence, so that's going to be theta of m. Now, each edge has to keep track of both of its endpoints. That's two pointers, but two is a constant, so for each of the m edges we use constant space to keep track of its endpoints, and that's going to give us another theta of m, constant per edge. Now, about this fourth ingredient you might be feeling kind of nervous, because a vertex in principle could have edges involving all n minus 1 of the other vertices, so the number of pointers at a single vertex could be theta of n. And since you have n vertices, that could be theta of n squared, and certainly in something like a complete graph you really would have that much. But the point is that in sparse graphs, n squared is way overkill for the space needed by this fourth set of pointers. Actually, if you think about it, for each pointer in the fourth category, a vertex pointing to a given edge, there's a pointer in the third category pointing in the opposite direction, from that edge back to that vertex.
So there's actually a one-to-one correspondence between pointers in the third category and pointers in the fourth category. Since the third category has space theta of m, so does the full set of pointers in the fourth category. So, adding up over the four ingredients, we have one theta of n and three thetas of m, and that's going to give us, overall, theta of m plus n. If you'd prefer, another way you could think about this would be theta of the max of m and n; these are the same up to a constant factor. Now, as we discussed on a previous slide, often m is going to be bigger than n, but I wanted to do a generic analysis here which applies even if the graph is not connected, even if it is in multiple pieces. So the space of the adjacency list is, within a constant factor, the same as the number of ingredients in the graph, the number of vertices plus the number of edges. In that sense, it's exactly what you want. Now, being confronted with these two graph representations I've shown you, I'm sure you're asking: which one should you use? And the answer, as it so often is, is: it depends. It depends on two things. It depends on the density of your graph, that is, on how m compares to n. And it also depends on what kind of operations you want to support. Now, given what we're covering in this class, and also the motivating applications I have in mind, I can give you a basically clean answer to this question for the purposes of these five weeks, which is: we're going to be focusing on adjacency lists. The reason is both of these factors: both the operations we want, and the graph density in the motivating applications. First of all, most of the graph primitives, not all, but most, will involve graph search, and adjacency lists are perfect for doing graph search.
You get to a node, you follow an outgoing arc, you go to another node, you follow an outgoing arc, and so on. So adjacency lists are the perfect thing for graph search. Adjacency matrices are definitely good for certain kinds of graph operations, but they're not things we're really going to be covering in this class. So that's reason one. Reason two is that a lot of the motivation for graph primitives these days comes from massive, massive networks. I mentioned earlier how the web can be fruitfully thought of as a directed graph, where the vertices are individual web pages and directed arcs correspond to hyperlinks, going from the page containing the hyperlink to the page the hyperlink points to. Now, it's hard to get an exact measurement of the web graph, but a conservative lower bound on the number of vertices is something like 10 billion, that is, 10 to the 10. Now, that's pushing the limits of what computers can do, but it's within the limits. So if you work hard, you can actually operate on graphs with 10 to the 10 nodes. Now, suppose we used the adjacency matrix representation. If n is 10 to the 10, then n squared is going to be like 10 to the 20, and now we're getting close to the estimated number of atoms in the known universe. So that is clearly not feasible now, and it's not going to be feasible ever. So the adjacency matrix representation is totally out for huge sparse graphs like the web graph. Adjacency lists? Well, the degree, on average, in the web is thought to be something like 10, so the number of edges is only going to be something like 10 to the 11, and the adjacency list representation will be proportional to that. Again, that's really pushing what we can do with current technology, but it is within the limits. So using that representation, we can do nontrivial computations on graphs, even at the scale of the web graph.
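As a quick back-of-the-envelope check of those numbers (the vertex count and average degree are the rough estimates from the discussion, not measurements):

```python
n = 10**10           # conservative lower bound on web-graph vertices
avg_degree = 10      # rough average degree of the web graph
m = n * avg_degree   # roughly 10**11 edges

# Adjacency matrix: one entry per pair of vertices.
matrix_entries = n * n
assert matrix_entries == 10**20   # hopeless, near the atom count of the universe

# Adjacency list: storage linear in vertices plus edges.
list_units = n + m
assert list_units == 11 * 10**10  # about 10**11, barely feasible but feasible
```

The nine-order-of-magnitude gap between `matrix_entries` and `list_units` is the whole argument for adjacency lists on sparse graphs at this scale.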