Okay, so this video isn't about any particular graph problem or any particular graph algorithm; it's just the preliminaries we need to discuss algorithms on graphs: how do we measure their size, how do we represent them, and so on.

Remember what a graph is? It really has two ingredients. First of all, there's the set of objects we're talking about. Those might be called vertices; synonymously, we might call them nodes. We represent pairwise relationships using edges. These can be undirected, in which case they're unordered pairs, or an edge can be directed from one vertex to another, in which case they're ordered pairs and we have a directed graph.

Now, when we talk about, say, the size of a graph or the running time of an algorithm that operates on a graph, we need to think about what we mean by input size. In particular, for a graph there are really two different parameters that control how big it is, unlike an array. For arrays, we just had a single number, the length. For graphs, we have the number of vertices and the number of edges. Usually, we'll use the notation n for the number of vertices and m for the number of edges.

So the next quiz asks you to think about how the number of edges m can depend on the number of vertices n. In particular, for this quiz, I want you to think about an undirected graph. It has n vertices and no parallel edges, okay? So for a given pair of vertices, there's either zero or one edge between them. Moreover, let's assume that the graph is connected. So I don't want you to think about graphs that have zero edges. Now, I haven't formally defined what it means for a graph to be connected yet, but I hope you get the idea: it means the graph is in one piece, you can't break it into two parts that have no edges crossing between them. So, for such a graph, no parallel edges, in one piece, n vertices, think about what is the minimum number of edges it could possibly have, and what is the maximum number of edges, as a function of n, that it could possibly have.

All right, so the correct option is the first one.
The fewest number of edges a connected undirected graph can have is n - 1, and the maximum number of edges an undirected graph with no parallel edges can have is n(n - 1)/2, better known as n choose 2.

So why does it need at least n - 1 edges if it's going to be in one piece? Well, think about adding the edges one at a time. Initially, you just have a graph with zero edges, so the graph has n different pieces: n isolated vertices and no edges at all. Now, each time you add one edge, the best you can do is take two of the existing pieces and fuse them into one. So the number of different pieces of the graph can decrease by at most one each time you add an edge. From a starting point of n different pieces, you've got to get down to one piece, so that requires the addition of at least n - 1 edges. You can also convince yourself of this by drawing a few pictures and noticing that trees achieve this bound exactly. For example, here is a four-vertex tree that has three edges, so this is a case where m is indeed n - 1.

Now, for the upper bound, why can't you have more than n choose 2? Well, it's clear that the largest number of edges you can have is in the complete graph, where every single pair of vertices has an edge between them. Again, there are no parallel edges and edges are unordered, so there are at most n choose 2 possible places to put an edge. So again, if n equals 4, here is an example with the maximum possible number of edges, six.

So now that I've got you thinking about how the number of edges can vary with the number of vertices, let me talk about the distinction between sparse and dense graphs. It's important to distinguish between these two concepts because some data structures and algorithms are better suited for sparse graphs, and other data structures and algorithms are better suited for dense graphs. To make this precise, let me put down some very common notation: n is the number of vertices of the graph under discussion, and m is the number of edges. This is quite standard notation; please get used to it and use it yourself. If you reverse these, you will confuse a lot of people who are familiar with graph algorithms and data structures.
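To make the quiz bounds concrete, here is a small sketch (my own, not from the lecture) that computes both quantities for a given n; the function name edge_bounds is just an illustration.

```python
from math import comb

def edge_bounds(n):
    """Min and max edge counts for a connected, simple (no parallel edges),
    undirected graph on n vertices."""
    min_edges = n - 1             # a spanning tree achieves this
    max_edges = n * (n - 1) // 2  # the complete graph: n choose 2
    return min_edges, max_edges

print(edge_bounds(4))   # (3, 6), matching the four-vertex examples above
print(comb(4, 2))       # 6, the same maximum via "n choose 2"
```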
Now, one thing we learned from the previous quiz is the following. In most applications, not all applications, but most, m is at least linear in n. Remember from the quiz that m is at least n - 1 if you want the graph to be connected, and m is also O(n squared), under the assumption that there are no parallel edges. Now, there are cases where we want to allow parallel edges; in fact, we'll do that in the contraction algorithm for the minimum cut problem. There are also cases where we want to allow the number of edges to drop so low that the graph breaks into multiple pieces, for example when we talk about connected components. But more often than not, we're thinking about a connected graph with no parallel edges, and then we can pin down the number of edges m to be somewhere between linear in n and quadratic in n.

Now, I'm not going to give you a super formal definition of what a sparse or a dense graph is, and people are a little loose with this terminology in practice. But basically, sparse means you're closer to the lower bound, closer to linear, and dense means you're closer to the upper bound, closer to quadratic. I know this leaves some ambiguity when the number of edges is something like n to the three halves; usually, in that case, you'd think of it as a dense graph. So usually anything more than n times some logarithmic factor, you'd think of as a dense graph. But again, people are a little bit sloppy with this when they talk about graphs.

Next, I want to discuss two representations of graphs. We're mostly going to be using the second one in this course, but the first one, the adjacency matrix, I do want to mention briefly, just on this slide. This is a super natural idea where you represent the edges of a graph using a matrix. Let me describe it first for undirected graphs. The matrix is denoted by capital A, and it's a square n by n matrix, where n is the number of vertices of the graph. The semantics are that the (i, j) entry of the matrix is 1 if and only if there's an edge between the vertices i and j in the graph. I'm assuming here that the vertices are named 1, 2, 3, 4, and so on, all the way up to n.
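As a concrete illustration, here is a minimal sketch (my own, not code from the course) of building this 0/1 adjacency matrix from an edge list, using 0-based vertex labels instead of the 1-to-n naming above.

```python
def adjacency_matrix(n, edges):
    """Build the n-by-n 0/1 adjacency matrix of an undirected graph.

    n     -- number of vertices, labeled 0 .. n-1
    edges -- iterable of pairs (i, j), no parallel edges assumed
    """
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1   # undirected: the matrix is symmetric
    return A

# The four-vertex tree from the quiz discussion: 3 edges, so m = n - 1.
print(adjacency_matrix(4, [(0, 1), (1, 2), (1, 3)]))
```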
It's easy to add bells and whistles to the adjacency matrix to accommodate parallel edges, to accommodate edge weights, or to accommodate directed arcs, that is, directed edges. If you wanted to have parallel edges, you could just have the (i, j) entry denote the number of edges between i and j. If edges have different weights, you can just have the (i, j) entry be the weight of the edge between i and j. And for a directed graph, you can use plus ones and minus ones: if the arc is directed from i to j, you'd set the (i, j) entry to +1; if the arc is directed from j to i, you'd set the (i, j) entry to -1.

There are many metrics by which you can evaluate a data structure or a representation; two important ones I want to discuss here. First of all, the amount of resources it requires, and in this context that's the amount of space the data structure needs. The second is, what operations does the data structure support? So let's begin with space requirements. What are they for the adjacency matrix? All right, so the answer, at least with the straightforward way of storing a matrix, is Theta(n squared), and this is independent of the number of edges. You could try to beat this down for sparse graphs using sparse matrix tricks, but for the basic idea of actually representing an n by n matrix, you've got n squared entries, and you've got to store one bit in each, whether the edge is there or not. So that's going to give you Theta(n squared) space. The constants are, of course, very small, because you're just storing one bit per entry, but nonetheless this is quadratic in the number of vertices. Now, that's going to be fine if you have a dense graph; if the number of edges is as high as n squared, then you're not really wasting anything in this representation. But for a sparse graph, if m is much closer to linear, then this is a super wasteful representation.
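Picking up the bells-and-whistles point from a moment ago, here is one way the plus-one/minus-one directed variant might look (again my own sketch, with the same 0-based labels as before; combining it with weights would be an analogous tweak).

```python
def directed_matrix(n, arcs):
    """The +1/-1 convention for directed graphs: +1 in entry (i, j) if
    there is an arc i -> j, and -1 in entry (j, i) for that same arc."""
    A = [[0] * n for _ in range(n)]
    for i, j in arcs:     # arc directed from i to j
        A[i][j] = +1
        A[j][i] = -1
    return A

# For weighted graphs, the analogous tweak is to store the weight of the
# (i, j) edge in entry (i, j) instead of a 0/1 indicator.
print(directed_matrix(3, [(0, 1), (1, 2)]))
```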
Let's talk about the adjacency list representation. This is the dominant one we'll be using in this class. It has several ingredients. First, you keep track of both the vertices and the edges as independent entities, so you're going to have an array or a list of each. Then we want these two arrays to cross-reference each other in the obvious way: given a vertex, you want to know which edges it's involved in, and given an edge, you want to know what its endpoints are. So, most simply, each edge is going to have two pointers, one for each of its two endpoints; in a directed graph, of course, we'd also keep track of which endpoint is the head and which is the tail. And each vertex is going to point to all of the edges of which it's a member. In an undirected graph, it's clear what I mean by that. In a directed graph, you could do it in a couple of ways. Generally, you'd have a vertex keep track of all of the edges for which it is the tail, that is, all of the edges you can follow one hop out from that vertex. If you wanted to, at the expense of more storage, you could also have a second array where the vertex keeps track of the edges pointing to it, the edges for which it's the head.

So let me ask you the same question I did for the adjacency matrix: what is the space required by an adjacency list, as a function of the number of edges m and the number of vertices n of the graph?

The correct answer to this question is the third option: Theta(m + n), which we're going to think of as linear space in the size of the graph. This quiz is a little tricky, so to explain the answer, let me return to the slide with the ingredients of adjacency lists, and let's compute the space for each of these four ingredients separately. Most of them are straightforward. Consider the first ingredient: this is just an array or a list of the n vertices, and we need only constant space per vertex to keep track of its existence, so this is going to be Theta(n), linear in the number of vertices. Similarly, for the m edges, we just need linear space in the number of edges to remember their existence, so that's Theta(m). Next, each edge has to keep track of both of its endpoints; that's two pointers, but two is a constant, so for each of the m edges we have constant space to keep track of its endpoints, and that gives us another Theta(m), a constant per edge. Now, for the fourth ingredient you might be feeling kind of nervous, because a vertex in principle could have edges involving all n - 1 of the other vertices, so the number of pointers at a single vertex could be Theta(n).
And you have n vertices, so in principle that could be as much as Theta(n squared), and certainly in something like a complete graph you really would have that much. But the point is that in sparse graphs, Theta(n squared) is way overkill compared to the space actually needed by this fourth set of pointers. Actually, if you think about it, for each pointer in the fourth category, a vertex pointing to a given edge, there's a pointer in the third category pointing in the opposite direction, from that edge back to that vertex. So there's actually a one-to-one correspondence between pointers in the third category and pointers in the fourth category. Since the third category takes space Theta(m), so do all of the pointers in the fourth category.

So, adding up over the four ingredients, we have one Theta(n) and three Theta(m)'s, and that gives us Theta(m + n) overall. If you'd prefer, another way to think about this is Theta(max(m, n)); these are the same up to a constant factor. Now, as we discussed on a previous slide, often m is going to be bigger than n, but I wanted to do a generic analysis here, which applies even if the graph is not connected, even if it's in multiple pieces. So the space of an adjacency list is, within a constant factor, the same as the size of the graph, the number of vertices plus the number of edges. In that sense, that's exactly what you want.
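To make these four ingredients concrete, here is a minimal adjacency-list sketch in the same spirit (my own illustration, not code from the course): each edge record stores its two endpoints, and each vertex keeps a list of the edges it belongs to, so the total space is proportional to n + m.

```python
def adjacency_list(n, edges):
    """Adjacency-list representation of an undirected graph.

    Returns (vertex_to_edges, edge_endpoints): edge_endpoints[e] is the pair
    of endpoints of edge e, and vertex_to_edges[v] lists the indices of the
    edges incident to vertex v.  Total space is Theta(n + m).
    """
    edge_endpoints = list(edges)                 # Theta(m): one record per edge
    vertex_to_edges = [[] for _ in range(n)]     # Theta(n): one list per vertex
    for e, (u, v) in enumerate(edge_endpoints):
        vertex_to_edges[u].append(e)             # one pointer per vertex-edge
        vertex_to_edges[v].append(e)             # incidence: Theta(m) in total
    return vertex_to_edges, edge_endpoints

# The four-vertex tree again: vertex 1 is incident to all three edges.
incidence, endpoints = adjacency_list(4, [(0, 1), (1, 2), (1, 3)])
print(incidence)    # [[0], [0, 1, 2], [1], [2]]
print(endpoints)    # [(0, 1), (1, 2), (1, 3)]
```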
Now, being confronted with these two graph representations that I've shown you, I'm sure you're asking: which one should you remember, which one should you use? And the answer, as it so often is, is that it depends. It depends on two things: the density of your graph, that is, how m compares to n, and what kind of operations you want to support. Now, given what we're covering in this class, and also the motivating applications I have in mind, I can give you a basically clean answer to this question for the purposes of these five weeks, which is that we're going to be focusing on adjacency lists. The reason we're going to focus on adjacency lists in this class is both of these reasons: because of the operations we want, and because of the graph density in the motivating applications.

So, first of all, most of the graph primitives, not all, but most, will be dealing with graph search, and adjacency lists are perfect for doing graph search. You're at a node, you follow an outgoing arc to another node, you follow another outgoing arc, and so on. So adjacency lists are the perfect thing for graph search. Adjacency matrices are definitely good for certain kinds of graph operations, but they're not things we're really going to be covering in this class. So that's reason one.

Reason two is that a lot of the motivation for graph primitives these days comes from massive networks. I mentioned earlier how the web can be fruitfully thought of as a directed graph, where the vertices are individual web pages and directed arcs correspond to hyperlinks, going from the page containing the hyperlink to the page the hyperlink points to. Now, it's hard to get an exact measurement of the web graph, but a conservative lower bound on the number of vertices is something like 10 billion, that is, 10 to the 10. That's pushing the limits of what computers can do, but it's within the limits: if you work hard, you can actually operate on graphs with 10 to the 10 nodes. Now, suppose we used an adjacency matrix representation. If n is 10 to the 10, then n squared is going to be something like 10 to the 20, an absolutely enormous number. So that is clearly not feasible now, and it's not going to be feasible ever. The adjacency matrix representation is totally out for huge sparse graphs like the web graph. Adjacency lists, on the other hand: the average degree in the web is thought to be something like 10, so the number of edges is only going to be something like 10 to the 11, and the adjacency list representation will be proportional to that. Again, that's really pushing what we can do with current technology, but it is within the limits. So using that representation, we can do nontrivial computations on graphs, even at the scale of the web graph.
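As a rough back-of-envelope check of these scale arguments (my own numbers and assumptions: one bit per matrix entry, and on the order of one 8-byte machine word per vertex, edge, and incidence pointer in the adjacency list):

```python
n = 10**10          # conservative vertex count for the web graph
m = 10 * n          # assuming average degree around 10, so m ~ 10^11

# Adjacency matrix: n^2 entries at one bit each.
matrix_bytes = n * n / 8
print(f"matrix: ~{matrix_bytes:.1e} bytes")   # ~1.2e+19 bytes -- hopeless

# Adjacency list: a constant number of 8-byte words per vertex and per edge
# (say one per vertex, a few per edge for endpoints and incidence pointers).
list_bytes = 8 * (n + 3 * m)
print(f"list:   ~{list_bytes:.1e} bytes")     # ~2.5e+12 bytes -- a few terabytes
```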