1
00:00:02,820 --> 00:00:06,698
The designer analysis of algorithms is 
the interplay between on the one hand 

2
00:00:06,698 --> 00:00:10,778
general principles and on the other hand 
its stantiations of those principles to 

3
00:00:10,778 --> 00:00:13,700
solve specific problems. 
While there's no silver bullet in 

4
00:00:13,700 --> 00:00:17,830
algorithm design, no one technique which 
solves every computational problem that's 

5
00:00:17,830 --> 00:00:21,658
ever going to come up. There are general 
design principles which have proven 

6
00:00:21,658 --> 00:00:25,486
useful over and over again over the 
decades for solving problems that arise 

7
00:00:25,486 --> 00:00:29,163
in different application domains. 
Those, of course, are the principles that 

8
00:00:29,163 --> 00:00:32,437
we focus on in this class. 
For example, in part one we studied the 

9
00:00:32,437 --> 00:00:36,064
divide and conquer algorithm design 
paradigm, principles of graph search 

10
00:00:36,064 --> 00:00:39,626
amongst others. 
On the other hand, we study specific 

11
00:00:39,626 --> 00:00:44,391
substantiations of these techniques. So 
in part one, we studied divide and 

12
00:00:44,391 --> 00:00:48,568
conquer and how it applies to say, 
Strassen matrix multiplication, merge 

13
00:00:48,568 --> 00:00:51,892
short and quicksort. 
In graph search, we culminated with the 

14
00:00:51,892 --> 00:00:55,653
rightfully famous Dijkstra's algorithm 
for computing shortest paths. 

15
00:00:55,653 --> 00:00:59,636
This, of course, is useful not just, 
because as any card-carrying computer 

16
00:00:59,636 --> 00:01:04,062
scientist or programmer, you want to know 
about what these algorithms are and what 

17
00:01:04,062 --> 00:01:08,267
they do, but it also gives us a toolbox, 
a suite of four free primitives, which we 

18
00:01:08,267 --> 00:01:12,250
can apply to our own computational 
problems as a building block in some 

19
00:01:12,250 --> 00:01:15,036
larger program. 
Part two of the course will continue this 

20
00:01:15,036 --> 00:01:17,132
narrative. 
We'll learn very general algorithm 

21
00:01:17,132 --> 00:01:19,041
paradigms. 
Like greedy algorithms, dynamic 

22
00:01:19,041 --> 00:01:22,162
programming algorithms and many 
applications, including a number of 

23
00:01:22,162 --> 00:01:24,258
algorithms for the greatest hits 
compilation. 

24
00:01:24,258 --> 00:01:27,984
And in this video and the next, I want to 
whet your appetite for what's to come, by 

25
00:01:27,984 --> 00:01:31,570
plucking out two of the applications that 
we'll study in detail later in the 

26
00:01:31,570 --> 00:01:33,759
course. 
Specifically, in the dynamic programming 

27
00:01:33,759 --> 00:01:36,647
section of the course. 
First of all, for both of these problems, 

28
00:01:36,647 --> 00:01:40,512
I think their importance is self evident. 
I don't think I'll have to really discuss 

29
00:01:40,512 --> 00:01:43,819
why these are interesting problems. 
Why, in some sense, we really need to 

30
00:01:43,819 --> 00:01:46,474
solve these two problems. 
Secondly, these are quite tricky 

31
00:01:46,474 --> 00:01:50,049
computational problems. 
And I would expect that most of you do 

32
00:01:50,049 --> 00:01:53,571
not currently know good algorithms for 
these problems and it would be 

33
00:01:53,571 --> 00:01:56,680
challenging to design one. 
Third, by the end of this class you will 

34
00:01:56,680 --> 00:01:59,065
know efficient algorithms for both of 
these problems. 

35
00:01:59,065 --> 00:02:00,955
In fact, you'll know something much 
better. 

36
00:02:00,955 --> 00:02:04,465
You'll know general algorithm design 
techniques which solve as a special case 

37
00:02:04,465 --> 00:02:08,155
these two problems and have the potential 
to solve problems coming up in your own 

38
00:02:08,155 --> 00:02:11,021
projects as well. 
And one comment before we get started on 

39
00:02:11,021 --> 00:02:13,993
these two videos. 
They're both at a higher level than most 

40
00:02:13,993 --> 00:02:17,375
of the class, by which I mean there won't 
be any equations or math. 

41
00:02:17,375 --> 00:02:21,321
There won't be any concrete pseudo-code, 
and I'll be glossing over lots of 

42
00:02:21,321 --> 00:02:23,884
details. 
The point is just to convey the spirit of 

43
00:02:23,884 --> 00:02:27,778
what we're going to be studying, and to 
illustrate the range of applications of 

44
00:02:27,778 --> 00:02:35,160
the techniques that we're going to learn. 
So what I want to talk about first is 

45
00:02:35,160 --> 00:02:40,045
distributed shortest path routing and why 
it is fundamental to how the internet 

46
00:02:40,045 --> 00:02:42,854
works. 
So let me begin with a kind of non very 

47
00:02:42,854 --> 00:02:46,580
mathematical claim. 
I claim that we can usefully think of the 

48
00:02:46,580 --> 00:02:51,160
internet as a graph, as a collection of 
verticies and a collection of edges. 

49
00:02:51,160 --> 00:02:54,491
So this is clearly an, clearly an 
ambiguous statement. 

50
00:02:54,491 --> 00:02:57,134
There's many things I might mean as we'll 
discuss. 

51
00:02:57,134 --> 00:03:01,206
But here's the primary interpretation I 
want you to have for this particular 

52
00:03:01,206 --> 00:03:03,691
video. 
So to specify this, the vertices I intend 

53
00:03:03,691 --> 00:03:06,440
to be the end-hosts and the routers of 
the internet. 

54
00:03:06,440 --> 00:03:10,036
So machines that generate traffic, 
machines that consume traffic, and 

55
00:03:10,036 --> 00:03:13,050
machines that help traffic get from one 
place to another. 

56
00:03:13,050 --> 00:03:17,856
So the edges are going to be directed and 
they are meant to represent physical or 

57
00:03:17,856 --> 00:03:22,662
wireless connections indicating that one 
machine can talk directly to another one 

58
00:03:22,662 --> 00:03:27,780
by either a physical link between the two 
or a direct wireless connection. 

59
00:03:27,780 --> 00:03:32,572
So it's common that you'll have edges in 
both directions, so that if machine A can 

60
00:03:32,572 --> 00:03:37,131
talk to machine B directly, then also 
machine B can talk directly to machine A, 

61
00:03:37,131 --> 00:03:41,515
but you definitely want to allow the 
possibility of asymmetric communication. 

62
00:03:41,515 --> 00:03:46,250
So, for example, imagine I send an email 
from my Stanford account to one of my old 

63
00:03:46,250 --> 00:03:49,231
mentors at Cornell, where I did my 
graduate studies. 

64
00:03:49,231 --> 00:03:53,965
So this piece of data, this email, has to 
somehow migrate from my machine local at 

65
00:03:53,965 --> 00:03:56,771
Stanford to my mentor's machine over at 
Cornell. 

66
00:03:56,771 --> 00:04:00,623
So how does that actually happen? 
Well, initially there's a phase of, sort 

67
00:04:00,623 --> 00:04:04,519
of local transportation, so, this piece 
of data has to get from my local machine 

68
00:04:04,519 --> 00:04:08,366
to a place within the Stanford network 
that can talk to the rest of the world. 

69
00:04:08,366 --> 00:04:12,262
Just like if I was trying to travel to 
Cornell, I would have to first use local 

70
00:04:12,262 --> 00:04:16,208
transportation to get to San Francisco 
airport and only from there could I take 

71
00:04:16,208 --> 00:04:18,862
an airplane. 
So this machine from which data can 

72
00:04:18,862 --> 00:04:23,217
escape from the Stanford network to the 
outside world is called the gateway 

73
00:04:23,217 --> 00:04:25,968
router. 
The Stanford gateway router passes it on 

74
00:04:25,968 --> 00:04:28,718
to a networks, whose job is to cross the 
country. 

75
00:04:28,718 --> 00:04:33,016
So last I checked, the commercial 
internet service provider of Stanford was 

76
00:04:33,016 --> 00:04:37,715
Cogent so they, of course, have their own 
gateway router which can talk to the 

77
00:04:37,715 --> 00:04:41,846
Stanford one and vice versa. 
And of course, these two nodes and the 

78
00:04:41,846 --> 00:04:46,018
edges between them are just this tiny, 
tiny, tiny piece embedded in this massive 

79
00:04:46,018 --> 00:04:49,345
graph, comprising all the end hosts and 
routers of the internet. 

80
00:04:49,345 --> 00:04:53,412
So that's the main version of a graph 
that we're going to talk about in this 

81
00:04:53,412 --> 00:04:57,689
video, but let me just pause to mention a 
couple of other graphs that are related 

82
00:04:57,689 --> 00:05:00,700
to the internet, and quite interesting in 
their own right. 

83
00:05:00,700 --> 00:05:05,699
So one graph that is generated an 
enormous amount of an interest in study 

84
00:05:05,699 --> 00:05:10,158
is the graph induces by the web. 
So here, the vertices are going to 

85
00:05:10,158 --> 00:05:15,022
represent web pages and the edges which 
is certainly directed represent 

86
00:05:15,022 --> 00:05:18,960
hyperlinks. 
Not one web page points to another one. 

87
00:05:18,960 --> 00:05:25,900
So for example, my homepage is one node 
in this massive, massive graph. 

88
00:05:25,900 --> 00:05:31,411
And as you might expect, there is a link 
from my home page to the course page for 

89
00:05:31,411 --> 00:05:35,492
this class. 
It is of course essential to use directed 

90
00:05:35,492 --> 00:05:40,227
edges to faithfully model the web. 
There is for example, no directed edge 

91
00:05:40,227 --> 00:05:43,890
from this courses homepage to my own 
homepage at Stanford. 

92
00:05:43,890 --> 00:05:47,751
So the web really exploded around, you 
know, mid 90's, late 90's, So for the 

93
00:05:47,751 --> 00:05:51,667
past 15 plus years, there's been lots of 
research, about the web graph. 

94
00:05:51,667 --> 00:05:55,964
I'm sure you won't be surprised to hear 
that, you know, around the middle of the 

95
00:05:55,964 --> 00:06:00,152
last decade, people got extremely excited 
about properties of social networks. 

96
00:06:00,152 --> 00:06:03,415
Those, of course, can also be fruitfully 
thought of as graphs. 

97
00:06:03,415 --> 00:06:07,222
Here, the vertices are going to be 
people, and the lengths are going to 

98
00:06:07,222 --> 00:06:10,540
denote relationships. 
So, for example, friend relationships and 

99
00:06:10,540 --> 00:06:13,260
Facebook or the following relationship on 
Twitter. 

100
00:06:14,940 --> 00:06:19,601
So notice the different social networks 
may correspond to undirected or directed 

101
00:06:19,601 --> 00:06:22,364
graphs. 
Facebook for example corresponding to an 

102
00:06:22,364 --> 00:06:26,720
undirected graph, Twitter corresponding 
to a directed graph. 

103
00:06:26,720 --> 00:06:30,276
So let's now return to the first 
interpretation I wanted to focus on, 

104
00:06:30,276 --> 00:06:34,296
that where the vertices are in-hosts and 
routers and it does represent direct 

105
00:06:34,296 --> 00:06:37,955
physical or wireless connections 
indicating that two machines can talk 

106
00:06:37,955 --> 00:06:41,047
directly to each other. 
So going back to that graph, let's go 

107
00:06:41,047 --> 00:06:44,552
back to the story where I'm sending an 
email to somebody at Cornell. 

108
00:06:44,552 --> 00:06:48,830
And this data has to travel from my local 
machine to some local machine at Cornell. 

109
00:06:48,830 --> 00:06:53,086
So, in particular, this piece of data has 
to get from the Stanford gateway router, 

110
00:06:53,086 --> 00:06:57,237
in effect to the airport for Stanford's 
network to the Cornell gateway router. 

111
00:06:57,237 --> 00:07:00,110
So there will be landing airport over on 
Cornell's side. 

112
00:07:00,110 --> 00:07:04,207
Now it's not easy to figure out exactly 
out what the structure of the routes 

113
00:07:04,207 --> 00:07:08,410
between Stanford and Cornell look like. 
But one thing I can promise you is that 

114
00:07:08,410 --> 00:07:12,454
there is not a direct physical link 
between the Stanford gateway router and 

115
00:07:12,454 --> 00:07:15,913
the Cornell gateway router. 
Any route between the two is going to 

116
00:07:15,913 --> 00:07:18,840
comprise multiple hops. 
It will have intermediate stops. 

117
00:07:18,840 --> 00:07:21,704
And there's not going to be a unique such 
route. 

118
00:07:21,704 --> 00:07:26,656
So if you have the choice between taking 
one route which stops in Houston and then 

119
00:07:26,656 --> 00:07:31,131
Atlanta and then in Washington D.C., 
how would you compare that to one which 

120
00:07:31,131 --> 00:07:35,661
stops in Salt Lake City and Chicago. 
Well hopefully your first instinct and a 

121
00:07:35,661 --> 00:07:40,076
perfectly good idea is all else being 
equal, prefer the path that is in some 

122
00:07:40,076 --> 00:07:43,308
sense the shortest. 
Now in this context, shortest could mean 

123
00:07:43,308 --> 00:07:47,032
many things, and it's interesting to 
think about different definitions. 

124
00:07:47,032 --> 00:07:51,181
But for simplicity let's just focus on 
the fewest number of hops, equivalently 

125
00:07:51,181 --> 00:07:55,775
the fewest number of intermediate stops. 
Well, if we want to actually execute this 

126
00:07:55,775 --> 00:08:00,464
idea, we clearly need an algorithm that 
given a source and destination computes 

127
00:08:00,464 --> 00:08:04,853
the shortest path between the two. 
So hopefully you feel well equipped to 

128
00:08:04,853 --> 00:08:09,723
discuss this problem, because one of the 
highlights of part one of this class, was 

129
00:08:09,723 --> 00:08:14,233
the discussion of Dijkstra's shortest 
pathalum rhythm in a blazing fast of 

130
00:08:14,233 --> 00:08:18,280
limitation using heaps that run's in 
almost linear time. 

131
00:08:18,280 --> 00:08:21,917
We did mention one caveat when we 
discussed Dijkstra's algorithm mainly 

132
00:08:21,917 --> 00:08:25,815
that it requires all edge lengths to be 
non negative but in the context of 

133
00:08:25,815 --> 00:08:30,076
internet routing almost any edge metric 
you'd imagine using will satisfy this non 

134
00:08:30,076 --> 00:08:33,257
negativity assumption. 
There is, however, a serious issue with 

135
00:08:33,257 --> 00:08:37,493
trying to apply Dijkstra's shortest path 
algorithm off the shelf to solve this 

136
00:08:37,493 --> 00:08:41,890
distributed internet routing problem, and 
the issue was caused by the just massive, 

137
00:08:41,890 --> 00:08:46,234
distributed scale of modern day internet. 
Probably back in the 1960's, when you had 

138
00:08:46,234 --> 00:08:50,363
the 12-note ARPANET, you could get away 
with running Dijkstra's shortest path 

139
00:08:50,363 --> 00:08:52,829
algorithm, but not in the twenty-first 
century. 

140
00:08:52,829 --> 00:08:56,690
It's not feasible for the Stanford 
gateway router to maintain locally a 

141
00:08:56,690 --> 00:08:59,640
reasonably accurate model of the entire 
internet graph. 

142
00:08:59,640 --> 00:09:03,498
So how can we elude this issue? 
Is it fundamental that because the 

143
00:09:03,498 --> 00:09:07,823
internet is so massive, it's impossible 
to run any shortest path algorithm? 

144
00:09:07,823 --> 00:09:12,383
Well, the ray of hope would be if we 
could have a shortest path algorithm that 

145
00:09:12,383 --> 00:09:16,650
admitted a distributed implementation. 
Whereby, a node could just interact, 

146
00:09:16,650 --> 00:09:21,268
perhaps iteratively, with its neighbors 
with the machines to which its directly 

147
00:09:21,268 --> 00:09:23,898
connected. 
And yet, somehow converge to having 

148
00:09:23,898 --> 00:09:26,880
accurate shortest paths to all of the 
destinations. 

149
00:09:26,880 --> 00:09:30,963
So perhaps, the first thing you'd try 
would be to seek out an implementation of 

150
00:09:30,963 --> 00:09:34,374
Dijkstra's algorithm, where each vertex 
uses only local computation. 

151
00:09:34,374 --> 00:09:37,269
That seems hard to do. 
If you look at the pseudo-code of 

152
00:09:37,269 --> 00:09:40,267
Dijkstra, it just doesn't seem like a 
localizable algorithm. 

153
00:09:40,267 --> 00:09:44,040
So instead, what we're going to do is 
we're going to learn a different shortest 

154
00:09:44,040 --> 00:09:45,849
path algorithm. 
It's also a classic. 

155
00:09:45,849 --> 00:09:48,124
Definitely on the greatest hits 
compilation. 

156
00:09:48,124 --> 00:09:52,173
It's called the Bellman-Ford Algorithm. 
So the Bellman-Ford algorithm, as you'll 

157
00:09:52,173 --> 00:09:55,070
see, can be thought of as a dynamic 
programming algorithm. 

158
00:09:55,070 --> 00:09:58,323
And indeed, it correctly computes 
shortest path using only local 

159
00:09:58,323 --> 00:10:00,966
computation. 
Each vertex only communicates in rounds 

160
00:10:00,966 --> 00:10:03,864
with the other vertices to which it's 
directly connected. 

161
00:10:03,864 --> 00:10:07,473
As a bonus, we'll see this algorithm also 
handles negative edge lengths. 

162
00:10:07,473 --> 00:10:09,658
Which of course, Dijkstra's algorithm was 
not. 

163
00:10:09,658 --> 00:10:12,047
But don't think Dijkstra's algorithm is 
obsolete. 

164
00:10:12,047 --> 00:10:15,860
It still has faster running time in 
situations where you can get away with 

165
00:10:15,860 --> 00:10:19,171
centralized computation. 
Now, it was really kind of amazing here 

166
00:10:19,171 --> 00:10:22,618
is that the Bellman-Ford algorithm, it 
dates back to the 1950's. 

167
00:10:22,618 --> 00:10:25,387
So, that's not just pre-internet, that 
pre-ARPANET. 

168
00:10:25,387 --> 00:10:29,117
So that's before the internet was even a 
glimmer in anybody's eye. 

169
00:10:29,117 --> 00:10:33,356
And yet, it really is the foundation for 
modern internet routing protocol's. 

170
00:10:33,356 --> 00:10:37,877
Needless to say, there's a lot of really 
hard engineering work and further ideas 

171
00:10:37,877 --> 00:10:42,511
required to translate the concept from 
Bellman-Ford to actually doing routing in 

172
00:10:42,511 --> 00:10:46,523
the very complex modern day internet. 
But yet, those protocol's at their 

173
00:10:46,523 --> 00:10:50,480
foundation, goes all the way back to the 
Bellman-Ford algorithm.