Consider the following graph with five vertices. I'll label the edges with their costs in blue.

So let's step through the values of i. Since there are five vertices, i is going to take on the values zero, one, two, three, four, and let's look at the results of each round of subproblem computations.

In the base case, when i equals zero, as always, the distance from s to itself is zero, and for all other vertices the subproblem value is defined as plus infinity. Let me write down the recurrence again, just in case you've forgotten it.

So now we enter the main for loop and we start with i equal to one. We just go through the vertices in arbitrary order and evaluate the recurrence. Node s is going to just inherit the solution from the previous step; it's still happy with the empty path of total length zero. Now, node v is certainly hoping not to have to inherit its solution of plus infinity from the previous round, when i equals zero. And indeed, when i equals 1, the subproblem solution for the vertex v is going to be 2. That's because we can choose the last hop to be the edge (s, v), which has length 2, and add s's subproblem value from the last iteration, when i equals zero.
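For reference, the recurrence being evaluated is the standard Bellman-Ford one: writing A[i, v] for the length of a shortest s-v path using at most i edges, and c_{wv} for the cost of an edge (w, v),

```latex
A[i, v] \;=\; \min\left\{\, A[i-1, v],\;\; \min_{(w, v) \in E} \bigl( A[i-1, w] + c_{wv} \bigr) \right\}
```

with the base case A[0, s] = 0 and A[0, v] = +infinity for every v other than s.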
By the same reasoning, x's new subproblem value is going to be 4, because we can choose the last hop to be the edge (s, x) and add the cost of that edge, 4, to s's subproblem value from the last iteration, when i equals zero.

Now, the nodes w and t would love to throw out their plus-infinity solutions and get something finite. You might be thinking that, because v and x now have finite distances, those would propagate to the nodes w and t. That will indeed happen, but we're going to have to wait until the next iteration, when i equals 2. The reason is, if you look at the code, if you look at the recurrence: when we compute the subproblem value at a given iteration i, we only make use of the subproblem solutions from the previous iteration, i minus 1. We do not use any of the updates that have already happened in the current iteration i. So because, when i equals zero, A[0, v] and A[0, x] were both plus infinity, we're going to be stuck with A[1, w] and A[1, t] both being plus infinity as well.

So now let's move on to the next iteration of the outer for loop, when i equals 2. The subproblem solution at s is not going to change; you're not going to do better than zero, so it's going to stay that way.
Similarly, at the vertex v, you're not going to do better than 2, so that's going to stay the same at this iteration. Something interesting happens at the vertex x, however. In the recurrence, you certainly have the option of inheriting the previous solution, so one option would be to set A[2, x] equal to 4, but there's actually a better choice. The recurrence is going to come out even smaller, specifically if we choose the last hop to be the unit-cost arc (v, x). We add that unit cost to v's subproblem value from the last iteration, when i equals 1. That was 2, and 2 plus 1 equals 3, so that's going to be the new subproblem value at x in this iteration, when i equals 2.

So, as advertised, the updates to the vertices v and x in the iteration where i equals 1 now, when i equals 2, get propagated on to the vertices w and t. So w and t shed their plus-infinity values and instead pick up the values 4 and 8, respectively. Notice that I've labeled the vertex t with an 8, not with a 7; I've computed A[2, t] to be 8. The reason is, again, the same: updates from this iteration, in particular the fact that x has dropped from 4 to 3, do not get reflected at other nodes in this same iteration.
We have to wait until the next iteration of the outer for loop before they happen. So we're using the stale information about x, namely that when i equals 1 its solution value was 4. That's the information we're using to update t's solution value, so it's 4 plus 4, or 8.

In the penultimate iteration, when i equals 3, most things stay the same. At s, v, x, and w, we've already computed the shortest paths, so they will all just inherit the solution from the previous iteration. But the vertex t will now make use of the improved solution value at the vertex x from the iteration where i equals 2. So its 8 gets updated to a 7, reflecting the improvement at x in the previous iteration.

And at this point, we're actually done. We've computed shortest paths to all destinations, but the algorithm doesn't know that we're done yet. So it still executes the final iteration of the outer for loop, where i equals 4, but everybody just inherits their solutions from the previous round. At that point, the algorithm terminates.

For most of the dynamic programming algorithms that we've discussed, the running time analysis has been trivial. The Bellman-Ford algorithm is a little more interesting from a running time analysis perspective. Please think about it in this quiz.

So, the correct answer is b.
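To make the trace above concrete, here is a small sketch that recomputes the whole table. The edge costs are reconstructed from the values stated in the walkthrough, namely (s, v) = 2, (s, x) = 4, (v, x) = 1, and (x, t) = 4; the edge (v, w) with cost 2 is an assumption, chosen so that w picks up the value 4 at i = 2 as described.

```python
from math import inf

# Edge costs reconstructed from the walkthrough; (v, w) = 2 is an
# assumption that reproduces w's value of 4 at i = 2.
edges = {('s', 'v'): 2, ('s', 'x'): 4, ('v', 'x'): 1,
         ('v', 'w'): 2, ('x', 't'): 4}
vertices = ['s', 'v', 'x', 'w', 't']

def bellman_ford_table(source):
    """Return the DP table A, where A[i][v] is the length of a shortest
    source-v path using at most i edges."""
    n = len(vertices)
    A = [{v: (0 if v == source else inf) for v in vertices}]  # i = 0
    for i in range(1, n):  # i = 1, ..., n - 1
        A.append({})
        for v in vertices:
            # Either inherit A[i-1][v], or use some edge (w, v) as the
            # last hop; only values from iteration i - 1 are consulted,
            # never updates made earlier in the current iteration i.
            best = A[i - 1][v]
            for (w, u), cost in edges.items():
                if u == v:
                    best = min(best, A[i - 1][w] + cost)
            A[i][v] = best
    return A

A = bellman_ford_table('s')
for i, row in enumerate(A):
    print(i, row)
```

Because each round reads only the previous round's values, t sits at 8 in round 2 (using x's stale value of 4) and drops to 7 only in round 3, exactly as in the lecture's trace.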
That is, of all these running time bounds, it's the smallest bound that is actually correct. So let me explain why it's the number of edges times the number of vertices, and also comment on some of the other options.

Answer A, quadratic, n squared: this is the number of subproblems. Right? Subproblems are indexed by a choice of i, which is somewhere between zero and n minus 1, and a choice of a destination v. There are n choices for each of those, so exactly n squared subproblems. If we only ever spent constant time evaluating a subproblem, then indeed the running time of Bellman-Ford would be big-O of n squared. And in most dynamic programming algorithms we've discussed in this class, you do indeed spend only constant time solving each subproblem, the one exception being the optimal binary search trees problem, where in general we spent linear time. Here again, like optimal binary search trees, we might spend more than constant time solving a subproblem, the reason being that we have to do brute-force search through a list of candidates that might be super-constant. Specifically, each arc that comes into the destination v provides one candidate for what the correct solution to this subproblem might be.
So remember, the number of candidates (we had a quiz on this) is proportional to the in-degree of the vertex v, and that could be as large as n minus 1, linear in the number of vertices. That's why the running time of the Bellman-Ford algorithm can, in general, be worse than n squared.

Another interesting answer is C, that it's cubic in n. Indeed, big-O of n cubed is a valid upper bound on the running time of the Bellman-Ford algorithm, but it's not the sharpest possible upper bound. So why is it a valid upper bound? As discussed just now, there's a quadratic number of subproblems. How much work do you do per subproblem? Well, it's proportional to the in-degree of a vertex, and the biggest possible in-degree is n minus 1, big-O of n. So, linear work for each of the quadratic number of subproblems, resulting in cubic running time.

There is, however, a tighter, better analysis of the Bellman-Ford algorithm. Now, why is big-O of mn better than n cubed? Well, what is m? In a sparse graph, m is going to be theta of n, and in a dense graph, m is going to be theta of n squared. So for a dense graph, it's true that big-O of mn is no smaller than n cubed, but if it's not a dense graph, then this really is an improved upper bound. So, why does the bound hold?
Well, think about the total amount of work done over all of the subproblems in the following way. All we're going to do is take the amount of work done in a given iteration of the outer for loop and multiply it by n, the number of iterations of the outer for loop. So how much work do we do in a given iteration, for a given choice of i? Well, it's just the sum of the in-degrees. When we consider the vertex v, we do work proportional to its in-degree, and we consider each vertex v once in the given iteration of the outer for loop.

But we know a much simpler expression for the sum of the in-degrees: the sum is simply equal to m, the number of edges. In any directed graph, the number of edges is exactly equal to the sum of the in-degrees. One easy way to see that is to take your favorite directed graph and imagine you plug the edges into the graph one at a time, starting from the empty set of edges. Each time you plug in a new edge, obviously the number of edges in the graph goes up by one, and also the in-degree of exactly one vertex goes up by one: whichever vertex happens to be the head of the edge you just stuck in. Therefore the sum of the in-degrees and the number of edges are always the same, no matter what the directed graph is.
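The edge-plugging argument is easy to check mechanically; here is a tiny sketch on an arbitrary example edge list:

```python
# Any directed graph works; this edge list is just an arbitrary example.
edges = [('s', 'v'), ('s', 'x'), ('v', 'x'), ('v', 'w'), ('x', 't')]

# Tally in-degrees: each edge bumps the in-degree of its head by one.
in_degree = {}
for tail, head in edges:
    in_degree[head] = in_degree.get(head, 0) + 1

# One edge added, one in-degree incremented, so the totals must match.
print(sum(in_degree.values()) == len(edges))  # True
```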
So that's why the total work is better than n cubed: it's actually big-O of m times n.

A number of optimizations of the basic Bellman-Ford algorithm are possible. Let me wrap up this video with a quick one about stopping early. (See also a separate video about a less trivial space optimization of the algorithm.) In the basic version of the algorithm, the outer for loop runs for n minus 1 iterations, but generally you don't need all of them. We already saw that in our simple example, where the final iteration did no useful work; it just inherited the solutions from the previous iteration.

So, in general, suppose at some iteration earlier than the last one, say with the current index value j, it just so happens that nothing changes: at every destination v, you just reuse the optimal solution that you computed in the previous iteration of the outer for loop. Well, then, if you think about it, what's going to happen in the next iteration? You're going to do exactly the same set of computations with exactly the same set of inputs, so you're going to get exactly the same set of results. That is, it will again be true in the next iteration that everybody just inherits the optimal solutions from the previous one, and that will keep happening over and over again until the rest of time.
So in particular, by the time you get to the (n minus 1)th iteration of the outer for loop, you will have exactly the same set of solution values as you do right now. We've already proven that the results at the end of iteration n minus 1 are correct; those are the real shortest-path distances. If you already have them in your grubby little hands now, well, you may as well abort the algorithm and return those as the final correct shortest-path distances.
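A minimal sketch of the early-stopping variant follows. This is not the lecturer's exact code; the example edge list reuses the costs stated in the walkthrough, with the edge (v, w) of cost 2 an assumed reconstruction, and it assumes no negative cycle is reachable from the source.

```python
from math import inf

def bellman_ford(vertices, edges, source):
    """Shortest-path distances from source; edges maps (tail, head) -> cost.
    Aborts as soon as an iteration changes nothing."""
    dist = {v: (0 if v == source else inf) for v in vertices}
    for _ in range(len(vertices) - 1):  # at most n - 1 useful iterations
        new_dist = {}
        for v in vertices:
            best = dist[v]  # option 1: inherit last round's value
            for (w, u), cost in edges.items():
                if u == v:  # option 2: last hop (w, v)
                    best = min(best, dist[w] + cost)
            new_dist[v] = best
        if new_dist == dist:
            break  # nothing changed, so nothing will ever change again
        dist = new_dist
    return dist

edges = {('s', 'v'): 2, ('s', 'x'): 4, ('v', 'x'): 1,
         ('v', 'w'): 2, ('x', 't'): 4}
print(bellman_ford('svxwt', edges, 's'))
```

On this particular input only the very last round is redundant, so little is saved, but on graphs where shortest paths use far fewer than n minus 1 edges the loop can terminate much earlier.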