In this video, we'll start developing the Bellman-Ford algorithm as an instantiation of the dynamic programming paradigm. We'll develop it in the usual way, by understanding how optimal solutions must necessarily be composed of optimal solutions to smaller subproblems.

So, let me just quickly review what we learned from the subtleties of the previous video. In particular, let me be precise about what I mean by solving the single-source shortest path problem when the input graph can have negative edge costs. The input, as usual, is a directed graph. Every edge e has an edge cost c_e, we're allowing those costs to be negative, and we're given a source vertex s.

So, what do we want to compute? Well, ideally we'd like to compute the shortest-path distance from s to every other destination v, where, as usual, the length of a path is just the sum of its edge costs.

Now, in the previous video, we discussed how this goal is problematic for input graphs that have a negative cycle. If you allow paths that contain cycles, then the objective isn't even well defined; the shortest-path length is, in some sense, minus infinity. And if you don't allow cycles, then it's computationally intractable, it's an NP-hard problem. So, if you fail to compute shortest paths, then at least give us an excuse: output a negative cycle in the input graph as the reason for your failure to compute shortest-path distances.

So, for this video, I'm just going to develop the Bellman-Ford algorithm for input graphs that do not contain negative cycles. And, of course, these are the cases in which we had better actually output the correct shortest paths. Once we've got the Bellman-Ford algorithm up and running for input graphs that do not have negative cycles, we'll see that it's a fairly easy matter to extend it to the general case. And for input graphs that do have negative cycles, we'll be able to output such a cycle without affecting the running time of the basic algorithm.

So, with an eye toward a dynamic programming algorithm for the shortest path problem, let's start thinking about optimal substructure: the way in which optimal solutions, that is shortest paths, must necessarily be composed of optimal solutions, that is shortest paths, to smaller subproblems.
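Just to fix ideas for what follows, here is a minimal sketch of the kind of input we're assuming: a directed graph given as a list of edges with possibly negative costs, a designated source vertex, and path length defined as the sum of edge costs. The representation, the example edges, and the helper name are my own illustration, not anything specified in the lecture.

```python
# A minimal sketch (my own illustration) of the assumed input: a directed
# graph as (tail, head, cost) triples, with possibly negative costs, plus
# a designated source vertex s.
edges = [
    ("s", "v", 2),
    ("v", "t", -1),   # negative edge costs are allowed
    ("s", "t", 4),
]
source = "s"

cost = {(u, w): c for (u, w, c) in edges}

def path_length(path):
    """Length of a path = the sum of its edge costs; `path` is a vertex list."""
    return sum(cost[(path[i], path[i + 1])] for i in range(len(path) - 1))

print(path_length(["s", "v", "t"]))  # 2 + (-1) = 1
print(path_length(["s", "t"]))       # 4
```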
Now, the formal optimal substructure lemma is going to be a little cumbersome to state, so let me just spend a slide telling you what the deal is and how to think about it. It's often tricky to figure out how to apply the dynamic programming paradigm to graph problems. One of the reasons for that is that graphs aren't really naturally sequential objects; they're not ordered in any obvious way. We're just given some unordered set of vertices and some unordered set of edges. Now, one special case, of course, would be path graphs, our very first example of dynamic programming; there, there was an obvious ordering on the vertices of the graph. But unlike, say, alignments of sequences, where again there's an obvious ordering from left to right, in a graph problem it's not so clear how to order things.

For this particular graph problem, however, it seems like we've got something going for us, which is the sequentiality of our output, right? Paths are certainly sequential objects, so that gives us hope that we can state and prove an optimal substructure lemma talking about how optimal solutions, that is shortest paths, must be built up from, in some sense, smaller shortest paths. Unfortunately, it's still far from obvious how to really define smaller and larger subproblems. So, for example, you'd love to have some intelligent ordering in which to process the possible destinations v, but it's not at all clear how to do that without knowing the shortest-path distances in the first place. So, this is a subtle issue that I encourage you to think hard about in the privacy of your own home, so you can better appreciate what's non-trivial about the Bellman-Ford solution.

So, the key idea in the Bellman-Ford algorithm, and it's a good one, is to introduce an additional parameter that gives us an unambiguous definition of subproblem size. What this parameter is going to do, for a given destination v, is control how many edges we allow in the paths from the source s to this destination v that we're looking at. To explain this, let's look at the following green graph with five vertices on the right-hand part of the slide.

So, in the Bellman-Ford algorithm, we're going to have one subproblem for each possible destination and each possible restriction on the number of edges in a path.
So, for example, suppose we're looking at the source s and the destination t, and we think about paths that have two edges or fewer. Well, in that case, the shortest-path length from s to t in this graph, subject to that constraint, is four; the bottom path, which has three edges, is not an option, since we're only permitting two edges or fewer in this current subproblem. Now, if we bump up the edge budget to three, then the corresponding shortest-path distance drops from four to three, because all of a sudden we can make use of the 3-hop path on the bottom. And again, the point here is that this gives us an unambiguous notion of subproblem size: the more edges you are allowed to use in your paths from the source to a given destination, the bigger that subproblem.

Alright. So, that discussion was deliberately vague, to give you some context for the following formal statement of the optimal substructure lemma. We're going to go ahead and state and prove this optimal substructure lemma in full generality; we're going to be working with arbitrary input graphs, which might or might not have a negative cycle. Like all of these optimal substructure lemmas, the statement is going to have the form that the optimal solution to a subproblem has to be one of a small number of candidates that are composed in simple ways from optimal solutions to smaller subproblems.

So, how do we index a given subproblem? Well, there's going to be a destination v that we care about. And, as we said on the last slide, there's going to be a budget on how many edges you're allowed to use in a path from s to v, and we're going to use i to denote that budget. So, there's going to be one subproblem for every possible destination v and every possible edge budget i, a positive integer, one or bigger.

Suppose that P is an optimal solution, meaning that amongst all paths that start at s, end at v, and contain at most i edges, P has the minimum sum of edge costs, the minimum length. Now, a subtle point: because we're proving this lemma in its full generality, that is, also for input graphs that might have negative cycles, we need to go ahead and allow this path P to use cycles if it wants, including potentially using a negative cycle multiple times. That is allowed. Notice that we're not worried about this path using a cycle an infinite number of times, because we have a finite budget i on how many edges it can contain.
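To make the subproblem value concrete, here is a tiny brute-force computation of it: the minimum length of an s-to-v path with at most i edges, cycles allowed. The five-vertex graph below is my own hypothetical illustration, chosen only so that the numbers match the example just described (a two-edge s-t route of length four and a three-edge route of length three); it is not the lecture's actual green graph.

```python
from math import inf

# Hypothetical 5-vertex graph (my illustration, not the lecture's figure).
edges = [("s", "a", 2), ("a", "t", 2),                  # 2-hop s->t route, length 4
         ("s", "b", 1), ("b", "c", 1), ("c", "t", 1)]   # 3-hop s->t route, length 3

def constrained_dist(u, dest, budget):
    """Min length of a path from u to dest using at most `budget` edges,
    found by exhaustively trying every outgoing edge. Cycles are allowed;
    the finite edge budget keeps the search finite."""
    best = 0 if u == dest else inf          # option: stop here (only valid at dest)
    if budget > 0:                          # option: spend one edge and recurse
        best = min(best,
                   min((c + constrained_dist(w, dest, budget - 1)
                        for (tail, w, c) in edges if tail == u),
                       default=inf))
    return best

print(constrained_dist("s", "t", 2))  # 4: only the 2-hop route fits the budget
print(constrained_dist("s", "t", 3))  # 3: now the 3-hop route is allowed
```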
So, with that setup, what are the possible candidates for what capital P could possibly be? Well, we're going to have our usual sort of trivial case one, which is that if this path P doesn't bother to use up its full edge budget, that is, if it has i - 1 edges or fewer, well, then naturally P had better be a shortest s-v path with at most i - 1 edges.

So, the non-trivial case is when the shortest path with at most i edges from s to v actually uses its full budget, uses all i of its edges. So now, by analogy with all of our previous dynamic programming algorithms, where we think about plucking off the final part of an optimal solution, here we're going to pluck off the final edge of the path P. What do we get? Well, we get a path with one fewer edge. So, that's good; it's going to correspond to a smaller subproblem, because it has at most i - 1 edges. On the other hand, notice that if we take a path from s to v and pluck off the final edge, we get a path from s to somewhere else. So, we're going to call that somewhere else w.

So, plucking off the final edge from capital P gives us a path P' that starts at s, ends at w, and has at most i - 1 edges. And I hope you can guess that the claim is that it's not just any old path from s to w with at most i - 1 edges; it is, in fact, a shortest such path. Now, in this case two, notice that we of course know that P' has exactly i - 1 edges, not merely at most i - 1 edges. But it's going to be useful to have the stronger assertion claimed here, that P' is optimal amongst all paths that have i - 1 edges or possibly fewer.

Alright. So, this is one of those lemmas that's actually harder to state than it is to prove, so let's just quickly talk through the proof. It's the same as the previous optimal substructure lemmas that we've seen. Case one is totally trivial; it's the obvious contradiction that we've seen in many of our other case ones. Case two is going to be one of our usual cut-and-paste contradictions. So, suppose there is some path Q better than P', that is, Q starts at s, ends at w, contains i - 1 edges or fewer, and the sum of its edge costs is even smaller than that of P'.
Well then, if we just tack on the final hop of P, that is, the edge (w, v), we get a path that starts at s, ends at v, and has at most i edges. And the total sum of edge costs in this path, Q plus the edge (w, v), is strictly less than that of the original path P. But that contradicts the purported optimality of P amongst all paths starting at s, ending at v, and having at most i edges.

So, that's the proof of the optimal substructure lemma; it's simple enough. And as usual, the next step is going to be to compile this lemma into a recurrence, where, informally, what the recurrence is going to do is brute-force search amongst the possible candidates for the optimal solution.

So, to make sure you understand what just happened, let's move on to a quiz. The question is: for a given destination v of some input graph, how many candidates are there for the optimal solution of a subproblem that involves v?

The correct answer is the second one. The answer depends on which destination v you're talking about, and what governs the number of candidates is the in-degree of this vertex, that is, the number of edges of the input graph that have v as their head. Why is this true? Well, case one contributes one possible candidate: it's possible that for a given i and a given v, all you do is inherit the optimal solution for the destination v with an edge budget that is one smaller. Case two may seem like only one other candidate, but actually case two comprises a number of them, one for each choice of the final hop (w, v). Specifically, a given choice of w contributes one candidate optimal solution: the shortest path from s to that choice of w that uses at most i - 1 edges, plus the edge (w, v) tacked on. So, in total, there are 1 + in-degree(v) candidates.
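Anticipating that compilation step, here is one way the recurrence could look in code. This is a sketch under my own naming choices, not the lecture's official pseudocode: A[i][v] denotes the minimum length of an s-to-v path with at most i edges, case 1 of the lemma contributes the candidate A[i-1][v], and case 2 contributes one candidate A[i-1][w] + c_wv per incoming edge (w, v), so each subproblem is a brute-force search over 1 + in-degree(v) candidates.

```python
from math import inf

def shortest_path_table(vertices, edges, s, max_budget):
    """Fill in A[i][v] = min length of an s->v path with at most i edges.

    `edges` is a list of (tail, head, cost) triples; costs may be negative.
    This only evaluates the recurrence; detecting negative cycles and choosing
    how large the edge budget needs to be are deferred, as in the lecture.
    """
    # Base case i = 0: with zero edges we can only reach s itself.
    A = [{v: (0 if v == s else inf) for v in vertices}]

    for i in range(1, max_budget + 1):
        prev, cur = A[i - 1], {}
        for v in vertices:
            best = prev[v]                     # case 1: don't use the extra edge
            for (w, head, c) in edges:         # case 2: one candidate per edge (w, v)
                if head == v:
                    best = min(best, prev[w] + c)
            cur[v] = best
        A.append(cur)
    return A

# Usage on the hypothetical graph from the earlier sketch:
edges = [("s", "a", 2), ("a", "t", 2), ("s", "b", 1), ("b", "c", 1), ("c", "t", 1)]
A = shortest_path_table({"s", "a", "b", "c", "t"}, edges, "s", 4)
print(A[2]["t"], A[3]["t"])  # 4 3, matching the budget-2 and budget-3 examples
```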