1 00:00:00,427 --> 00:00:03,417 In this video we'll cover Johnson's very cool algorithm. 2 00:00:03,417 --> 00:00:07,636 It shows how to use the re-weighting technique we introduced in the last video 3 00:00:07,636 --> 00:00:12,229 to reduce the all pair shortest path problem in graphs that can have negative 4 00:00:12,229 --> 00:00:16,502 edge lengths to a single invocation of the Bellman-Ford shortest path algorithm 5 00:00:16,502 --> 00:00:19,920 followed by N invocations of Dijkstra's shortest path algorithm. 6 00:00:21,340 --> 00:00:25,978 We concluded last video daydreaming about this best case scenario, where we have a 7 00:00:25,978 --> 00:00:30,107 graph with negative edge links. But somehow we come up with these magical 8 00:00:30,107 --> 00:00:35,140 vertex weights that transforms all of the edge links to be non negative. 9 00:00:35,140 --> 00:00:39,790 And it turns out that the magical vertex weights which will realize this best case 10 00:00:39,790 --> 00:00:42,928 scenario are best computed by a shortest past algorithm. 11 00:00:42,928 --> 00:00:47,466 So before describing Johnson's algorithm in general let me just walk you through 12 00:00:47,466 --> 00:00:53,385 some of the steps in an example. So I've drawn a directed graph with six 13 00:00:53,385 --> 00:00:58,019 vertices and seven edges. I've annotated each edge with its cost in 14 00:00:58,019 --> 00:01:01,130 blue. Notice that some of the edges do indeed 15 00:01:01,130 --> 00:01:06,040 have negative costs so it is not currently an option to run Dijkstra's 16 00:01:06,040 --> 00:01:11,306 algorithm on this graph. There is no negative cycle however in 17 00:01:11,306 --> 00:01:15,464 this graph, it only has one cycle, the directed triangle at the top and it's 18 00:01:15,464 --> 00:01:18,402 overall cost is one. So we could run the Bellman Ford 19 00:01:18,402 --> 00:01:24,311 Algorithm, on this graph if we wanted. So our strategy is to compute a magical 20 00:01:24,311 --> 00:01:29,132 set of vertex weights so that when we transform the edge lengths using the 21 00:01:29,132 --> 00:01:34,211 re-weighting technique described in the previous video we wind up with a graph 22 00:01:34,211 --> 00:01:37,640 that now has only non-negative edge lengths. 23 00:01:37,640 --> 00:01:39,994 So where do these vertex weights come from? 24 00:01:39,994 --> 00:01:43,828 Well, the great idea in Johnson's algorithm is to compute them using a 25 00:01:43,828 --> 00:01:46,840 subroutine for the single-source, shortest path problem. 26 00:01:48,480 --> 00:01:53,719 To implement this, we need a well defined instance of the single source shortest 27 00:01:53,719 --> 00:01:57,860 path problem. In particular, we need a source vertex. 28 00:01:57,860 --> 00:02:01,225 So that's a pretty good idea. There's only one small problem with it, 29 00:02:01,225 --> 00:02:04,837 which is that when you pick your arbitrary source vertex it might not be 30 00:02:04,837 --> 00:02:08,697 able to reach all of the other vertices. And to get our magical vertex weights, 31 00:02:08,697 --> 00:02:11,864 we're really going to want finite shortest path distances from our 32 00:02:11,864 --> 00:02:15,625 arbitrary source to everybody else. For example, in the graph that I've drawn 33 00:02:15,625 --> 00:02:19,386 here on the slide, it doesn't matter which of the six vertices you choose as 34 00:02:19,386 --> 00:02:22,355 your source vertex. You will not be able to reach all of the 35 00:02:22,355 --> 00:02:25,445 other five. So how do we get around this issue? 36 00:02:25,445 --> 00:02:29,587 Well, with a simple hack. We're just going to add a new vertex, so 37 00:02:29,587 --> 00:02:34,053 a seventh vertex in this example. And we're going to connect this new 38 00:02:34,053 --> 00:02:39,037 vertex, which I'm just going to call S, to all of the original vertices with a 39 00:02:39,037 --> 00:02:44,038 direct arc of length zero. We are then going to compute shortest 40 00:02:44,038 --> 00:02:49,148 paths from this new artificial source vertex S to all of the vertices from the 41 00:02:49,148 --> 00:02:52,641 original graph. Notice that by construction that we're 42 00:02:52,641 --> 00:02:57,622 going to get the finite shortest path distance from S to all of the vertices 43 00:02:57,622 --> 00:03:01,180 we've installed a direct path from S to everybody else. 44 00:03:02,360 --> 00:03:07,320 Notice that because this new vertex S has no edges going into it, it's effectively 45 00:03:07,320 --> 00:03:12,038 invisible from the perspective of all of the original vertices in the graph G. 46 00:03:12,038 --> 00:03:16,575 So in particular, the shortest path distance between any pair of vertices U 47 00:03:16,575 --> 00:03:21,232 and V in the original graph is unchanged by this addition of the vertex S. 48 00:03:21,232 --> 00:03:25,527 Similarly, whether or not G has a negative cycle, it's unaffected by the 49 00:03:25,527 --> 00:03:30,750 addition of the vertex S. The next step of Johnson's Algorithm is 50 00:03:30,750 --> 00:03:36,267 to invoke a single source shortest path algorithm using this newly added vertex S 51 00:03:36,267 --> 00:03:40,097 as our source vertex. Now in this example, and in general, 52 00:03:40,097 --> 00:03:42,608 we're thinking about the case of negative edge lengths. 53 00:03:42,608 --> 00:03:46,215 So we're not going to be able to use Dijkstra's algorithm to solve this single 54 00:03:46,215 --> 00:03:49,365 source version of the problem. We're going to have to use the Bellman 55 00:03:49,365 --> 00:03:54,641 Ford algorithm to do this computation. So let's now go ahead and figure out what 56 00:03:54,641 --> 00:03:59,720 are the shortest path distances from S to the other six vertices in this graph on 57 00:03:59,720 --> 00:04:03,538 the slide. So let's start with a vertex A. 58 00:04:03,538 --> 00:04:06,444 What's the shortest path distance from S to A? 59 00:04:06,444 --> 00:04:11,371 Well, it's certainly no worse than zero. There's a path directly from S to A of 60 00:04:11,371 --> 00:04:14,466 length zero. And indeed in general all six of the 61 00:04:14,466 --> 00:04:18,880 shortest path distances that we compute will be zero or less. 62 00:04:18,880 --> 00:04:24,771 So the question, is there a path from S to A that is negative, that's better than 63 00:04:24,771 --> 00:04:26,938 zero? Well what are the options? 64 00:04:26,938 --> 00:04:31,369 We could go from s to c at length zero but then we'd pay four to get back to a. 65 00:04:31,369 --> 00:04:33,725 So that has length four, so that's no good. 66 00:04:33,725 --> 00:04:38,269 A little bit better but still not good enough would be to zip straight from s to 67 00:04:38,269 --> 00:04:42,420 b that has length zero, from b to c, now we're up to, now we're down to - one. 68 00:04:42,420 --> 00:04:45,449 But then we have to go from c to a and so we add four. 69 00:04:45,449 --> 00:04:48,478 So we get three. So we conclude that the shortest path 70 00:04:48,478 --> 00:04:51,900 from s to a is indeed the direct one hop path of length zero. 71 00:04:53,080 --> 00:04:56,393 Most of the other vertices are more interesting however. 72 00:04:56,393 --> 00:05:00,830 Think for example about vertex B. We could of course zip straight from S to 73 00:05:00,830 --> 00:05:04,557 B along the path of link zero. But we can get shorter than that. 74 00:05:04,557 --> 00:05:09,291 If we go first from S to A at link zero, and then from A to B, then that path has 75 00:05:09,291 --> 00:05:13,077 combined length minus two. And that is in fact the shortest path 76 00:05:13,077 --> 00:05:17,948 distance from S to B. The shortest path distance to C is just 77 00:05:17,948 --> 00:05:23,677 to take that same shortest path to B and then concatenate the edge of cost minus 78 00:05:23,677 --> 00:05:27,496 one form B to C. That is the shortest path from S to T 79 00:05:27,496 --> 00:05:33,013 goes S to A to B to C for a combined length of zero plus minus two plus minus 80 00:05:33,013 --> 00:05:37,644 one, minus three in all. So now let's move to the bottom of the 81 00:05:37,644 --> 00:05:40,263 graph, the vertex Z is pretty easy to think about. 82 00:05:40,263 --> 00:05:44,326 There's only one path in the entire graph, that's the direct one from S to Z, 83 00:05:44,326 --> 00:05:47,640 so Z's just going to have trivial shortest path distance of zero. 84 00:05:49,000 --> 00:05:52,567 So now, for the vertex, y, you got a bunch of different options. 85 00:05:52,567 --> 00:05:56,954 You could, of course, go straight from s to y with zero, but there's a lot of 86 00:05:56,954 --> 00:06:00,522 things better than that. You can also go from s to z, and then, 87 00:06:00,522 --> 00:06:05,318 along the minus four arc to y, that would give you a path of length minus four, but 88 00:06:05,318 --> 00:06:09,822 you can do even better than that by going first to a, then to b, then to c, and 89 00:06:09,822 --> 00:06:12,864 then to y. So that gives you a path whose edge costs 90 00:06:12,864 --> 00:06:15,496 are zero, minus two, minus one, and minus three. 91 00:06:15,496 --> 00:06:20,000 That gives you a combined total of minus six, the shortest path distance to y. 92 00:06:21,340 --> 00:06:25,874 Generally the shortest path to get to X, the best thing to do is to accumulate of 93 00:06:25,874 --> 00:06:29,626 all of the negative weight at the top of the graph, go via vertex C. 94 00:06:29,626 --> 00:06:33,097 And it's true you have to pay the cost two to get from C to X. 95 00:06:33,097 --> 00:06:37,576 You still wind up with an end length of minus one, which out performs the direct 96 00:06:37,576 --> 00:06:42,922 zero length path from X to S. Now, the brilliant insight in Johnson's 97 00:06:42,922 --> 00:06:47,349 Algorithm is that this shortest path computation is extremely useful. 98 00:06:47,349 --> 00:06:52,353 In fact, these computed shortest path distances are exactly the magical vertex 99 00:06:52,353 --> 00:06:56,973 weight, weights that we're seeking. They're going to transform all of the 100 00:06:56,973 --> 00:07:03,193 edge links from general to non-negative. So let's define the weight P sub V of a 101 00:07:03,193 --> 00:07:09,010 vertex V from the original graph, that is one of the six vertices in this example 102 00:07:09,010 --> 00:07:14,541 that we started with, as the shortest path distance we just computed from the 103 00:07:14,541 --> 00:07:20,893 extra vertex S to that vertex V. What I want to do next is see the effect 104 00:07:20,893 --> 00:07:23,898 of reweighting using these vertex weights. 105 00:07:23,898 --> 00:07:36,387 So to do that, let me redraw the example. So let's recall the formula for the 106 00:07:36,387 --> 00:07:40,370 re-weighting technique given vertex weights like these. 107 00:07:40,370 --> 00:07:46,236 You define the new length C prime E of an edge E say from U to V as its original 108 00:07:46,236 --> 00:07:51,885 length CE plus the length of its tail P sub U minus the weight of it's head, P 109 00:07:51,885 --> 00:07:56,412 sub V. So let's just do this computation for 110 00:07:56,412 --> 00:07:59,198 each of the seven edges. Let's start at the top. 111 00:07:59,198 --> 00:08:02,280 Edge A comma B. So we start with its original length: 112 00:08:02,280 --> 00:08:05,185 minus two. We add the weight of the tail, so we're 113 00:08:05,185 --> 00:08:08,208 adding zero. And then we subtract the weight of the 114 00:08:08,208 --> 00:08:11,824 head, so we're subtracting minus two. That is, we're adding two. 115 00:08:11,824 --> 00:08:16,625 So we get minus two plus two or zero for the new length C prime B for the edge A 116 00:08:16,625 --> 00:08:21,494 comma B. Similarly for the edge B, C, we take the 117 00:08:21,494 --> 00:08:27,362 original length -one, we add to it -two, and then we subtract -three but as we add 118 00:08:27,362 --> 00:08:33,960 three, then again it all cancels out, we get zero for the new cost of arc B, C. 119 00:08:33,960 --> 00:08:39,648 For the arc C comma A we take the original length four, we add minus three, 120 00:08:39,648 --> 00:08:43,722 we subtract zero. So the arc C comma A has a strictly 121 00:08:43,722 --> 00:08:51,118 positive shifted length that's now one. If we look at the arc C, X we take the 122 00:08:51,118 --> 00:08:55,572 original link two. We add -three and we subtract -one. 123 00:08:55,572 --> 00:09:01,680 So that again all cancels out and C, X has a new cost of zero. 124 00:09:01,680 --> 00:09:06,963 Same thing happens with the arc C, Y. We start with minus three, we add another 125 00:09:06,963 --> 00:09:11,520 minus three, we subtract minus six, that gives us zero. 126 00:09:11,520 --> 00:09:17,448 For the arc Z comma Y, we start with one, we add zero, we subtract minus one so we 127 00:09:17,448 --> 00:09:20,634 get a new cost of two on the arc Z comma X. 128 00:09:20,634 --> 00:09:26,489 Finally for the arc Z comma Y, we start with minus four, we add zero, we subtract 129 00:09:26,489 --> 00:09:31,380 minus six, that is, we add six, so that gives us a new length of two. 130 00:09:32,540 --> 00:09:37,671 So I don't expect you to have intuition or semantics for the computations that we 131 00:09:37,671 --> 00:09:41,802 just did; but, at least in this example the proof is in the pudding. 132 00:09:41,802 --> 00:09:46,246 We just used these shortest path distances as weights and re-weighting 133 00:09:46,246 --> 00:09:51,002 magically made all seven edges have non-negative edge lengths, they all have 134 00:09:51,002 --> 00:09:56,483 length either zero, one or two. So what we've now done, at lest in this 135 00:09:56,483 --> 00:10:00,156 example is realize the best case scenario we were dreaming about. 136 00:10:00,156 --> 00:10:04,450 Remember what the key point of the rewaiting technique video was, we pointed 137 00:10:04,450 --> 00:10:07,049 out that reweighting preserves shortest paths. 138 00:10:07,049 --> 00:10:11,513 If you have some vertex waits, some P sub Vs, you change all the edge lengths by 139 00:10:11,513 --> 00:10:14,508 reweighting. For every origin S and every destination 140 00:10:14,508 --> 00:10:18,350 T you shift the length of every S, T path by exactly the same amount. 141 00:10:18,350 --> 00:10:22,927 By P sub S minus P sub T, the difference between the vertex weights at the origin 142 00:10:22,927 --> 00:10:26,317 and the destination. So by changing all paths by exactly the 143 00:10:26,317 --> 00:10:29,600 same. Same amount you preserve which path is 144 00:10:29,600 --> 00:10:33,084 the shortest. So that's a cute party trick. 145 00:10:33,084 --> 00:10:36,354 It's not really clear or useful. But we're hoping that maybe by 146 00:10:36,354 --> 00:10:39,778 reweighting, in the shortest path preserving way, could allow us to 147 00:10:39,778 --> 00:10:43,878 transform the general edge links version of a shortest path problem, to the non 148 00:10:43,878 --> 00:10:46,109 negative edge links version of the problem. 149 00:10:46,109 --> 00:10:50,157 So that we're not stuck with the slower Bellman Ford algorithm, and instead we 150 00:10:50,157 --> 00:10:53,680 get to use the faster Dice Dijkstra's algorithm. 151 00:10:53,680 --> 00:10:57,734 And that's now exactly what we can get away with here, at least in this example. 152 00:10:57,734 --> 00:11:01,481 We did the transformation, the new graph has only non negative edge links. 153 00:11:01,481 --> 00:11:05,638 Now we can just run Dijkstra's algorithm once for each choice of the source vertex 154 00:11:05,638 --> 00:11:08,000 to compute all of the shortest path distances.