1 00:00:00,550 --> 00:00:02,300 Let's now proceed to the final algorithm for 2 00:00:02,300 --> 00:00:05,300 the all-pairs shortest path problem that we're going to discuss. 3 00:00:05,300 --> 00:00:08,510 It's called Johnson's algorithm, and the algorithm shows something pretty amazing. 4 00:00:08,510 --> 00:00:12,280 It shows that even in graphs with negative edge lengths, the all-pairs shortest path 5 00:00:12,280 --> 00:00:17,480 problem can effectively be reduced to just N indications of Dykstra's algorithm. 6 00:00:17,480 --> 00:00:21,290 Even though Dykstra's algorithm needs non-negative edge links. 7 00:00:21,290 --> 00:00:24,960 So, this video will discuss the key idea behind Johnson's algorithm, 8 00:00:24,960 --> 00:00:26,940 will then proceed to the algorithm itself. 9 00:00:26,940 --> 00:00:31,430 And this idea is a way of using vertex weights to re-weight shortest path links. 10 00:00:34,817 --> 00:00:38,799 When we began our discussion on the all pairs shortest path problem we observed 11 00:00:38,799 --> 00:00:42,480 that the problem reduces to end indications of a suitable sub routine for 12 00:00:42,480 --> 00:00:46,643 the single source version of the problem simply by iterating over all N choices for 13 00:00:46,643 --> 00:00:47,690 the source vertex. 14 00:00:50,100 --> 00:00:53,080 The running time you get from this reduction is evidently 15 00:00:53,080 --> 00:00:56,840 N times the running time of the single source sub routine. 16 00:00:56,840 --> 00:01:00,230 And the running time of the sub routine is going to depend on whether your graph has 17 00:01:00,230 --> 00:01:01,732 non negative edge lengths or not. 18 00:01:01,732 --> 00:01:04,950 If you're in the lucky case where the graph has non-negative edge lengths 19 00:01:04,950 --> 00:01:09,620 you get to use Dykstra's algorithm which runs blazingly fast almost linear time 20 00:01:09,620 --> 00:01:11,800 M the number of edges times log N. 21 00:01:11,800 --> 00:01:13,900 If you're dealing with a graph with the general edge lengths, 22 00:01:13,900 --> 00:01:15,310 then the running time was not so good. 23 00:01:15,310 --> 00:01:19,230 It would be N times the Bellming Ford running time which itself is N M. 24 00:01:21,970 --> 00:01:25,500 At the end of the day you get a running time of N M log N for 25 00:01:25,500 --> 00:01:29,760 the non-negative edges case, or N squared m for the general edges case. 26 00:01:32,254 --> 00:01:35,567 And now that we know about the [INAUDIBLE] algorithm which runs in big O of 27 00:01:35,567 --> 00:01:39,220 N cubes time, there aren't a whole lot of reasons to care about this solution 28 00:01:39,220 --> 00:01:42,255 in [INAUDIBLE] which runs Bellman-Ford end times except perhaps 29 00:01:42,255 --> 00:01:45,200 the fundamentally distributed nature of that computation. 30 00:01:47,335 --> 00:01:51,891 Johnson's algorithm remarkably shows that the same running time we get here for 31 00:01:51,891 --> 00:01:56,648 the non-negative edge limit case, N times N times log N, we can get equally well for 32 00:01:56,648 --> 00:01:58,930 graphs that have general edge lengths. 33 00:02:01,267 --> 00:02:06,755 Specifically, Johnson's algorithm shows that the alt pair shortest 34 00:02:06,755 --> 00:02:12,517 path problem in graphs with general edge lengths reduces to one indication 35 00:02:12,517 --> 00:02:19,020 of the Bellman-Ford algorithm followed by an indication of Dexter's algorithm. 36 00:02:19,020 --> 00:02:22,440 As we know, the Bellman-Ford algorithm runs in M times N time. 37 00:02:22,440 --> 00:02:28,320 Any indications to Dexter's Algorithm is going to be N times M times Log N. 38 00:02:28,320 --> 00:02:31,510 Putting it together we see that the N indications of Dexter Dominate 39 00:02:31,510 --> 00:02:32,590 just barely. 40 00:02:32,590 --> 00:02:36,580 So the overall running time we'll achieve is big O of N M log in. 41 00:02:36,580 --> 00:02:38,810 Exactly the same running time we were getting for 42 00:02:38,810 --> 00:02:40,330 the non-negative edge length of case. 43 00:02:42,710 --> 00:02:45,080 So this should probably sound a little too good to be true. 44 00:02:45,080 --> 00:02:49,550 How on Earth can we reduce the shortest path problem with general edge lengths to 45 00:02:49,550 --> 00:02:52,530 the special case where all the edge lengths are non-negative. 46 00:02:52,530 --> 00:02:55,780 Indeed this is a trick that we actually thought some about in part one for 47 00:02:55,780 --> 00:02:58,360 those of you who are alumni of that course. 48 00:02:58,360 --> 00:03:00,780 Suppose you have the single short sorted path routes and 49 00:03:00,780 --> 00:03:02,820 you've got a graph with some negative entries. 50 00:03:02,820 --> 00:03:06,920 A very natural instinct is to say, well come on, that can't be that big a deal. 51 00:03:06,920 --> 00:03:09,690 Let's just shift the edge a bit so they become non-negative. 52 00:03:09,690 --> 00:03:11,250 And then keep your passed. 53 00:03:11,250 --> 00:03:14,440 So if you have some graphs with the most edge link as minus five, 54 00:03:14,440 --> 00:03:17,970 why not just add five to the link of every single edge. 55 00:03:17,970 --> 00:03:20,470 Now you have a new graph, non-negative edge lengths, and 56 00:03:20,470 --> 00:03:21,760 you just run Dijkstra's algorithm. 57 00:03:21,760 --> 00:03:22,260 Big deal. 58 00:03:24,682 --> 00:03:28,350 So this quiz asked you to think about this edge shifting trick in general. 59 00:03:28,350 --> 00:03:29,350 Suppose you have a graph and 60 00:03:29,350 --> 00:03:32,440 there's a particular source S in destination T you're looking at. 61 00:03:32,440 --> 00:03:34,800 And suppose you add some constant number, 62 00:03:34,800 --> 00:03:38,210 capital M, to the length of every single edge of the graph. 63 00:03:38,210 --> 00:03:44,040 Under what conditions will that preserve the shortest path between S and T. 64 00:03:55,784 --> 00:04:00,058 All right, so the correct answer is C Of the listed conditions, 65 00:04:00,058 --> 00:04:04,889 the only one that guarantees that adding a constant M to every edge length 66 00:04:04,889 --> 00:04:09,640 preserves the shortest path between a pair of vertices S and T is that all 67 00:04:09,640 --> 00:04:14,790 of the paths between that pair of vertices have the same number of edges. 68 00:04:14,790 --> 00:04:17,220 Maybe the easiest way to see this is just with a simple example. 69 00:04:19,110 --> 00:04:22,060 So consider this simple pic network where there's an S and a T and 70 00:04:22,060 --> 00:04:24,040 there's two paths between them. 71 00:04:24,040 --> 00:04:27,930 One with two hops each of cost one and one with one hop of cost three. 72 00:04:27,930 --> 00:04:30,620 Now evidently in this graph the shortest 73 00:04:30,620 --> 00:04:32,190 one is the one that goes threw the vertex V. 74 00:04:32,190 --> 00:04:34,960 That one has length two the other one has length three. 75 00:04:34,960 --> 00:04:37,170 Now suppose we add a fixed constant, 76 00:04:37,170 --> 00:04:41,030 let's say the constant two to every one of these edges. 77 00:04:43,584 --> 00:04:46,810 In that case the two edges on top their lengths increase to three. 78 00:04:46,810 --> 00:04:47,570 The edge on the bottom, 79 00:04:47,570 --> 00:04:51,390 its length increases to five and you'll see that after we've shifted 80 00:04:51,390 --> 00:04:55,350 all the edge lengths we've actually changed what the shortest path is. 81 00:04:55,350 --> 00:04:57,600 It used to be the two hop path on top. 82 00:04:57,600 --> 00:05:01,400 Now it's the one hop path on bottom that has length five. 83 00:05:01,400 --> 00:05:04,110 The two hop path on top now has length six. 84 00:05:05,500 --> 00:05:08,965 So this shows that the answers A, B and D are all incorrect. 85 00:05:08,965 --> 00:05:10,095 Why is C correct. 86 00:05:10,095 --> 00:05:13,705 Well suppose it was the case that between the vertex S and 87 00:05:13,705 --> 00:05:17,460 T every single path in the graph had exactly the number of edges. 88 00:05:17,460 --> 00:05:18,925 Say every path had 12 edges. 89 00:05:18,925 --> 00:05:25,640 Well then when you add a constant to every edges length, every path between S and T. 90 00:05:25,640 --> 00:05:28,270 Their length goes up by exactly the same amount. 91 00:05:28,270 --> 00:05:31,950 It goes up by 12 times M if each of the paths have 12 edges in it. 92 00:05:31,950 --> 00:05:36,800 So if every single path between S and T goes up by exactly the same amount, 93 00:05:36,800 --> 00:05:40,040 well then the shortest path remains exactly the same. 94 00:05:40,040 --> 00:05:43,700 The ranking of the lengths of the various path that's exactly the same because we've 95 00:05:43,700 --> 00:05:47,700 added exactly the same number to all of them. 96 00:05:47,700 --> 00:05:50,920 So the broader point here is that if we want to find a transformation of 97 00:05:50,920 --> 00:05:54,900 the lengths of a graph that reduces the general edge length case to 98 00:05:54,900 --> 00:05:56,770 the non-negative edge length case, 99 00:05:56,770 --> 00:06:01,460 we need to aspire toward transformations that preserve what shortest paths are. 100 00:06:01,460 --> 00:06:03,430 And this transformation is not it. 101 00:06:06,122 --> 00:06:10,306 There's no reason for you to expect that there'd be any useful transformations 102 00:06:10,306 --> 00:06:14,737 of this form, any shortest path preserving transformations that are nontrivial but 103 00:06:14,737 --> 00:06:16,720 such transformations do exist. 104 00:06:16,720 --> 00:06:19,850 That's the re-weighting technique, which we'll start exploring in the next quiz. 105 00:06:21,990 --> 00:06:25,450 So in this quiz we're getting you to think about one of our usual directed graphs. 106 00:06:25,450 --> 00:06:28,630 Every edge has a length and the lengths can be arbitrary real numbers. 107 00:06:31,400 --> 00:06:32,530 Here is the twist. 108 00:06:32,530 --> 00:06:36,210 For each vertex v there's going to be a number p 109 00:06:36,210 --> 00:06:38,980 sub v associated with that vertex. 110 00:06:38,980 --> 00:06:42,580 Don't worry about where these P sub V's come from we'll discuss that later. 111 00:06:42,580 --> 00:06:44,100 Just think of it as any old number. 112 00:06:44,100 --> 00:06:46,010 Could be seven, could be minus five. 113 00:06:46,010 --> 00:06:47,600 Whatever, just one number per vertex. 114 00:06:50,980 --> 00:06:55,060 The key idea behind the reweighting technique is to use these end numbers 115 00:06:55,060 --> 00:06:57,630 one weight per vertex, P sub V. 116 00:06:57,630 --> 00:07:02,820 To use these end numbers to shift the edge lengths of the graph. 117 00:07:02,820 --> 00:07:07,860 I'm going to use C prime to denote that these shifted or transformed edge lengths. 118 00:07:07,860 --> 00:07:09,480 Here's the exact definition. 119 00:07:09,480 --> 00:07:13,080 Consider an arbitrary edge E of this graph G. 120 00:07:13,080 --> 00:07:17,410 Let's say the edge goes from the tail U to the head V. 121 00:07:17,410 --> 00:07:22,330 The new length, C prime of E, is defined as the original length CE. 122 00:07:22,330 --> 00:07:27,380 Plus the weights of these edges tail so plus P sub U, 123 00:07:27,380 --> 00:07:31,760 minus the weight associated with these edges head that is minus P sub V. 124 00:07:34,334 --> 00:07:38,650 So, for example, consider an edge from U to V, that original length is two. 125 00:07:38,650 --> 00:07:43,122 And let's say the weights associated with its tail and head are minus four and 126 00:07:43,122 --> 00:07:43,700 minus three. 127 00:07:43,700 --> 00:07:47,600 Remember, these vertex weights can be arbitrary numbers, positive or negative. 128 00:07:47,600 --> 00:07:50,150 Well then, the shifted weight, C prime, is going to be, 129 00:07:50,150 --> 00:07:54,600 by definition, two plus minus four. 130 00:07:54,600 --> 00:07:56,165 Minus, minus three. 131 00:07:56,165 --> 00:07:59,130 This of course is just equal to one. 132 00:08:01,950 --> 00:08:02,810 So that's the setup. 133 00:08:02,810 --> 00:08:04,400 Here's the quiz question. 134 00:08:04,400 --> 00:08:07,010 I want you to think about some fixed path, capital P. 135 00:08:07,010 --> 00:08:10,470 Let's say it starts at a vertex S and ends at a vertex T. 136 00:08:10,470 --> 00:08:12,940 Now capital P may or may not be the shortest path, I don't care. 137 00:08:12,940 --> 00:08:15,990 It's just any old path starting at S, ending at T. 138 00:08:15,990 --> 00:08:20,250 So suppose in the original network with the original edge length C sub E 139 00:08:20,250 --> 00:08:24,110 the total length of this path P was capital L. 140 00:08:24,110 --> 00:08:29,730 What is its new length L prime under these new edge lengths C prime. 141 00:08:40,231 --> 00:08:41,620 So the correct answer is C. 142 00:08:44,820 --> 00:08:47,400 Under the new edge links C prime, 143 00:08:47,400 --> 00:08:52,240 the path key's link does not in general remain the same but 144 00:08:52,240 --> 00:08:57,230 it shifts by a predictable amount namely difference between the node 145 00:08:57,230 --> 00:09:02,260 weight associated with S the origin of the path and T the destination of the path. 146 00:09:04,173 --> 00:09:07,220 This follows by a simple calculation. 147 00:09:07,220 --> 00:09:11,600 To the new length of the path P is of course just the sum 148 00:09:11,600 --> 00:09:13,400 over the modified edge lengths. 149 00:09:13,400 --> 00:09:15,710 The sum over the c primes of the edges in P. 150 00:09:17,760 --> 00:09:21,160 So now we just expand each term in terms of its definitions so remember 151 00:09:21,160 --> 00:09:26,060 the new edge length C prime of E is just the original length CE plus the weight 152 00:09:26,060 --> 00:09:31,290 associated with the edges tail, U minus the weight associated with its head, V. 153 00:09:33,460 --> 00:09:36,532 Now consider a vertex other than S or T on this path, 154 00:09:36,532 --> 00:09:39,070 some internal vertex on the path. 155 00:09:39,070 --> 00:09:42,400 So such a vertex is going to be the tail of exactly one edge of P. 156 00:09:42,400 --> 00:09:44,980 And it's going to be the head of exactly one edge of P. 157 00:09:44,980 --> 00:09:48,730 So its vertex weight is going to show up with a coefficient of plus one once. 158 00:09:48,730 --> 00:09:51,404 And that'll show up with the coefficient of minus one, one S, 159 00:09:51,404 --> 00:09:53,130 obviously those two terms will cancel. 160 00:09:54,510 --> 00:09:58,260 So the only vertex weights that matter when we evaluate the sum is the weight of 161 00:09:58,260 --> 00:10:01,190 the origin S and the weight of the destination, T. 162 00:10:01,190 --> 00:10:04,180 The S only shows up once and it shows up as a tail so 163 00:10:04,180 --> 00:10:07,170 it's vertex weight shows up with a plus one contribution. 164 00:10:07,170 --> 00:10:09,390 So that's where the Ps term comes from. 165 00:10:09,390 --> 00:10:13,060 Symmetrically the destination T only shows up once with a minus one coefficient. 166 00:10:13,060 --> 00:10:14,400 so that's where the minus T comes from. 167 00:10:16,650 --> 00:10:17,750 So when the dust settles, 168 00:10:17,750 --> 00:10:22,010 all that we're left with is the sum of the original edge links that we're using 169 00:10:22,010 --> 00:10:27,070 the notation capital L for, plus this shift term, plus Ps of S, minus Ps of T. 170 00:10:29,981 --> 00:10:30,737 All right, so 171 00:10:30,737 --> 00:10:35,147 now that we've slogged through the algebra let me explain to you why we went through 172 00:10:35,147 --> 00:10:39,500 that exercise, what is the potential application of this reweighting technic. 173 00:10:41,790 --> 00:10:44,170 So what we've just learned about the reweighting technic. 174 00:10:44,170 --> 00:10:49,010 Well, fix a graph and fix an origin vertex S and a destination vertex T. 175 00:10:49,010 --> 00:10:52,960 What we just learned is that when you reweight using these vertex weights, 176 00:10:52,960 --> 00:10:58,450 you shift every single ST path by exactly the same amount. 177 00:10:58,450 --> 00:11:01,840 The length of every ST path, shortest or otherwise, 178 00:11:01,840 --> 00:11:06,860 changes by exactly the quantity P of S minus P [INAUDIBLE] T. 179 00:11:09,180 --> 00:11:11,260 So this is now starting to sound pretty cool, 180 00:11:11,260 --> 00:11:12,910 because let's remember our first quiz. 181 00:11:12,910 --> 00:11:15,840 In our first quiz, we explored what happens when you just 182 00:11:15,840 --> 00:11:18,650 add a fixed amount capital end to every edge of a graph. 183 00:11:18,650 --> 00:11:21,960 We noticed it doesn't work in the sense that it can change the shortest path. 184 00:11:21,960 --> 00:11:23,290 Why can it change the shortest path. 185 00:11:23,290 --> 00:11:25,690 Well, different paths have a different number of edges. 186 00:11:25,690 --> 00:11:27,470 So if you add some fixed amount to each edge, 187 00:11:27,470 --> 00:11:29,240 you change different paths by different amounts. 188 00:11:29,240 --> 00:11:31,150 That can screw up the shortest path. 189 00:11:31,150 --> 00:11:35,530 If you're lucky and all of the paths between an S and a T have exactly the same 190 00:11:35,530 --> 00:11:38,220 number of edges, then you shift them all by exactly the same amount and 191 00:11:38,220 --> 00:11:40,080 you preserve the ordering of the paths. 192 00:11:40,080 --> 00:11:41,830 But what do we see here. 193 00:11:41,830 --> 00:11:45,310 We see that under no assumptions whatsoever about what the paths 194 00:11:45,310 --> 00:11:50,120 between S and T like, Note reweighting shifts every single path 195 00:11:50,120 --> 00:11:54,880 from ST by exactly the same quantity, the difference between PS and PT. 196 00:11:57,670 --> 00:12:02,880 Since reweighting changes the length of every path by exactly the same amount, 197 00:12:02,880 --> 00:12:05,570 that means it preserves the shortest path. 198 00:12:05,570 --> 00:12:08,830 Whatever was the shortest path previous is still the shortest path 199 00:12:08,830 --> 00:12:10,993 after we've done re-weighting. 200 00:12:12,455 --> 00:12:13,330 So what. 201 00:12:13,330 --> 00:12:14,750 Why is this interesting. 202 00:12:14,750 --> 00:12:16,770 Well, let's dream big. 203 00:12:16,770 --> 00:12:21,330 Suppose we could actually come up with vertex weights that 204 00:12:21,330 --> 00:12:25,830 transformed the problem into one that we don't know how to solve blazingly fast. 205 00:12:25,830 --> 00:12:28,610 Into one that we do know how to solve blazingly fast. 206 00:12:28,610 --> 00:12:32,440 That is, what if we can find vertex weights that change an arbitrary instance, 207 00:12:32,440 --> 00:12:35,310 that in general has negative edge lengths. 208 00:12:35,310 --> 00:12:40,608 Into a shortest path problem, where all of the edge links are now non-negative. 209 00:12:42,330 --> 00:12:46,522 So if such a transformation existed, I hope that its use would be clear, 210 00:12:46,522 --> 00:12:50,000 this would transform an instance of shortest path. 211 00:12:50,000 --> 00:12:52,600 Where it seems like we're stuck with the Bellman-Ford algorithm because we start 212 00:12:52,600 --> 00:12:53,810 with negative edge lengths. 213 00:12:53,810 --> 00:12:56,300 And it transforms it into an easier one, 214 00:12:56,300 --> 00:12:59,000 where we can apply Dijkstra's shortest path algorithm. 215 00:13:00,914 --> 00:13:04,480 Now, this best case scenario should sound like a free lunch to you. 216 00:13:04,480 --> 00:13:07,030 Why do we think we can do basically a trivial amount of work. 217 00:13:07,030 --> 00:13:09,470 Just the simple shift of the edge lengths. 218 00:13:09,470 --> 00:13:13,940 And transform a seemingly harder problem, shortest path with general edge lengths, 219 00:13:13,940 --> 00:13:17,120 into an easier one, shortest paths with non-negative edge lengths. 220 00:13:17,120 --> 00:13:20,460 The catch, of course, is to figure out which vertex weights to use. 221 00:13:20,460 --> 00:13:24,350 How do you know the magical edge weights that transform the general edge lengths 222 00:13:24,350 --> 00:13:25,950 into the non-negative ones. 223 00:13:25,950 --> 00:13:27,810 Actually, more fundamentally, why do we even think. 224 00:13:27,810 --> 00:13:30,100 think that these vertex weights exist. 225 00:13:30,100 --> 00:13:32,970 No matter how you pick the vertex weights, it's impossible to 226 00:13:32,970 --> 00:13:36,630 transform all of the edge links so that they become non-negative. 227 00:13:36,630 --> 00:13:39,460 Well, it turns out, as long as the graph has no negative cycle, 228 00:13:39,460 --> 00:13:42,265 which we know as essential to computer shortest paths. 229 00:13:42,265 --> 00:13:43,605 If there's no negative cycle, 230 00:13:43,605 --> 00:13:48,435 there do always exist, magical choices of the vertex width P sub V. 231 00:13:48,435 --> 00:13:53,375 That, transforms all of the edge links to B non negative. 232 00:13:53,375 --> 00:13:55,835 Now it's not trivial to computer them, you have to do some work. 233 00:13:55,835 --> 00:13:57,915 But it's not a prohibitive amount of work. 234 00:13:57,915 --> 00:14:03,300 In fact, one indication of the Bellman Ford Algorithm, will be sufficient. 235 00:14:03,300 --> 00:14:06,917 Johnson's algorithm puts those ideas together, that'll be the next video.