We're now well positioned to state Johnson's algorithm in its full generality. The input, as usual, is a directed graph G. Each edge has an edge length c sub e, which could be positive or negative. The input graph G may or may not have a negative-cost cycle. As usual, when we talk about solving a shortest-path problem, we mean that if there is no negative-cost cycle, then we have no excuse: we'd better correctly compute all of the desired shortest paths, in this case between all pairs of vertices. If there is a negative-cost cycle, we're off the hook. We don't have to compute shortest paths, but we do need to correctly report that the graph does indeed have a negative-cost cycle.

Step one is the same hack we used in the example. We want to make sure there's a vertex from which we can reach everybody else, so let's just add a new one that by construction has that property. We add one new vertex, little s, and n new edges, one from s to each of the original vertices, and each of those new n edges has length zero. We'll call this new, bigger graph G prime. The second step, just like in the example, is to run a shortest-path computation using this new vertex s as the source. Since the graph G in general has negative edge costs, we have to resort to the Bellman-Ford algorithm to execute the shortest-path computation. Recall that the Bellman-Ford algorithm will do one of two things: either it will correctly compute the shortest-path distances we're after, from the source vertex s to everybody else, or it will correctly report that the graph it was fed, namely G prime, contains a negative-cost cycle. Now, if G prime contains a negative-cost cycle, that cycle has to be in the original graph G. The new vertex s that we added has no incoming arcs at all, so it certainly can't be on any cycle, so any negative cycle is the responsibility of the original graph G.
Therefore, if Bellman-Ford finds us a negative-cost cycle, we're free to just halt and correctly report that same result. So from here on out, we can safely assume that G and G prime have no negative-cost cycle, and therefore that the Bellman-Ford algorithm did indeed correctly compute shortest-path distances from s to everybody else. As in the example, we're going to use these shortest-path distances, all of which are finite by construction, as our vertex weights, our p sub v's. We then form the new edge lengths, the c primes, in the usual way: c prime of an edge e going from u to v is the original length c sub e, plus the weight of the tail, p sub u, minus the weight of the head, p sub v. In the example, we saw that the new edge lengths c prime are all non-negative. We will shortly prove that that property holds in general: if you define your vertex weights as the shortest-path distances from the new vertex s to everybody else in this augmented graph G prime, it is always the case that your new edge lengths will be non-negative.

Assuming that for now, it makes sense to use Dijkstra's algorithm n times, once for each choice of source vertex, to compute all-pairs shortest paths in the graph G with the new edge lengths c prime. In a particular iteration of this for loop, say when we're using the vertex u as the source vertex, we're going to be computing n of the shortest-path distances that we care about, from u to each of the n possible destinations. Let's call those shortest-path distances computed by Dijkstra in this iteration d prime of u, v. So perhaps you're thinking that at this point we're done, right? We transformed the edge lengths using reweighting to make them non-negative, and, as we know, once things are non-negative we can run Dijkstra and we're good to go. But there's one final piece of bookkeeping that we have to take care of.
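Steps three and four, in the same illustrative style as before (same dict-of-adjacency-lists representation; `reweight` and `dijkstra` are names chosen for this sketch):

```python
import heapq

def reweight(graph, p):
    """Step 3: c'(u,v) = c(u,v) + p[u] - p[v], where p holds the
    Bellman-Ford distances from s. All results are non-negative."""
    return {u: [(v, c + p[u] - p[v]) for v, c in edges]
            for u, edges in graph.items()}

def dijkstra(graph, source):
    """Step 4 (one iteration): textbook heap-based Dijkstra; assumes
    the edge lengths are non-negative."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale heap entry, already settled
        for v, length in graph[u]:
            if d + length < dist.get(v, float('inf')):
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist
```

Running `dijkstra` once per vertex of G on the reweighted graph gives the d prime of u, v values referred to above.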
So, the shortest-path distances that we compute in step four, these d primes, are the shortest-path distances with respect to the modified costs, the c primes. What we're responsible for outputting are the shortest-path distances with respect to the original lengths, the c's. Fortunately, there's a very easy way to extract the true shortest-path distances from the d primes, the shortest-path distances relative to the modified costs. We know that the d primes are wrong; they're computed with respect to the wrong costs. But we know exactly what they're off by: the d primes are off by exactly p sub u minus p sub v. So to go from the d primes back to the shortest-path distances that we really care about, we just need to subtract that term back off, and that's step five. That completes the description of Johnson's algorithm, which, in effect, is a reduction from the all-pairs shortest-path problem with general edge lengths to n plus one instances of the single-source version of the problem, only one of which has general edge lengths, and n of which have only non-negative edge lengths.

Analyzing the running time of Johnson's algorithm is straightforward; let's just look at the work done in each of the five steps. In step one, we just add one new vertex and n new edges, so that takes O(n) time to accomplish. In step two, we run the Bellman-Ford algorithm, and we know that takes O(mn) time. In step three, we have to compute the modified costs, but given the shortest-path distances computed by Bellman-Ford, that's constant time per edge, or O(m) time overall. In step four, we run Dijkstra's algorithm n times, and the running time of one invocation is O(m log n). In step five, we do constant work for each choice of u and v, so O(n squared) work overall. So you can see that step four dominates, and the running time is equivalent to n invocations of Dijkstra's algorithm: O(nm log n).
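Step five is a one-liner per vertex pair. Continuing the same sketch (the name `correct_distances` and the dict-of-pairs representation for d prime are illustrative):

```python
def correct_distances(d_prime, p):
    """Step 5: every u-v path's reweighted length equals its true
    length plus p[u] - p[v], so subtract that shift back off.
    d_prime maps (u, v) pairs to reweighted shortest-path distances;
    p holds the vertex weights from Bellman-Ford."""
    return {(u, v): d - p[u] + p[v] for (u, v), d in d_prime.items()}
```

For instance, with vertex weights p = {'a': 0, 'b': -2} and a reweighted distance d'(a, b) = 0, the true distance comes out as 0 - 0 + (-2) = -2.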
As usual when discussing graph algorithms, I'm thinking about the graph as being at least weakly connected; that is, I'm thinking of the number of edges m as being big omega of n. So why is this running time so cool? Well, two reasons. The first reason is that, for sparse graphs, this solution blows away the previous algorithms we had that could handle negative edge lengths. The two previous solutions we knew that could handle graphs with negative edge lengths were to run Bellman-Ford n times or to use Floyd-Warshall, and on sparse graphs, where m is big O of n, both of those solutions run in cubic time, O(n cubed). This solution on sparse graphs, when m is O(n), is O(n squared) times a log n factor, so it's way better than either running Bellman-Ford n times or using Floyd-Warshall.

The second reason this running time is so cool is that it matches the running time we were getting already in the special case where edge lengths were non-negative. This is very different from how we were conditioned to think about the world when we talked about single-source shortest-path problems. Remember, for those problems, we have Dijkstra's algorithm, which only handles non-negative edge lengths but runs in near-linear time, and there's no known near-linear algorithm for single-source shortest paths that can handle negative edge lengths; the Bellman-Ford algorithm is certainly not near-linear, it's big O of m times n. Johnson's algorithm shows that while negative edge lengths are indeed an issue, they can be handled in one shot, using just one invocation of Bellman-Ford. While the Bellman-Ford algorithm is indeed quite a bit slower than Dijkstra's algorithm, it's only roughly a factor of n slower. So one invocation of Bellman-Ford doesn't change the overall running time when we already have to do n invocations of Dijkstra's algorithm, even in the special case of the problem where all of the edge lengths are non-negative.
So that is why, for all-pairs shortest paths, we see a convergence in the performance of algorithms that solve the problem in general and algorithms that only solve the problem in the special case of non-negative edge lengths. The missing piece of the correctness argument is understanding why, in general, the modified edge lengths, the c prime e's, are guaranteed to be non-negative. We saw that they were non-negative in the specific example, but we have not yet proved it in general; we'll do that on the next slide. Assuming for the moment that that is true, that the modified edge lengths are indeed non-negative, and therefore that when we invoke Dijkstra it will correctly compute shortest paths, we're pretty much done. In particular, recall that we had a quiz in the previous video where we analyzed the ramifications of reweighting. We saw that if you reweight using particular vertex weights, some p sub v's, then, for a given choice of an origin u and a given choice of a destination v, every single u-v path has its length changed by exactly the same amount, a common amount: namely, the difference between u's vertex weight p sub u and v's vertex weight p sub v. So that means that when we invoke Dijkstra's algorithm in step four, it does indeed get the shortest paths themselves correct. The shortest-path distances are incorrect, but we know exactly how much they're off by: they're off by p sub u minus p sub v, and that is corrected for in step five. So that's why, assuming the correctness of Dijkstra's algorithm, which in turn relies on the non-negativity of the modified edge lengths, the end result of Johnson's algorithm is indeed the correct shortest-path distances. To finish the analysis of Johnson's algorithm, all that remains is to prove the following claim: for every single edge of the graph, its length after reweighting is non-negative. The proof, in fact, is not hard; it follows quite easily from properties of shortest-path distances.
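The quiz fact invoked here is just a telescoping sum. For any u-v path P = (u = v_0, v_1, ..., v_k = v), the modified length is

```latex
c'(P) \;=\; \sum_{i=0}^{k-1}\bigl(c_{v_i v_{i+1}} + p_{v_i} - p_{v_{i+1}}\bigr)
      \;=\; c(P) + p_{v_0} - p_{v_k}
      \;=\; c(P) + p_u - p_v,
```

since every interior vertex weight appears once with a plus sign and once with a minus sign and cancels. The shift p_u minus p_v depends only on the endpoints, not on the path, which is exactly why reweighting preserves which u-v path is shortest.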
So fix your favorite edge e, let's say going from u to v. The vertex weights are, by construction, just shortest-path distances from the vertex little s, where s is the extra source vertex that we added to the original input graph G. By the way we constructed the graph G prime, there is at least one path from this auxiliary source vertex s to every other vertex. So there is some shortest path from s to u; let's call it capital P. By definition, the length of capital P is little p sub u. It's a simple matter to obtain from this path capital P, which goes from s to u, a path which goes from s all the way to v: namely, just concatenate the edge (u, v) as a final hop. The length of this path from s to v is of course just the length incurred going from s to u, which is p sub u, plus the length of the final hop, which is the length of this edge, c sub uv. Now, this is just some old path from s to v. The shortest path from s to v can, of course, only be shorter, and p sub v, the vertex weight of v, is by definition the length of the shortest path from s to v. And now we're good to go, because the modified length of this edge e, that is, c prime sub e, is just the difference between the right-hand side of this inequality and the left-hand side. So that's why the difference is guaranteed to be non-negative. Since this was an arbitrary edge, it holds for all the edges. That completes the proof and the analysis of Johnson's algorithm.
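Written out, the inequality in this argument is just the triangle inequality for shortest-path distances from s, and the non-negativity claim falls right out of it:

```latex
p_v \;\le\; p_u + c_{uv}
\quad\Longrightarrow\quad
c'_{uv} \;=\; c_{uv} + p_u - p_v \;\ge\; 0.
```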