So why should we be content computing shortest paths from merely one source vertex to all possible destinations? What if we want to know shortest-path distances from every vertex to every other vertex? The formal definition of the all-pairs shortest paths, or APSP, problem is as follows. You're given, as usual, a directed graph G, with a length c_e for each edge e. You can think, if you want, about the special case where all edge lengths are nonnegative, but we're also going to be interested in the case where edges can have negative lengths. In contrast to the single-source shortest path problem, there is no distinguished source vertex. The goal is to compute, for every pair of vertices u and v, the length of a shortest path starting at u and ending at v.

Just as with the single-source version of the problem, this isn't quite the full story. If the input graph G has a negative-cost cycle, then, depending on how you define shortest paths, the problem either doesn't make sense or is computationally intractable. So if there is a negative-cost cycle, we're off the hook from having to compute shortest-path distances, but we do need to correctly report, in that case, that the graph contains a negative-cost cycle. That's our excuse for not computing the correct shortest-path lengths.

So, you know what would make me really happy? If, when you saw this problem, you thought to yourself: don't we pretty much already have a rich enough toolbox to solve the all-pairs shortest path problem? If that's what you thought, that's a great thought, and the answer, in many senses, is yes. So let's explore that idea and make it precise. In the following quiz, I'm going to ask you: suppose I gave you, as a black box, a subroutine that solves the single-source shortest path problem correctly and quickly. How many times would you need to invoke that black box, that subroutine, to correctly solve the all-pairs shortest path problem?

So, the correct answer is C. You need n invocations of the single-source shortest-path subroutine, where n is the number of vertices in the input graph. Why? Well, if you designate an arbitrary vertex as the source vertex s and run the provided subroutine, it will compute for you shortest-path distances from that choice of s to all possible destinations. So that gives you n of the n squared shortest-path distances that you're responsible for, namely all the ones that have this particular vertex s as the origin. There are n different choices of the origin, so you just iterate over all those choices, invoking the subroutine once for each (see the sketch below), and boom, you've got the n squared shortest-path distances that you're responsible for.

So, should we be happy with this reduction, with this algorithm that simply runs a single-source shortest path algorithm n times, or do we expect to do better? Well, the answer is going to depend on two factors. First, does the input graph have only nonnegative edge costs, or does it more generally also have negative edge costs? Second, is the graph sparse, meaning m, the number of edges, is relatively close to n, or is it dense, meaning m is relatively close to n squared? Let's start with the case of nonnegative edge costs. The reason it matters whether or not the edge costs are all nonnegative is that it governs which single-source shortest-path subroutine we get to use.
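Before we instantiate the black box, here is the reduction itself as a minimal Python sketch. The interface, a callable sssp(s) returning a dictionary of shortest-path distances out of s, is my own assumption for illustration, not anything specified in the lecture.

```python
def apsp_via_sssp(vertices, sssp):
    """All-pairs shortest paths via n invocations of a black-box
    single-source subroutine.

    sssp(s) is assumed (hypothetically) to return a dict mapping
    each vertex v to the shortest-path distance from s to v.
    """
    # One invocation per choice of the origin s: n calls in total.
    return {s: sssp(s) for s in vertices}
```

Whatever the black box costs per call, the reduction multiplies that cost by n, which is exactly the accounting we do next.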
So in the happy case, where all edge costs are nonnegative, we get to use Dijkstra's algorithm as our workhorse. And remember, Dijkstra's algorithm is blazingly fast: our heap-based implementation of it runs in O(m log n) time. So if you run it n times, your running time is n times m times log n. In the sparse case, this is going to be n squared log n; in the dense case, n cubed log n.

For sparse graphs, this frankly is pretty awesome. You're not really going to do much better than running Dijkstra n times, once for each choice of the source vertex. The reason is that we're responsible for outputting n squared values, the shortest-path distance for each pair u, v of vertices, and here the running time is just that n squared times an extra log factor.

The situation for dense graphs, however, is much murkier. It is an open question to this day whether there are algorithms fundamentally faster than cubic time for the all-pairs shortest paths problem in dense graphs. If you wanted to try to convince somebody that maybe you can't do better than cubic time, you might argue as follows. There's a quadratic number of shortest-path distances that have to be computed, and for a given pair u, v, the shortest path might well have a linear number of edges in it. So, surely, you can't compute the shortest path between a single pair in better than linear time, and then a quadratic number of pairs is going to require cubic time. However, I want to be clear: this is not a proof, just a wishy-washy argument. Why is it not a proof? Well, for all we know, we can do some amount of work that is relevant simultaneously to lots of these shortest-path computations, so that we don't actually have to spend linear time, on average, on each one.

As a tale of inspiration, let me remind you about matrix multiplication. If you write down the definition of what it means to multiply two matrices, it just seems like an obviously cubic problem; it seems like, by definition, you have to do a cubic amount of work. Yet that intuition is totally wrong. Beginning with Strassen, and continuing with many subsequent algorithms, we now know that there are algorithms for matrix multiplication fundamentally better than the naive cubic-time algorithm. If you have a nontrivial approach to decomposing a problem, you can eliminate some of the redundant work and do better than the straightforward solution. Is a Strassen-like improvement possible for the all-pairs shortest paths problem? Nobody knows.

Now let's discuss the general case, graphs that are allowed to have negative edge lengths. In this case, we cannot use Dijkstra's algorithm as our workhorse, as our single-source shortest-path subroutine. We have to resort to the Bellman-Ford algorithm instead, because that's the only one of the two that accommodates negative-cost edges. Remember, the Bellman-Ford algorithm is slower than Dijkstra's algorithm: the running time bound that we proved was O(mn). If we run it n times, we get a running time of n squared times m. How good is a running time of n squared m? Well, if the graph is sparse, if m is theta of n, then this is cubic in n. And if the graph is dense, if m is theta of n squared, we're now seeing, for the first time ever in this course, a quartic running time.
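For the negative-length case, here is a rough Bellman-Ford sketch in Python that could play the role of the black box above; the edge-list representation and all the names are my own choices, not from the lecture. Note that this version only detects negative-cost cycles reachable from the source s.

```python
def bellman_ford(vertices, edges, s):
    """Single-source shortest paths from s, negative edge lengths allowed.

    edges is a list of (u, v, length) triples. Returns a dict of
    distances, or None if a negative-cost cycle is reachable from s.
    """
    dist = {v: float("inf") for v in vertices}
    dist[s] = 0
    # n - 1 rounds of relaxing every edge suffice when there is no
    # negative cycle reachable from s.
    for _ in range(len(vertices) - 1):
        for u, v, length in edges:
            if dist[u] + length < dist[v]:
                dist[v] = dist[u] + length
    # One extra round: any further improvement betrays a negative cycle.
    for u, v, length in edges:
        if dist[u] + length < dist[v]:
            return None
    return dist
```

Each call does O(mn) work, so invoking it once per choice of the source vertex gives exactly the O(n²m) bound we just derived.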
Now, I hope you weren't especially happy even with the cubic running time bound in the sparse-graph case, but when we're talking about a quartic running time, that just really seems exorbitant. So hopefully you're thinking there's got to be a better approach to this problem than just running Bellman-Ford n times, at least in the dense-graph case. Indeed, there is: the Floyd-Warshall algorithm, which we'll start discussing in the next video.