Consider the following graph with five vertices. I'll label their edges with their costs in blue. So, let's step through the values of I, and since there's five vertices, I's going to take on the value of zero one two three four, and let's look at what the results of each round of subproblem computations is. So in the base case when I equals zero as always the distance from s to itself is zero and from all other verticies the sub problem value is defined as plus infinity. Let me write down the recurrence again, just in case you've forgotten it. So now we enter the main four loops and we start with I equal one. So now, we just go through the vertices in arbitrary order and evaluate the recurrence. So node S is going to just inherit the solution from the previous step. It's still happy with the empty path of total length zero. Now, note v is certainly hoping not to have to inherit it's, solution of plus infinity from the previous round, when I0. equals zero. And indeed, when I1, equals 1, the subproblem solution for the vertex v is going to be two. That's'cause we can choose the last hop to be the edge s, v. That has length two, and s's subproblem value, last iteration, when I0, equals zero. By the same reasoning, x's new subproblem value is going to be four cause we can choose the last half to be s coma x and add the cost of that edge four to s's subproblem value last iteration of I equals zero. Now the nodes w and t would love to throw out their plus infinity solutions, and get something finite. And you might be thinking that because v and x now have finite distances. Those would propagate to the nodes w and t. That will indeed happen, but we're going to have to wait till the next iteration. We're going to have to wait till I2. equals 2. The reason is, if you look at the code, if you look at the recurrence, when we compute the subproblem value at a given iteration i, we only make use of the subproblem solutions from the previous iteration, I minus 1. We do not use any of the updates that have already happened in the current iteration, I. So because, when Izero. equals zero, a of zero v and a of zero x were both plus infinity, we're going to be stuck with a1w and a1t, both being plus infinity as well. So now let's move on to the next iteration of the outer four loop, when I equals two. the sub-problem solution at S is not going to change. You're not going to do better than zero, so it's going to stay that way. Similarly at the vertex V you're not going to do better than two, so that's going to stay the same at this iteration. Something interesting happens at the vertex x, however. Now, in the recurrence, you certainly have the option of inheriting the previous solution. So one option would be to say a of two, x to be equal to four, but there's actually a better choice. So the recurrence is going to come out to be even smaller, specifically if we choose that last hop to be the unit cost arc from vx. We add that unit cost to v sub-problem value of the last iteration when I equals 1. That was 2. 2 plus 1 equals 3, so that's going to be the new sub-problem value at x, in this iteration of I equals 2. So as advertised, the updates to the vertices V and X in the rate, iteration where I equals 1, now that I equals 2 get propagated on to the vertices W and T. So W and T shed their plus infinity values, and instead they pick up the values 4 and 8, respectively. Notice that I've labelled the vertex t with an 8, not with a 7. I've computed a of 2, t to be 8. And the reason is, again, the same, updates from this iteration, in particular the fact that x is dropped from 4 to 3 do not get reflected at other nodes in this same iteration. We have to wait until the next iteration of the outer fore loop before they happen. So we're using the stale information of x that when I equals 1, it's solution value was four. That's the information we're using to update T solution value so it's 4 plus 4 or 8. So with penultimate iteration when I equals 3, most things stay the same. At S and V at X and W, we actually computed the shorted paths, so they will all just inherent the solution from the previous iteration. But at vertex T, it will now make use of the improved solution value at the vertex X from the iteration where I equals 2. So it's F, its 8 gets updated to be a 7, reflecting the improvement at X, the previous iteration. And at this point, we're actually done. We've computed shortest paths to all destinations, but the algorithm doesn't know that we're done yet. So it still executes the final iteration of the outer for loop where I equals 4, but everybody just inherits their solutions from the previous round. At that point the algorithm terminates. For most of the dynamic programming algorithms that we've discussed, the running time analysis has been trivial. the Bellman-Ford algorithm is a little more interesting from a running time analysis perspective. Please think about it in this quiz. So, the correct answer is b. That is of all these running time bounds, the smallest bound, which is actually correct. So, let me explain why. Why it's the number of edges times the number of vertices and also comment on some of the other options. So answer A quadratic and squared. What this is, this is the number of sub-problems. Right? So sub-problems are indexed by a choice of I, which is somewhere between zero and n minus one, and a choice of a destination v, there's n choices for each of those exactly n squared sub-problems. If we only ever spent constant time evaluating a sub-problem, then indeed, the running time of Bellman-Ford would be big O of n squared. And in most dynamic programming algorithms we've discussed in this class, indeed you only spend, constant time solving each sub-problem. The one exception being the optimal binary search trees problem, where in general we spent linear time. Here again, like optimal binary search trees, We might spend more than constant time solving the sub-problem. The reason being we have to do brute force search through a list of candidates that might be super constant. The reason is that each in, each arc that comes in to the destination v, provides one candidate for what the correct solution to this sub-problem might be. So remember, the number of candidates, we had a quiz on this, is proportional to the n degree of the vertex v and that could be as large as n minus 1, linear, the number of vertices. So that's why the running time of the Bellman-Ford algorithm can, in general, be worse than N squared. Another interesting answer is c, that it's cubic in n. Indeed, big o of n3 cubed is a valid upper bound on the running time of the Bellman-Ford algorithm, but it's not the sharpest possible upper bound. So why is it a valid upper bound, as discussed just now? There's a quadratic n2 squared number of subproblems. How much work do you do per subproblem? Well, it's proportional to the n degree of a vertex. The biggest n degree of a vertex is n minus 1, big o of n. So, linear work for each of the quadratic number of subproblems, resulting in cubic running time. There is however a tighter, a better analysis of the Bellman-Ford algorithm. Now, why is o of mn bigger than n cubed? Well, what is m? In a sparse graph, m is going to be theta of n and in a dense graph m is going to be m squared. So with a dense graph its true big o of mn is no smaller then n cubed, but if its not a dense graph then this really is an improved upper bound. So, why does the bound hold? Well, think about all of the total amount of work done over all of the subproblems in the following way. So all we're going to do is take the amount of work done in a given iteration of the outer for loop, and multiply it by n, the number of iterations of the outer for loop. So how much work do we do in a given iteration for a given choice of I? well it's just the sum of the N degrees. When we consider the vertex V, we do our proportional to its n degree, and we consider each vertex V once in the given iteration of the outer for loop. But we know a much simpler expression for the sum of the n degrees. The sum is simply equal to M, the number of edges. In any directed graph, the number of edges is exactly equal to the sum of the N degrees. One easy way to see that is take your favorite directed graph, and imagine you plug the edges into the graph, one at a time, starting from the empty set of edges. Well, each time you plug in a new edge, obviously the number of edges in the graph goes up by one. And also the N degree of exactly one vertex goes up by one, whichever vertex happens to be the head of the edge that you just stuck in. So therefore the sum of N degrees and the sum of yet, the number of edges is always the same no matter what the directed graph is. So that's why the total work is better than N cubed. It's actually big O of M times N. So a number of optimizations of the basic Bellman-Ford algorithm are possible. Let me wrap up this video just with a quick one about stopping early. See also a separate video about a less trivial space optimization of the algorithm. The basic version of the algorithm, the outer for loop runs for N minus one iterations. Generally you don't need all of them. We already saw that in our simple example, where the final iteration did no useful work. It just inherited the solutions from the previous iteration. So, in general, suppose at some iteration, earlier than the last one, so say with the current index value of j, it just so happens that nothing changes. So at every destination v, you just reuse the optimal solution that you recomputed in the previous iteration of the outer for loop. Well, then if you think about it, what's going to happen in the next iteration? You're going to do the exactly the same set of computations with exactly the same set of inputs, so you're going to get exactly the same set of results. That is, it will be true again in the next iteration, but you will just inherit the optimal solutions from the previous one, and that will keep happening over and over again until the rest of time. So in particular, by the time you get to the n minus 1th iteration of the out for loop, you will have exactly the same set of solution values as you do right now. We've already proven that the results at the end of iteration n minus one are correct. Those are the real shortest path distances. If you already have them in your grubby little hands now, well you may as well abort the algorithm and return those as the final correct shortest path distances.