So we've talked a bit about various running-time optimizations of the basic Bellman-Ford algorithm. Now let's talk about optimizing the space required by the algorithm. To orient ourselves, let's just think about how much space is needed by the basic algorithm we've been studying thus far.

All right, so the correct answer here is the first one, A. The space required by the basic Bellman-Ford algorithm is quadratic in the number of vertices: it is Theta(n^2). Why? Well, the space is dominated by the 2-D array that we fill in. There is only a constant amount of space per subproblem, and there are n^2 subproblems. One index is the budget i on how many edges we're allowed to use, which ranges from 0 to n - 1, and the other index is just the destination v, and there are n of those.

So in this video, we'll talk about how you can do better, how you can get away with linear space, not quadratic space. The initial observation is a simple one. Let's just return to the formula that we used to populate all of the entries of the array. We always take the best of a whole bunch of candidates. What are the candidates? Either we inherit the solution from the last iteration, A[i-1, v], or we choose some edge (w, v) to be the last hop, and then we just paste together the best solution to w from the last iteration: we pay A[i-1, w] plus the length of the last hop, the edge (w, v). That is,

A[i, v] = min( A[i-1, v], min over edges (w, v) of (A[i-1, w] + len(w, v)) )

But here's the point. Staring at this formula, which subproblem values do we care about? Well, not that many of them. In particular, the trait shared by all of the interesting candidates is that they're all from the last iteration: the first index of everything on the right-hand side is always i - 1. As a consequence, as soon as we've finished one batch of subproblems for some value of i, that is, we've computed A[i, v] for all v, we can throw out the results of all of the previous rounds of subproblems, A[i-1, *], A[i-2, *], and so on. All we need is the freshest batch, the most recent iteration, to correctly proceed. If we do that, the only subproblems we're keeping track of are the ones that we're filling in in the current round i, plus the solutions we remember from the previous round i - 1. That means at any given moment in time we're only keeping track of O(n) different subproblems.
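To make the rolling-array idea concrete, here is a minimal Python sketch of the linear-space version. The edge-list representation and the names are illustrative assumptions of mine rather than anything fixed by the lecture, and it assumes the input graph has no negative-cost cycle:

```python
from math import inf

def bellman_ford_distances(n, edges, s):
    """Shortest-path distances from s, keeping only two rows of the DP table.

    n     -- number of vertices, labeled 0..n-1
    edges -- list of (w, v, length) tuples, one per directed edge w -> v
    s     -- source vertex
    Assumes the graph has no negative-cost cycle.
    """
    prev = [inf] * n              # A[i-1, .]: best distances with at most i-1 edges
    prev[s] = 0
    for i in range(1, n):         # edge budgets i = 1 .. n-1
        cur = prev[:]             # case 1: inherit A[i-1, v]
        for w, v, length in edges:
            if prev[w] + length < cur[v]:   # case 2: edge (w, v) as the last hop
                cur[v] = prev[w] + length
        prev = cur                # all older rounds can now be thrown away
    return prev
```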
And this linear space bound is even more impressive than it might appear at first, because if you think about the responsibility of our algorithm in this problem, we have to output a linear number of things, right? We're not just outputting one number. We're outputting n numbers, the shortest-path distance from s to every possible destination. So the space is really constant per statistic that we have to compute.

One thing I'm not going to discuss explicitly, but I think it would be a good exercise for you: go back to all of the other dynamic programming algorithms we've discussed, and of course by now there are many, and see for which ones you can do a comparable optimization. Which subproblems do you need to keep around to compute the new ones? And if you throw out all the ones that you're not going to need anymore, what does that reduce the space to? For many of the problems we've studied, you're going to see a big improvement in the space required.

But this would be a good time to pause for a few seconds and think about possible negative consequences of this space optimization. So suppose we do this, either in the context of shortest paths going forward, or in some other dynamic programming algorithm: we throw out the values of old subproblem solutions that we seemingly don't need anymore to push the recurrence further. Are there any drawbacks to this space optimization?

So the answer depends. If all you care about is the value of an optimal solution, and not the optimal solution itself, then you may as well throw out all of the old subproblems that you don't need anymore to push the recurrence further. It's not going to hurt you at all. If, however, you want the optimal solution itself, and not just its value, well, how did we actually accomplish that in the past? We had these reconstruction algorithms. How do the reconstruction algorithms work? You took the entire filled-in table and you traced backward through it. At each entry of the filled-in table, you looked at which of the candidates won, which comparison came out the winner when you filled in this entry. That told you a piece of what the optimal solution has to look like, and then you go backward through the table and repeat the process. If you've thrown out most of your filled-in table, how are you going to run this reconstruction algorithm?
In the context of the Bellman-Ford algorithm, how are you going to reconstruct shortest paths if you haven't remembered the subproblem solutions from all of the previous iterations? And you can certainly imagine, in a routing application for example, that you don't just want to know that the shortest path has length seventeen. You want to know: what route should we take to get from point A to point B?

So in the rest of this video, I'm going to describe a solution for the Bellman-Ford algorithm. I'm going to show you how we can retain the constant-space-per-vertex guarantee while recovering the ability to reconstruct shortest paths. The idea is, for a given value of i and a given destination v, we're going to keep track of not just the one piece of information, the length of the shortest path from s to v that uses at most i edges, but also a second piece of information, which is the second-to-last vertex on such a path. So it's still going to be just constant space per subproblem. We're going to call this second two-dimensional array B, and we're going to call its entries predecessor pointers. Again, the semantics are that B[i, v] points to the predecessor of the destination v on a shortest path from s to v that has at most i edges. Of course, if i is sufficiently small, there may be no such path from s to v, and in that case we just have a null predecessor pointer.

So for a moment, let's just forget about the space optimization that we're trying to attain, and let's just observe that if we correctly compute these capital-B entries, then simply traversing predecessor pointers would reconstruct shortest paths. So why is this true? Well, two reasons, reasons that you've seen explain other things as well. The first reason is, remember that we're assuming that the input graph has no negative-cost cycle. Therefore shortest paths have at most n - 1 edges. Therefore the entries B[n-1, v] actually store the final hop of a shortest path from s to v, period, with no edge budget at all. So the B[n-1, v] entries are the final batch of predecessor pointers; if we correctly compute them, they are telling us the last hop on a shortest path to v. Now, the other part of the correctness comes from the optimal substructure lemma.
So, remember way back when we started talking about shortest paths and their optimal substructure, we said, well, if only we had a little birdie which told us what the last hop of a shortest path was, then the shortest path would just be that last hop concatenated with a shortest path from s up to the penultimate vertex w. And these predecessor pointers are exactly this little birdie. We're storing at v what the last hop (w, v) is, and so we know that the shortest path is just that last hop plus a shortest path back from w to s. And by traversing the pointers, that's exactly what you do: you just reconstruct the rest of the path, the part from s to w.

So, if we correctly compute predecessor pointers, then they enable the simple reconstruction of shortest paths, just by traversal after the fact. That leaves us with the task of correctly computing these predecessor pointers. It's not hard. I think many of you could just fill in the details yourself, but let me give a quick rundown. So remember, in general we have this 2-D array capital B. It's indexed by the edge budget i and by the destination v, and we're supposed to fill in B[i, v] with the final hop on a shortest path from s to v that uses at most i edges; if no such path exists, then it's just null. The base case is when i equals zero, and there everybody's predecessor pointer is null.

So the way we're going to fill in the entry B[i, v] is going to depend on which of the candidates won the competition to be the shortest path for A[i, v]. In essence, this predecessor pointer B[i, v] is just caching the result of the competition we ran to find the shortest path for A[i, v]. To make this precise, let's just recall the formula that we used to compute A[i, v], given the solutions to the last batch of subproblems. Case one is when you inherit the solution from the last round, A[i-1, v]. In addition, each possible choice of a last hop (w, v) furnishes another candidate for the optimal solution of this round, in which you're going to pay the optimal solution to the subproblem, A[i-1, w], plus the length of the edge (w, v). So we're going to fill in the B[i, v] array basically just to reflect what happened when we computed the solution to A[i, v]. In essence, what we're doing in the 2-D array B is remembering the most recent vertex w that's responsible for furnishing a new shortest path from s to v.
So in case one, the boring case, where at iteration i we just inherit the solution from the last round, of course in the B entry we also just inherit the last hop of the previous round. That is, if we use case one to fill in A[i, v], then we just set B[i, v] to be B[i-1, v]. The interesting case is when A[i, v] is filled in using case two of the formula, that is, when the shortest path from s to v suddenly improves given a budget of i hops, as opposed to just i - 1 hops. In that case, we're just going to cache the result of the brute-force search that we did to evaluate the formula. That is, we just remember the choice of the penultimate vertex w that achieved the minimum in this round.

Now notice that, just like the formula that we used to populate the capital-A array, the original array, to compute these B[i, v] entries all we need to know is information from the last round, the round i - 1. So just like with the A array, we can throw out all of the predecessor pointers that date back before the previous round. Therefore, we again need only constant space per destination v to maintain both the shortest-path distance from s to v with at most i hops, and the predecessor pointer.

So this is great. For input graphs that don't have negative-cost cycles, we can not merely compute the shortest-path distances in constant space per destination; we can also compute the predecessor pointers in that same space, which allows the reconstruction of shortest paths. A sketch of the combined computation follows.
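Here is a minimal Python sketch of that combination, under the same illustrative assumptions as the earlier snippet (vertices labeled 0 to n-1, an edge list of (w, v, length) tuples, no negative-cost cycle); the function and variable names are mine, not the lecture's:

```python
from math import inf

def bellman_ford_with_predecessors(n, edges, s):
    """Distances and predecessor pointers from s, in O(n) space.

    Assumes the graph has no negative-cost cycle.
    """
    dist = [inf] * n
    pred = [None] * n             # B[i-1, v]: last hop on a best path to v
    dist[s] = 0
    for i in range(1, n):         # edge budgets i = 1 .. n-1
        new_dist = dist[:]        # case 1: inherit A[i-1, v] and B[i-1, v]
        new_pred = pred[:]
        for w, v, length in edges:
            if dist[w] + length < new_dist[v]:
                new_dist[v] = dist[w] + length
                new_pred[v] = w   # case 2: cache the winning last hop w
        dist, pred = new_dist, new_pred
    return dist, pred

def reconstruct_path(pred, s, v):
    """Traverse the final round's predecessor pointers backward from v to s."""
    path = [v]
    while v != s:
        if pred[v] is None:       # v is not reachable from s
            return None
        v = pred[v]
        path.append(v)
    return path[::-1]
```

Once the final round's pointers are in hand, reconstruct_path performs exactly the backward traversal described above, recovering an actual shortest path rather than just its length.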
So one other thing you might want is to handle graphs that do have negative-cost cycles. Now, in a separate video we showed that just checking whether an input graph has a negative-cost cycle or not is easily addressed by a simple extension of the Bellman-Ford algorithm. You tack on one extra iteration of the outer for loop: you let i range not just up to n - 1 but all the way up to n. If you see, for some destination v, an improvement in the extra iteration when i = n, then that guarantees that there is a negative-cost cycle; in fact, this happens if and only if the graph has one. But in the same way that you might want to actually have the shortest paths and not merely their values, you might want to actually have a negative-cost cycle in hand and not merely know that one exists.

It turns out you can solve the reconstruction problem for negative-cost cycles as well, using the Bellman-Ford algorithm and predecessor pointers maintained in exactly the way we've been maintaining them here. I'm not going to talk through all of the details of the solution; I'll leave it to you as a somewhat nontrivial exercise to think through how this really works. But the gist is as follows. You run the Bellman-Ford algorithm and you maintain the predecessor pointers as on this slide. If the input graph does have a negative-cost cycle, then at some iteration you will see a cycle amongst the predecessor pointers. Furthermore, that cycle of predecessor pointers must be a negative-cost cycle of the original input graph. This means that detecting a negative-cost cycle, if one exists, reduces to checking for a cycle in the predecessor pointers, which of course you can solve using depth-first search at every iteration.
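Since the details are left as an exercise, the following is only one possible shape for that check, a hypothetical sketch rather than the lecture's own solution. Because every vertex carries at most one predecessor pointer, a marked walk along the pointers accomplishes what the depth-first search would, and you would run a check like this after each iteration of the outer loop:

```python
def find_predecessor_cycle(pred):
    """Return a cycle among the predecessor pointers, or None if there is none.

    pred[v] is v's current predecessor, or None. Each vertex has at most
    one outgoing pointer, so a walk from any vertex either dies out or
    enters a cycle; three-state marking keeps the whole scan linear time.
    """
    state = [0] * len(pred)                 # 0 unvisited, 1 on current walk, 2 done
    for start in range(len(pred)):
        if state[start] != 0:
            continue
        walk, v = [], start
        while v is not None and state[v] == 0:
            state[v] = 1
            walk.append(v)
            v = pred[v]
        if v is not None and state[v] == 1:
            return walk[walk.index(v):]     # walk re-entered itself: a cycle
        for u in walk:                      # dead end, or merged into a safe walk
            state[u] = 2
    return None
```

If the walk ever re-enters itself, the vertices on that loop form a cycle of predecessor pointers, which by the argument sketched above must be a negative-cost cycle of the original input graph.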