Let's turn to the analysis of our dynamic programming based heuristic for the knapsack problem. In particular, let's understand precisely how the running time of this heuristic depends on the desired accuracy epsilon. Let me jog your memory: what are the two steps of our dynamic programming based heuristic? The first thing we do is massage the item values. We round them so that they are relatively small integers. Precisely, for each item i, we define v-hat i to be what you get by dividing the item value by m and then rounding down to the nearest integer. In step 2, we invoke our second dynamic programming solution for the knapsack problem, and we pass to it these new values, the v-hats, as well as the original item sizes and the original knapsack capacity. This algorithm as stated is underdetermined; that's because I haven't told you how to set the parameter m. We do at least understand qualitatively the role of m and how its choice influences the accuracy/running time trade-off. Specifically, the bigger we set m, the more information we lose when we round the original values, the v_i's, to the v-hat i's. More loss of information should translate to less accuracy: larger m, less accuracy. On the other hand, the bigger we take m, the smaller the transformed item values, the v-hats.
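Step 1 can be sketched in a few lines of Python; the function name, the sample values, and the choice of m here are illustrative, not from the lecture.

```python
# A minimal sketch of the rounding step, assuming integer item values.

def round_values(values, m):
    # For each v_i: round down to the nearest multiple of m, then divide by m,
    # which is just v_hat_i = floor(v_i / m).
    return [v // m for v in values]

print(round_values([117, 250, 95, 333], 10))  # [11, 25, 9, 33]
```

Note that the bigger m is, the smaller (and coarser) the resulting v-hats, which is exactly the trade-off discussed above.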
Remember, the running time of our second dynamic programming algorithm is proportional to the largest item value, so smaller item values translate to faster running times. Summarizing: the bigger the m, the less the accuracy, but the faster the algorithm. Here's our plan for the analysis. We begin by studying the accuracy constraint. Remember the setup: the client gives us a positive parameter epsilon, and it is our responsibility to output a feasible solution whose total value is at least 1 - epsilon times that of an optimal solution. This constraint on our algorithm's accuracy should translate to an upper bound on how large a value of m we can get away with. Remember, as we jack up m, we lose accuracy. So the first question we're going to study is: how large an m can we get away with while still being guaranteed to compute a solution with value at least 1 - epsilon times that of an optimal one? Once we solve this problem and understand how large a value of m we can get away with, that is, how aggressive we're permitted to be with the scaling, we'll then evaluate the running time of the resulting algorithm for that particular choice of m. To answer this first question, how big can we take m subject to an accuracy constraint of 1 - epsilon?
It's important to understand in detail how the rounded item values, the v-hats, compare to the original item values, the v's. That's what this quiz is meant to ask you to think about. Remember how we obtain v-hat i from v_i: first we decrease it to the next multiple of m, then we divide it by m. So we can think of m times v-hat i as what we have just after we've rounded down to the nearest multiple of m and before we've actually divided by m. So how did we get to this m times v-hat i? We decreased v_i down to the nearest multiple of m, so we decreased it by somewhere between zero and m, and the result is somewhere between v_i minus m and v_i. So let's now assemble a few preliminaries crucial for the accuracy analysis. What did we learn from the quiz? We learned that for each item i, the rounded value v-hat i scaled up by m (intuitively, this is after we've rounded v_i down to the nearest multiple of m, but before we've divided) lies somewhere between v_i as an upper bound and v_i minus m as a lower bound. So let me extract from this two different inequalities: one which expresses an upper bound on v_i in terms of v-hat i, and one which expresses a lower bound on v_i in terms of v-hat i. These two inequalities are immediate from this fact. Let me go ahead and give them the names (1) and (2).
We'll put these to use on the next slide. First of all, the value v_i is lower bounded by m times the rounded value v-hat i; and in turn, m times v-hat i is lower bounded by v_i after we subtract m from it. We'll call those inequalities (1) and (2). These first two inequalities are important for us, but they don't have a lot of content; they're immediate from the definition of step 1 of our algorithm, from the way we do our rounding. But step 2 of our algorithm gives us a third inequality, which is much more powerful. It says: look, in the second step, we solve a knapsack instance optimally. It's true we solve it for the wrong values, the v-hats, but for these rounded values, the solution we come up with is better than any other. In particular, if we measure value using the rounded values, the v-hats, then the solution S output by our heuristic is even better than the solution S* optimal for the original problem, optimal for the v_i's. That is, if we sum up the v-hat i's of the items in our solution S and compare it to the sum of the v-hat i's in any other solution, in particular the solution S* optimal for the original instance, we come out ahead: our sum of values is bigger. That's going to be our inequality (3).
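In symbols, with S the heuristic's output and S* an optimal solution for the original values, the three inequalities read:

```latex
v_i \;\ge\; m\,\hat v_i \quad (1), \qquad
m\,\hat v_i \;\ge\; v_i - m \quad (2), \qquad
\sum_{i \in S} \hat v_i \;\ge\; \sum_{i \in S^*} \hat v_i \quad (3).
```

Inequalities (1) and (2) hold for every item i and come straight from the rounding; (3) comes from the optimality of step 2 on the rounded values.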
These three inequalities are pretty much all we've got going for us, but fortunately they are sufficient to solve the problem that we asked: to determine how large an m we can get away with, subject to guaranteeing an accuracy of 1 - epsilon. To see this, let's just follow our nose and see what happens if we chain these three inequalities together. Now, the third one seems to have the most content; it really says we're optimally solving this problem with the transformed values, the v-hats. So let's begin by just writing down, yet again, inequality (3). Inequality (3) just says that the sum of the transformed values, the sum of the v-hats, in our heuristic solution capital S is at least as good as in any other solution, in particular that of the solution S* optimal for the original problem, for the original item values. So inequality (3) is a fine starting point. Fundamentally, what we're trying to do here is lower bound the performance of the solution S spit out by our heuristic, and that's what this inequality does. The one problem is that it's justifying the performance of S in terms of the wrong data, in terms of the transformed values, the v-hats, whereas what we really care about are the original values, the v's. So we need to take this inequality (3) as a seed.
From it, we grow a chain of inequalities, rewriting on both the left-hand side and the right-hand side the transformed values, the v-hats, in terms of the original values, the v's; we'll use inequalities (1) and (2) to do that. As we grow the chain to the left, we want quantities that are only getting bigger and bigger. As we grow the chain to the right, we want quantities that are only getting smaller and smaller. Hopefully, at the end of the day, when the dust settles, we'll have an approximate optimality result for our heuristic solution S. Now, if you look back at inequalities (1) and (2), you'll see that both are in terms of the quantity m times v-hat. So it's going to be convenient to multiply both sides of inequality (3) by m; of course, it's still valid if I do that. To extend this chain of inequalities on the left-hand side, we need to tack on a quantity which is only bigger. That is, we need to upper bound m times v-hat i in terms of the original value v_i, but inequality (1) says precisely that m times v-hat i is never bigger than the original value v_i. So we simply apply inequality (1) to each item in the heuristic solution capital S, and we get an upper bound of the sum, over the items in our solution S, of the value of that item.
So that's great: we get a lower bound on the performance of our solution, and that's exactly what we want. To grow the chain of inequalities to the right, we need a quantity which is only smaller. That is, we want to lower bound m times v-hat i in terms of some function of the original value v_i. Now, v_i can be bigger than m times v-hat i (remember, we rounded down), but inequality (2) says it can only be bigger by at most m. So if we subtract m from v_i, we get a lower bound on m times v-hat i, and we just apply that term by term to the optimal solution S*. Just by following our nose in this way, we get exactly what we want. On the left-hand side, we get what we really care about, namely the total value of the solution output by our heuristic, and this is the total value with the original values, mind you, not with the transformed ones. We get a lower bound on that total value, which is exactly what we wanted, in terms of the best-case scenario, the total value of the optimal solution. The only blemish is that we pick up an error of at most m for each item i in the optimal solution S*.
So let's just lower bound this crudely: the optimal solution S* can have at most n items, so we pick up a minus m at worst for each of those at most n items. If we subtract an error term of m times n, that gives us a lower bound on the value of our solution S. Now that the dust has cleared, we see that we have a performance guarantee for our solution S in terms of that of the optimal solution S*; basically, we're off by this error term of m times n. Remember, m is the parameter that governs how aggressively we scale the item values, and it's controlling the running time/accuracy trade-off. We were expecting this all along: we argued that the bigger you take m, the more information you lose, so the less accuracy you'd expect. And indeed, the extent to which we're off from the optimal solution grows linearly in this parameter m. The question we're striving to answer, remember, is how big an m we can get away with and still be sure our solution is within a 1 - epsilon factor of the optimal one. To achieve this, we just need to make sure the worst-case error in our solution, m*n, is never bigger than an epsilon fraction of the optimal solution's value. That is, we just set m small enough that m*n is not more than epsilon times the optimal value.
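Chaining the three inequalities (after multiplying (3) through by m) and then using the fact that S* has at most n items gives:

```latex
\sum_{i \in S} v_i
\;\overset{(1)}{\ge}\; m \sum_{i \in S} \hat v_i
\;\overset{(3)}{\ge}\; m \sum_{i \in S^*} \hat v_i
\;\overset{(2)}{\ge}\; \sum_{i \in S^*} (v_i - m)
\;\ge\; \sum_{i \in S^*} v_i \;-\; mn,
```

so the accuracy constraint is satisfied whenever $mn \le \epsilon \sum_{i \in S^*} v_i$.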
So this inequality is meant to tell us how to set the parameter m in our algorithm, but you'd be right to complain that on the right-hand side there's a quantity we don't actually know. Right? We don't actually know the value of the optimal solution S*; that, after all, is what we're trying to compute, at least approximately, in the first place. So instead, what we're going to do is just use a crude lower bound on how small the optimal solution's value could be. Let's make a trivial assumption: let's assume that each item has size at most the knapsack capacity, that is, each item fits by itself inside the knapsack. Any items which fail that test can obviously be deleted in a preprocessing step. Then the optimal solution, if nothing else, is at least as large as the value of the maximum-value item. So to satisfy the desired accuracy constraint, all we need to do is set m small enough that m*n is at most the right-hand side listed here. Of course, it's sufficient to set m even smaller, so that m*n is no more than an even smaller number. So, in our algorithm, we just set m to be the number that equalizes m*n, the left-hand side, with our lower bound on the right-hand side, namely epsilon times Vmax. Notice these are all parameters known to the algorithm.
The algorithm of course knows how many items n there are; it knows epsilon, the parameter supplied to it by the client; and it's easy to compute Vmax, the maximum item value. So the algorithm is in a position to set m so that this equality holds, that is, to set m equal to epsilon times Vmax over n. That, finally, completes the description of the dynamic programming based heuristic. We've now also completed the answer to the first question that we asked: how large an m can we get away with, subject to the accuracy constraint of 1 - epsilon? We can get away with m as large as epsilon times the largest item value divided by n. And now I hope that everybody is holding their breath and crossing their fingers just a little bit, because remember, m not only governs the accuracy, it also governs the running time. The more aggressively we can scale the numbers, the larger we can take m, the faster the algorithm. The question now is: are we permitted to take m large enough that the resulting heuristic's running time is polynomial? The running time of the heuristic is certainly dominated by step 2, where we invoke our dynamic programming algorithm. The running time there is n squared times the maximum item value passed to that subroutine, that is, the maximum scaled value, the maximum v-hat i.
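Putting the two steps together with m = epsilon * Vmax / n, the whole heuristic can be sketched as follows. The step-2 subroutine here is one standard way to write the value-indexed dynamic program (minimum size needed to achieve each total value); all names are illustrative, not from the lecture.

```python
def knapsack_by_value(values, sizes, capacity):
    """Exact knapsack via DP over values: A[i][x] is the minimum total size
    of a subset of the first i items with total value exactly x."""
    n, V = len(values), sum(values)
    INF = float("inf")
    A = [[INF] * (V + 1) for _ in range(n + 1)]
    A[0][0] = 0
    for i in range(1, n + 1):
        v, s = values[i - 1], sizes[i - 1]
        for x in range(V + 1):
            A[i][x] = A[i - 1][x]  # case 1: skip item i
            if x >= v and A[i - 1][x - v] + s < A[i][x]:
                A[i][x] = A[i - 1][x - v] + s  # case 2: take item i
    best = max(x for x in range(V + 1) if A[n][x] <= capacity)
    # Trace back one optimal item set.
    S, x = [], best
    for i in range(n, 0, -1):
        if A[i][x] != A[i - 1][x]:  # item i must have been taken
            S.append(i - 1)
            x -= values[i - 1]
    return best, S

def knapsack_heuristic(values, sizes, capacity, eps):
    """Step 1: scale each value down by m = eps * Vmax / n and round down.
    Step 2: solve exactly on the rounded values, original sizes and capacity."""
    n = len(values)
    m = eps * max(values) / n
    v_hat = [int(v // m) for v in values]
    _, S = knapsack_by_value(v_hat, sizes, capacity)
    return sorted(S), sum(values[i] for i in S)
```

For example, with values [6, 10, 12], sizes [1, 2, 3], capacity 5, and eps = 0.5, we get m = 2 and rounded values [3, 5, 6], and the heuristic returns items [1, 2] with original total value 22, which on this tiny instance happens to be optimal.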
So the big question then is: how big can the scaled item values, the v-hat i's, be? Pick your favorite item i; how big can its transformed value v-hat i be? Well, how do we get v-hat i? We take the original value v_i, we round it down, and we divide by m. So the biggest that v-hat i could be is the original value v_i divided by m, which is obviously at most the maximum original value divided by m. Now we plug in our selection for m, epsilon times Vmax over n. Very happily, the two Vmax's cancel, leaving us with just n over epsilon. Plugging in n over epsilon as an upper bound on every transformed value passed to the step-2 dynamic programming solution, we get an overall running time of n cubed over epsilon. The running time does degrade with epsilon: the smaller the epsilon, the bigger the running time. Of course, you would expect that with any NP-complete problem. As you take epsilon smaller and smaller, you need to take m smaller and smaller to get the accuracy guarantee; that results in less aggressive scaling of the item values; in turn, bigger transformed values get passed to the dynamic programming subroutine, resulting in a bigger running time. Nevertheless, this does show that the full continuum of running time/accuracy trade-offs is possible for the knapsack problem.
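Written out, for every item i:

```latex
\hat v_i \;=\; \left\lfloor \frac{v_i}{m} \right\rfloor
\;\le\; \frac{v_{\max}}{m}
\;=\; \frac{v_{\max}}{\epsilon\, v_{\max}/n}
\;=\; \frac{n}{\epsilon},
```

so the step-2 running time is $O(n^2 \cdot \max_i \hat v_i) = O(n^3/\epsilon)$.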
You want an arbitrarily close approximation for this problem? You've got it.