1 00:00:00,012 --> 00:00:06,814 Lets now turn to the analysis about a 3 step Greedy Heuristic for NAP set problem 2 00:00:06,814 --> 00:00:11,964 and show why it has a good worst case performance guarantee. 3 00:00:11,964 --> 00:00:18,816 So the goal is to prove that the value of the solution output by this 3 step greedy 4 00:00:18,816 --> 00:00:25,399 algorithm is always at least half the value of an optimal solution, a maximum 5 00:00:25,399 --> 00:00:30,365 value solution that respects the NAP set almost at capacity. 6 00:00:30,365 --> 00:00:35,975 The key idea behind the analysis, is a thought experiment, in which we use as a 7 00:00:35,975 --> 00:00:39,306 yard stick, an algorithm which gets to choose. 8 00:00:39,306 --> 00:00:44,654 Recall that in, [COUGH] step 1 of our greedy heuristic, we sort the items in 9 00:00:44,654 --> 00:00:50,081 non increasing order of bang for buck. That is, by the ratio of the value over 10 00:00:50,081 --> 00:00:53,147 the size. In step 2 of the greedy heuristic, we 11 00:00:53,147 --> 00:00:56,939 pack the items in this order. So maybe the first K items, for some 12 00:00:56,939 --> 00:00:59,735 value of K in this order, fit into the knapsack. 13 00:00:59,735 --> 00:01:04,050 But then we don't have room for item K plus 1, and at that point we stop this 14 00:01:04,050 --> 00:01:08,336 step of the heuristic. The thought experiment is to imagine that 15 00:01:08,336 --> 00:01:13,195 our algorithm gets to cheat and pack a fraction of this item k+1 into the 16 00:01:13,195 --> 00:01:17,901 knapsack to fill it up fully. So for example, if we pack in the first k 17 00:01:17,901 --> 00:01:22,467 items and after that, there's only 7 residual units of capacity in the 18 00:01:22,467 --> 00:01:25,547 knapsack. And suppose item k+1 has size 10, then we 19 00:01:25,547 --> 00:01:30,247 envision taking 70% of item k+1, and filling it, filling up the knapsack with 20 00:01:30,247 --> 00:01:33,252 that fraction. The value, lets say it's prorated. 21 00:01:33,252 --> 00:01:37,872 So lets say by, filling in 70% of the item into the knapsack, we get 70% of its 22 00:01:37,872 --> 00:01:41,734 value. So we're going to call the output of this 23 00:01:41,734 --> 00:01:45,914 cheating algorithm the greedy fractional solution. 24 00:01:45,914 --> 00:01:51,832 So just to give a super simple example image a knapsack of copacity 3 and you 25 00:01:51,832 --> 00:01:56,935 had 2 items both of size 2 you'd say 1 has value 3 and 1 has value 2. 26 00:01:56,935 --> 00:02:02,867 Well then in the greddy fractional solution you begin by considering item 1 27 00:02:02,867 --> 00:02:08,062 that has the higher ratio we have the fits fully in the napsack. 28 00:02:08,062 --> 00:02:12,319 Then you move on to item 2, it only fits 50% into the knapsack, and the total 29 00:02:12,319 --> 00:02:16,638 value of the solution would be the full value of item 1, that's 3, plus 50% of 30 00:02:16,638 --> 00:02:19,096 the value of item 2, so that adds 1, 1 more. 31 00:02:19,096 --> 00:02:22,025 So the value of the greedy solution here would be, 4. 32 00:02:22,025 --> 00:02:24,532 The greedy fractional solution, would be 4. 33 00:02:24,532 --> 00:02:30,764 [SOUND]. In the next quiz I'm going to ask you to 34 00:02:30,764 --> 00:02:41,232 ponder, what properties, this greedy fractional solution has, relative to 35 00:02:41,232 --> 00:02:46,297 feasible solutions Of the knapsack instance. 36 00:02:46,297 --> 00:02:52,657 The correct answer is D, the greedy fractional solution is always at least as 37 00:02:52,657 --> 00:02:58,997 good, as, the best non fractional solution, and in fact, it can be strictly 38 00:02:58,997 --> 00:03:03,002 better. Indeed, if you look at the last slide in 39 00:03:03,002 --> 00:03:09,432 that simple 2 item example, in that instance, the greedy fractional solution 40 00:03:09,432 --> 00:03:12,776 is Better, strictly better than every single feasible solution. 41 00:03:12,776 --> 00:03:16,287 Now, it's not going to be strictly better in every single instance, because there's 42 00:03:16,287 --> 00:03:19,625 going to be instances where the greedy fractional solution happens to use no 43 00:03:19,625 --> 00:03:23,210 fractions, it happens to just, fill the knapsack fully, using 100% of various 44 00:03:23,210 --> 00:03:25,421 items. And then that, there's no way that can be 45 00:03:25,421 --> 00:03:27,852 better than optimal, non fractional solutions. 46 00:03:27,852 --> 00:03:32,802 So, I'm not going to prove this fact for you in gory detail, but let me give you 47 00:03:32,802 --> 00:03:37,152 just an outline of the argument. So the claim, is that the greedy 48 00:03:37,152 --> 00:03:42,327 fractional solution, of a knapsack instance, is guaranteed to be at least as 49 00:03:42,327 --> 00:03:47,152 good, have at least as large a total value, as every feasible solution. 50 00:03:47,152 --> 00:03:51,210 That is, every non Fractional solution. The proof of this would fit in 51 00:03:51,210 --> 00:03:54,484 fantastically in the greedy algorithms section of this course. 52 00:03:54,484 --> 00:03:57,827 That's really what it is. One way you can prove this formally, is 53 00:03:57,827 --> 00:04:01,288 using an exchange argument. That's what I'll sketch at a high level 54 00:04:01,288 --> 00:04:03,523 here. I'll leave it as an exercise to you to 55 00:04:03,523 --> 00:04:07,619 fill in the details, but it's really very similar to our proof of optimality for 56 00:04:07,619 --> 00:04:11,690 the greedy algorithm, for why, sorting by ratios minimizes the weighted sum of 57 00:04:11,690 --> 00:04:14,352 completion times, of a bunch of jobs on a single. 58 00:04:14,352 --> 00:04:18,621 [INAUDIBLE] So how should we proceed,we have to prove that the greedy fractional 59 00:04:18,621 --> 00:04:22,850 solution is as good as every feasible solution, every non-fractional solution. 60 00:04:22,850 --> 00:04:26,822 So fix such a solution, call is S. This is a subset of items that Fit into 61 00:04:26,822 --> 00:04:29,642 the knapsack. So you could imagine, for example, a 62 00:04:29,642 --> 00:04:32,917 scenario in which the greedy fractional packs item number 1. 63 00:04:32,917 --> 00:04:36,002 And then, item number 2 doesn't fit, into the knapsack. 64 00:04:36,002 --> 00:04:40,462 But if we take 80% of item number 2, then that fully fills up the knapsack, so that 65 00:04:40,462 --> 00:04:44,797 might be the greedy fractional solution, and maybe we're thinking about non 66 00:04:44,797 --> 00:04:49,301 fractional solution where Item 1 is equally well packed there, but instead of 67 00:04:49,301 --> 00:04:53,744 item 2 which wouldn't fit, it uses item 4 which is small and then may be there is 68 00:04:53,744 --> 00:04:57,553 some little bit left over of unfilled [INAUDIBLE] capacity for this 69 00:04:57,553 --> 00:05:01,375 non-fractional solution S. So, the interesting case is where S 70 00:05:01,375 --> 00:05:05,271 packed some stuff that the greedy fractional solution does not, lets 71 00:05:05,271 --> 00:05:09,788 suppose it uses it uses l units of [INAUDIBLE] capacity via items which are 72 00:05:09,788 --> 00:05:12,692 not packed by the greedy fractional solution. 73 00:05:12,692 --> 00:05:18,397 Well, on the other hand, the greedy fractional solution completely fills up 74 00:05:18,397 --> 00:05:22,317 the knapsack. It uses all of the space, all W units. 75 00:05:22,317 --> 00:05:27,867 So, if there's l units of stuff in s, which are not in the greedy fractional 76 00:05:27,867 --> 00:05:33,757 solution, there have to equally well be at least, l units of stuff in the greedy 77 00:05:33,757 --> 00:05:37,572 fractional solution, which are not packed by it. 78 00:05:37,572 --> 00:05:40,141 S. So this might make it seem like the 2 79 00:05:40,141 --> 00:05:44,931 solutions are apples and oranges. Each of them packs some stuff that the 80 00:05:44,931 --> 00:05:48,857 other one doesn't. But they're not apples and oranges, the 81 00:05:48,857 --> 00:05:51,844 greedy fractional solution really is better. 82 00:05:51,844 --> 00:05:56,709 Why? Well, by the greedy criterion everything not packed by the greedy 83 00:05:56,709 --> 00:06:02,546 fractional solution, has ratio, has bang for buck Worse, than everything that it 84 00:06:02,546 --> 00:06:05,902 did pack. So these l units that are missing from 85 00:06:05,902 --> 00:06:11,150 the greedy fractional solution, are less valuable than the l units that it 86 00:06:11,150 --> 00:06:14,544 includes. So, since the l units of stuff in the 87 00:06:14,544 --> 00:06:20,142 greedy fractional solution, missing from S, are a better use of space, than the l 88 00:06:20,142 --> 00:06:24,885 units of stuff that are in s, but not the greedy fractional solution. 89 00:06:24,885 --> 00:06:30,486 Some, easy algebra shows that indeed, the overall value of the greedy fractional 90 00:06:30,486 --> 00:06:33,999 solution, is at least the overall value. Value of S. 91 00:06:33,999 --> 00:06:37,925 Since S was arbitrary, that shows that the greedy fractional solution is in some 92 00:06:37,925 --> 00:06:41,748 sense better than optimal, its better than all of the non-fractional feasible 93 00:06:41,748 --> 00:06:44,305 solutions. Now its not clear that women [INAUDIBLE] 94 00:06:44,305 --> 00:06:48,103 bothered to do this thought experiment which transpired in some fantasy world 95 00:06:48,103 --> 00:06:50,476 where we're allowed to pack items fractionally. 96 00:06:50,476 --> 00:06:54,240 Now what we really care about is the 3 step greedy algorithm and that is not in 97 00:06:54,240 --> 00:06:57,664 the fantasy world, that's in the real world where we cannot pack items 98 00:06:57,664 --> 00:07:00,009 fractionally. So, what good could this thought 99 00:07:00,009 --> 00:07:03,951 experiment [INAUDIBLE] be. Well, as we'll see in the analysis in the 100 00:07:03,951 --> 00:07:07,909 next slide, it provides a useful yardstick, a hypothetical benchmark 101 00:07:07,909 --> 00:07:11,557 against which we can compare the performance of our 3 step greedy 102 00:07:11,557 --> 00:07:12,128 algorithm.