Here is the analysis. Step 1 of our algorithm sorts the items in non-increasing order of value per size. Step 2 of our greedy heuristic packs the maximal prefix of items that fits into the knapsack. Let's say that maximal prefix comprises the first k items; item k+1 is then the first item that does not fit in the knapsack, given the previous selections.

So here is the cool part. Remember that we added step 3 to our greedy algorithm so that it considers two different candidate solutions and picks the better of them. The first candidate solution is just the outcome of step 2, so we can definitely say that the output of our three-step greedy algorithm is at least the total value of the first k items chosen in step 2. The second candidate considered in step 3 is the maximum-value item all by itself, and that item is at least as valuable as item k+1, the item on which we got stuck in step 2. So whatever the value of the output of our three-step greedy algorithm is, it is also at least the value of item k+1 in isolation.

Now let's add these two inequalities together. On the left-hand side we get double the value of the three-step greedy solution. On the right-hand side we get the value of the first k items plus the value of item k+1, that is, the total value of the first k+1 items.

We're now in a position to connect this inequality to the greedy fractional solution. Remember what the greedy fractional solution is: it packs the first k items, and then, for the item on which it gets stuck, namely item k+1, it fills up the knapsack completely with a suitable fraction of that item, receiving value prorated according to the fraction that fits. But what we have on the right-hand side is even better than the greedy fractional solution: it is 100% of the first k+1 items, whereas the greedy fractional solution has 100% of the first k items and only some fraction of item k+1. So the value of the greedy fractional solution is at most the right-hand side. And the whole point of our thought experiment was to argue that the greedy fractional solution is even better than optimal, that its value is at least that of every feasible knapsack solution. Divide both sides by two and you'll see that we've proved the theorem: the output of the three-step greedy algorithm is guaranteed to be at least 50% of the value of an optimal solution.

That's cool. We have a non-trivial worst-case performance guarantee for our simple and blazingly fast three-step greedy heuristic for the knapsack problem. But maybe you were hoping to do a little better than within 50% of an optimal solution; maybe in the back of your mind you were rooting for something like 90% or even better. There are a few roads we can take that might lead to better performance guarantees. The best-case scenario would be to simply sharpen our analysis of this greedy heuristic, that is, don't change the algorithm, don't make any new assumptions, just find a better proof that improves the 50% to some higher number. That would be the best case. The second option, if we really love this algorithm, for example because it is so fast, is to identify additional assumptions on instances that allow us to prove better performance guarantees than the 50% guarantee that holds for all instances.
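To make the three steps concrete, here is a minimal sketch in Python of the heuristic as described above. The function name and the representation of items as (value, size) pairs are my own choices, not something from the lecture, and the sketch assumes, as the analysis implicitly does, that any single item fits in the knapsack on its own.

```python
def three_step_greedy(items, capacity):
    """items: list of (value, size) pairs; capacity: the knapsack capacity W."""
    # Step 1: sort the items by value-per-size ratio, largest ratio first.
    order = sorted(items, key=lambda vs: vs[0] / vs[1], reverse=True)

    # Step 2: pack the maximal prefix of items that fits in the knapsack.
    prefix_value, used = 0, 0
    for value, size in order:
        if used + size > capacity:
            break                     # this is item k+1, where we get stuck
        prefix_value += value
        used += size

    # Step 3: also consider the single most valuable item that fits by itself.
    best_single = max((v for v, s in items if s <= capacity), default=0)

    # Output the better of the two candidate solutions.
    return max(prefix_value, best_single)
```

Note that step 2 stops at the first item that does not fit, exactly as in the argument above, so the prefix value is the first quantity in the pair of inequalities and the best single item covers the second.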
The third option is to come up with a better algorithm that in fact has better performance guarantees. What I want to show you on this slide is that the first case, the best-case scenario of just sharpening the analysis of this greedy algorithm, is not feasible: the analysis cannot be sharpened, because the 50% bound is actually realized on certain instances. Then we'll tackle the other two approaches, and we will in fact get better performance guarantees, either under extra assumptions about the instances or, alternatively, by using a more sophisticated algorithm.

So here's an example showing that there really are knapsack instances on which this three-step greedy algorithm obtains merely 50% of the value of an optimal solution. In this example we set the knapsack capacity, capital W, to be 1000, and there are three items. Item one has a value of 502 and a size of 501. The second and third items are identical: each has value 500 and size 500.

So what is the greedy algorithm going to do? Well, in step 1 it sorts the items according to their value-per-size ratio, and you'll notice that the first item has a ratio slightly bigger than one, so it is considered before items two and three. In step 2, the greedy algorithm packs item one into the knapsack, which leaves only 499 units of residual capacity, not enough room for either item two or item three. So step 2 outputs just the first item by itself. Step 3 of the algorithm considers the maximum-value item in isolation, which again is item one. For this reason, the greedy algorithm outputs the solution consisting of just item one, for a value of 502. And I'm sure you've noticed that there's a better solution: skip item one and instead choose items two and three. Together they fit in the knapsack and fill it completely, for a combined value of 1000, which is essentially twice the value of the greedy solution.

So what does this example teach us? It shows that if we want performance guarantees better than 50%, we have one of two options. First, if we really want to keep analyzing our three-step greedy algorithm, we'll have to make extra assumptions about the instances in order to prove a performance guarantee better than 50%; there are instances out there on which its performance is as bad as 50% of optimal. The second approach is to look at better algorithms, which might have better guarantees on worst-case instances. On the next slide, I'll show you a refined analysis of the three-step greedy algorithm, with conditions under which it performs much better than 50%. In a separate sequence of videos, we'll develop a dynamic-programming-based heuristic which is guaranteed to have much better performance on all instances.
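Before moving on to the refined analysis, here is a quick, self-contained sanity check of the bad instance above. The brute-force enumeration is only an illustration (it is exponential in the number of items), and the variable names are my own.

```python
# Verify the tight example: capacity 1000, items (value, size) as on the slide.
from itertools import combinations

capacity = 1000
items = [(502, 501), (500, 500), (500, 500)]

# Exhaustively search all subsets for the best feasible total value.
best = 0
for r in range(len(items) + 1):
    for subset in combinations(items, r):
        if sum(s for _, s in subset) <= capacity:
            best = max(best, sum(v for v, _ in subset))

print(best)  # 1000: items two and three together fill the knapsack exactly
# The three-step greedy heuristic outputs 502, roughly half of this optimum.
```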
There are a few different assumptions you can impose on knapsack instances that will enable you to prove better performance guarantees for greedy heuristics. Let me show you just one, which is both simple and a condition met by many knapsack instances that arise in practice. Let's focus on knapsack instances where no item is especially big compared to the knapsack capacity. For concreteness, let's say that the size of every item is at most 10% of the knapsack capacity, capital W. As you'll see, there's nothing special about the number 10%; I'm just using it to keep the discussion concrete.

I claim that, without changing our algorithm at all, using exactly the same three-step greedy algorithm, and here we can even get away with the original two-step greedy algorithm, we can prove a performance guarantee much better than 50%, as long as the instance satisfies this property.

Why is this assumption useful? Well, let's think about step 2 of our greedy algorithm. We've sorted the items in non-increasing order of value per size, and we're packing the items one at a time. At some point we get stuck: there's some item k+1 which, if we tried to put it in the knapsack, would overflow the capacity. By assumption, this (k+1)th item has size at most 10% of the original knapsack capacity. So if adding item k+1 would overflow the knapsack, that means less than 10% of the knapsack capacity is currently available; that is, the knapsack is already more than 90% full with the items we've packed so far.

So this is sounding pretty good. Our greedy criterion ensures that whatever fraction of the knapsack we wind up using, we're using it in the most valuable possible way, and we've just observed that this assumption on the instance implies that we use almost all of the knapsack.

To make this intuition precise, we compare the output of our greedy algorithm, even just step 2, to our favorite hypothetical benchmark, namely the greedy fractional solution. What's the difference between these two solutions? They're almost the same; the only difference is that the greedy fractional solution also gets to pack that last bit of the knapsack, which we know is at most 10% of its capacity, with a suitable fraction of item k+1. So the value we're missing in our solution is at most that final 10% of the greedy fractional solution. Now, in the greedy fractional solution the items are packed in non-increasing order of bang per buck, that is, value per size. So this last 10% of the greedy fractional solution is also the least important part: it makes the least valuable use of capacity. Therefore, this final at-most-10% of the knapsack accounts for at most 10% of the greedy fractional solution's overall value. Whatever our greedy algorithm is missing, then, is at most 10% of the overall value of the greedy fractional solution; that is, we're getting at least 90% of the value of the greedy fractional solution. And we know from our thought experiment that the greedy fractional solution is even better than optimal, that it is at least as good as any feasible knapsack solution. Therefore the output of our greedy algorithm, even without the third step, is at least 90% of the value of an optimal solution.

All of this reasoning holds equally well with a number different from 10%. For example, if you only knew that every size was at most 20% of the knapsack capacity, you'd get that the greedy solution is at least 80% of optimal. On the other hand, if you knew that every item had size at most 1% of the knapsack capacity, then even the two-step greedy algorithm gets you within 99% of optimal.
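To summarize this refined analysis in symbols, suppose every item's size is at most an alpha fraction of the capacity W. The symbol alpha is my notation, not the lecture's, which only uses the concrete values 10%, 20%, and 1%. The argument above then gives the following chain of inequalities.

```latex
% \alpha is the largest item size divided by W (e.g. \alpha = 0.1 above).
% When step 2 gets stuck on item k+1, the leftover capacity is less than
% w_{k+1} \le \alpha W, so the lowest-density portion of the greedy
% fractional solution occupies at most an \alpha fraction of the knapsack.
\begin{align*}
  \text{value of the (two-step) greedy solution}
    &\ge (1 - \alpha) \cdot \bigl(\text{value of the greedy fractional solution}\bigr) \\
    &\ge (1 - \alpha) \cdot \bigl(\text{value of an optimal knapsack solution}\bigr).
\end{align*}
```

Plugging in alpha equal to 0.1, 0.2, or 0.01 recovers the 90%, 80%, and 99% guarantees mentioned above.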