So the time has arrived for us to finish the proof of the fact that this deterministic algorithm, based on the median-of-medians idea, does indeed run in linear time. We've really done all the difficult work: we've discussed the algorithmic ingenuity required to choose a pivot deterministically that's guaranteed to be pretty good. So remember the idea was: you take the input array, you logically break it into groups of five, and you sort each group. That's like the first round of a two-round knockout tournament. The winners of the first round are the middle elements of each group of five; that's the initial set of medians. And then in the second round we take the median of these n/5 first-round winners, and that's what we return as the pivot. And we proved the key lemma, the 30-70 lemma, which says that if you choose the pivot by this two-round knockout tournament, you're guaranteed to get a 30-70 split or better. So your recursive call in line six or seven of DSelect is guaranteed to be on an array that has at most 70 percent of the elements that you started with. In other words, you're guaranteed to prune at least 30 percent of the array before you recurse again. But what remains to understand is whether or not we've made a sensible trade-off. Have we kept the work required to compute this 30-70 split small enough that we get the desired linear running time? Or is the cost of finding a pretty good pivot outweighing the benefit of having guaranteed good splits? That's what we've gotta prove; that's the next subject. Here's the story so far. You'll recall that, as usual, we define T(n) to be the worst-case running time of an algorithm, in this case DSelect, on input arrays of length n. And we discussed, okay, there's the base case as usual; but in the general case, we discussed how, outside of the two recursive calls, the DSelect algorithm does only a linear number of operations. What does it have to do? It has to do the sorting, but each sort is on a group of constant size, size five, so it takes constant time per group. There's a linear number of groups, so step one takes linear time, the copying takes linear time, and the partitioning takes linear time. So there's some constant c, which is gonna be bigger than one, but it's gonna be a constant, such that outside of its recursive calls DSelect always does at most c times n operations. Now what's up with the recursive calls?
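To keep the structure of the algorithm in front of us, here is a minimal Python sketch of DSelect as just described. The function and variable names are mine rather than the lecture's pseudocode, and it copies subarrays for clarity instead of partitioning in place.

```python
def dselect(arr, i):
    """Return the i-th smallest element of arr (i is 1-indexed). Sketch only."""
    if len(arr) <= 5:                       # tiny arrays: just sort directly
        return sorted(arr)[i - 1]

    # First round of the knockout tournament: sort each group of (at most)
    # five elements and collect the middle element of each sorted group.
    medians = []
    for j in range(0, len(arr), 5):
        group = sorted(arr[j:j + 5])
        medians.append(group[len(group) // 2])

    # Second round (the line-3 recursive call): recursively select the median
    # of the roughly n/5 first-round winners; that's the pivot.
    pivot = dselect(medians, (len(medians) + 1) // 2)

    # Linear-time partitioning around the pivot, done here with copies.
    less = [x for x in arr if x < pivot]
    greater = [x for x in arr if x > pivot]
    num_equal = len(arr) - len(less) - len(greater)

    # The line-6/line-7 recursive call: exactly one further recursion, on a
    # subarray that the 30-70 lemma guarantees holds at most ~70% of arr.
    if i <= len(less):
        return dselect(less, i)
    elif i <= len(less) + num_equal:
        return pivot
    else:
        return dselect(greater, i - len(less) - num_equal)
```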
Well, remember there's two of them. First, there's one on line three that's just responsible for helping choose the pivot. This one we understand: it's always on twenty percent of the input array, namely the first-round winners, so we can very safely write T(n/5) for the work done, in the worst case, by that first recursive call. What we didn't understand until we proved the key lemma was what's up with the second recursive call, which happens on either line six or line seven. The size of the input array on which we recurse depends on the quality of the pivot, and it was only when we proved the key lemma that we had a guarantee on it. A 30-70 split or better: what does that mean? That means the largest subarray we could possibly recurse on has seven-tenths n elements. So what remains is to find the solution to this recurrence and hopefully prove that it is indeed big O of n. So I'm going to go ahead and rewrite the recurrence at the top of the slide. We're not really going to worry about the T(1) = 1. What we're interested in is the fact that the running time on an input of length n is at most c times n, where again c is some constant which is gonna have to be at least one, given all the work that we do outside of the recursive calls, plus the recursive call on line three on an array of size n/5, plus the second recursive call, which is on some array that has size, in the worst case, seven-tenths n. So that's cool. This is exactly how we handled the other deterministic divide-and-conquer algorithms that we studied in earlier videos: we just wrote down a recurrence and then we solved the recurrence. But now here's the catch. In all of the other recurrences that came up (for example, Merge Sort, Strassen's matrix multiplication algorithm, Karatsuba multiplication, you name it), we just plugged the parameters into the master method. And because of the power of the master method, boom! Out popped an answer; it just told us what the recurrence evaluated to. Now, the master method, as powerful as it is, did have an assumption, you might recall. The assumption was that every subproblem solved had the same size, and that assumption is violated by this linear-time selection algorithm. There are two recursive calls. One of 'em's on twenty percent of the original array. The other one is probably on much more than twenty percent of the original array; it could be as much as 70 percent of the original array.
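Written out (and ignoring floors and ceilings), the recurrence assembled so far is, for some constant c at least one covering all the work outside the recursive calls:

```latex
T(1) = 1, \qquad
T(n) \;\le\; \underbrace{c\,n}_{\text{sorting, copying, partitioning}}
\;+\; \underbrace{T\!\left(\tfrac{n}{5}\right)}_{\text{line 3: pivot choice}}
\;+\; \underbrace{T\!\left(\tfrac{7n}{10}\right)}_{\text{line 6 or 7, worst case}}
\qquad \text{for } n > 1.
```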
So because we have two recursive calls, on subproblems of different sizes, this does not fit into the situations that the master method covers. It's a very rare algorithm in that regard. Now, there are more general versions of the master method, of the master theorem, which can accommodate a wider class of recurrences, including this one here. Alternatively, we could push the recursion tree proof further so that we could get a solution for this recurrence; some of you might want to try that at home. But I want to highlight a different way you can solve recurrences, just for variety, just to give you yet another tool. Now the good news about the approach that I'm gonna show you is that it's very flexible: it can be used to solve sort of arbitrarily crazy recurrences, and it's certainly going to be powerful enough to evaluate this one. The bad news is that it's very ad hoc. It's not necessarily very easy to use; it's kind of a dark art figuring out how to apply it. It's often called just 'guess and check': you guess what the answer to the recurrence is, and then you verify it by induction. Here, because we have such a specific target in mind (the whole point of this exercise is to prove a linear time bound), I'm gonna call it just 'hope and check.' So we're gonna hope that it's linear time, and then we're gonna try to produce a proof that verifies the linear time bound using induction. Specifically, what are we hoping for? We're crossing our fingers that there's a constant, I'm going to call it a (a can be big, but it's got to be constant, and again remember constant means it does not depend on n in any way), such that our recurrence at the top of this slide, T(n), is bounded above by a times n for all n at least one. Why is this what we're hoping? Well, suppose this were true. By definition, T(n) is an upper bound on the running time of our algorithm. So if it's bounded above by a times n, then it's obviously O(n); it's obviously a linear-time algorithm, with the constant a getting suppressed in the big-O notation. So that's the hope; now let's check it. And again, check means just verify by induction on n. So the precise claim that I'm going to prove is the following. I'm gonna go ahead and choose the constant a. Remember, all we need is some constant a, no matter how big, as long as it's independent of n; that'll give us the big O of n running time. So for convenience I'm actually gonna tell you what a I'm gonna use: I'm gonna choose a to be 10c. Now what is c?
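Before answering that, here is the claim itself, written out:

```latex
\textbf{Claim:}\ \text{with } a := 10c, \text{ we have } T(n) \;\le\; a\,n \ \text{ for every integer } n \ge 1.
```

Since a is a constant independent of n, the claim immediately gives T(n) = O(n).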
c is just a constant that we inherit from the recurrence that we're given. Now remember what this recurrence means: it describes the running time of the DSelect algorithm, and the c times n represents the work that's outside of the recursive calls. So c is just a constant multiple on the amount of linear work that DSelect does for sorting the groups, for doing the partitioning, and for doing the copying. There's gonna be some modest constant there, and for the proof I'm just gonna multiply it by ten and use that as my a. And the claim is: if I define a in that way, then indeed it is true that for all n at least one, T(n) is bounded above by a times n. Now, I realize I just pulled this constant a out of nowhere, right? Why 10 times c? Well, if you recall our discussion when we proved that things were big O of something else, there again there was some constant. To formally prove that something is big O of something else, you have to say what the constant is, and in the proof you always wonder: how do you know what constant to use? So in practice, if you have to actually do one of these proofs yourself, you reverse engineer what kind of constant would work. You just go through the argument with a generic constant, and then you're like, oh well, if I set the constant to be this, I can complete the proof. We'll see that's exactly what's gonna happen in the proof of this claim. It'll be obvious: on the very last line you'll see why I chose a equal to 10c. So I just reverse engineered what I needed for the proof, but to keep the proof easy to follow line by line, I decided, in the interest of full disclosure, to just tell you the constant right at the beginning. Now, no prizes for guessing that the way this proof proceeds is by induction on n. Induction's the obvious thing to use: we're trying to prove an assertion for every single positive integer n, and moreover we're given this recurrence which relates solutions of smaller subproblems to those of bigger problems. So that sets things up for use of the inductive hypothesis. If you want a longer review of what proofs by induction are, I suggest that you go back and re-watch the optional video where we prove the correctness of QuickSort; that's a fairly formal discussion of the template for a proof by induction, and that's the one we're gonna apply here. So, the first of the two ingredients in any proof by induction is a usually trivial one, in the form of a base case.
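Concretely, for our claim the two ingredients look like this:

```latex
\text{Base case: } T(1) \le a \cdot 1.
\qquad
\text{Inductive step: for } n > 1,\ \bigl(T(k) \le a\,k \text{ for all } k < n\bigr) \;\Longrightarrow\; T(n) \le a\,n.
```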
The base case is also gonna be trivial here. In the base case, you just directly establish the assertion when n equals one. So we're trying to prove that T(n) is at most a times n for every n; when n equals one, we can just substitute. So what we're trying to prove is that T(1) is at most a times one, also known as a. And we're given that T(1) equals one; right, that's the base case of the recurrence that we're given, so that's what we're using here. What we want to be true is that this is at most a times one, and it is: the constant c is assumed to be at least one, and we multiply c by ten to get a, so a is definitely at least ten. So the right-hand side here is unquestionably bigger than the left-hand side; a, in fact, is at least ten, let alone at least one. So the interesting ingredient is, generally, the inductive step. Remember what you do here: you assume you've already proven the assertion, in this case that T(n) is at most a times n, for all smaller integers, and now you merely have to prove it again for the current integer. So we're now interested in the case where n is bigger than one, and the assumption that we've already proven it for everything smaller is called the inductive hypothesis. So what does it mean that we've already proved it for all smaller numbers? It means we can use, in the proof of our inductive step, the fact that T(k) is at most a times k for all k strictly less than n. All we've gotta do is enlarge the range of n's for which this holds by one more, to the current value n. And all we have to do is follow our nose. Pretty much, we have to start on the left-hand side with T(n), and we have to wind up on the right-hand side with a times n, and at pretty much every step of the proof there's just gonna be one conceivable thing we could do. So we just follow our nose. We start with what we wanna upper bound, T(n). Well, what have we got going for us? The only thing we can do at this point is invoke the recurrence that we were given up here, which gives us an upper bound on T(n) in terms of the T values of smaller integers. So we are given that T(n) is at most c times n, plus T(n/5), plus T(7n/10). Of course we're ignoring fractions; you would round up or round down if you wanted to be precise, and the auxiliary lecture notes are more precise, if you want to see what the gory details look like. But it's really just exactly the same argument.
One just has to 156 00:11:00,474 --> 00:11:04,606 be a little bit more anal about it. So now that we've invoked the recurrence, what 157 00:11:04,606 --> 00:11:08,892 can we possibly do, right? We can't really do any direct manipulation on any of these 158 00:11:08,892 --> 00:11:13,076 three terms. But fortunately, we have this inductive hypothesis. That applies to any 159 00:11:13,076 --> 00:11:17,566 value, any integer which is less than N. So we have her, N/5, that's certainly less 160 00:11:17,566 --> 00:11:21,239 than N. We have 70 percent of N. That's certainly less than N. So we can apply the 161 00:11:21,239 --> 00:11:25,372 inductive hypothesis twice. We already know that these T values are bounded above 162 00:11:25,372 --> 00:11:30,997 by A times their arguments. So T of N over 5's at most A, times N over five. T of 163 00:11:30,997 --> 00:11:36,590 seven-tenths N is at most A, times seven-tenths N. Now we can group terms 164 00:11:36,590 --> 00:11:41,847 together, not we're comparing apples to apples. So we have N, times quantity C, 165 00:11:41,847 --> 00:11:47,657 plus A/5, plus seven-tenths A. Let me just go ahead and group the two A turns 166 00:11:47,657 --> 00:11:52,793 together. And that's gonna be nine-tenths A. No, don't forget where we're going, 167 00:11:52,793 --> 00:11:58,036 what the end goal is. We want a upper bound T of N by AN. So we wanna write that 168 00:11:58,036 --> 00:12:02,882 this is bounded above by A times N. And now you see exactly how I reverse 169 00:12:02,882 --> 00:12:08,191 engineered our choice of A, as a function of the given constant C. Since A is ten 170 00:12:08,191 --> 00:12:13,103 times as big as C, if I take 90 percent of A and add C, I just get A back. So by our 171 00:12:13,103 --> 00:12:20,509 choice of A. This equals AN. And that pretty much wraps things up. So don't 172 00:12:20,509 --> 00:12:25,386 forget what all this stuff stands for. So what did we just prove? What did we just 173 00:12:25,386 --> 00:12:30,082 prove by induction? We proved T of N is, at most, a constant times N for every N. 174 00:12:30,082 --> 00:12:34,778 That is, T of N is Big O of N. What was T of N? That was the running time of our 175 00:12:34,778 --> 00:12:39,294 algorithm. That's all we cared about. So because T of N is Big O of N, indeed, 176 00:12:39,294 --> 00:12:41,040 deselect runs in O of N time.