In this video I'll explain the mathematical analysis of the randomized linear-time selection algorithm that we studied in the previous video. Specifically, I'm going to prove to you the following guarantee for that algorithm: for every single input array of length n, the running time of this randomized selection algorithm will, on average, be linear. Pretty amazing if you think about it, because that's barely more than the time it takes just to read the input. And in particular, this linear-time algorithm is even faster than sorting. So this shows that selection is a fundamentally easier problem than sorting. You don't need to reduce to sorting; you can solve it directly in O(n) time.

I want to reiterate the same points I made about quicksort. The guarantee is the same: it's a general-purpose subroutine, and we make no assumptions about the data. This theorem holds no matter what the input array is. The expectation, the average in the theorem statement, is only over the coin flips made by the algorithm inside its code, randomness of our own devising.

Before we plunge into the analysis, let me just make sure you remember what the algorithm is. It's like quicksort: we partition around a pivot, except we only recurse once, not twice. So we're given an array of some length n, and we're looking for the ith order statistic, the ith smallest element. The base case is obvious. If you're not in the base case, you choose a pivot p uniformly at random from the input array, just like we did in quicksort, and you partition around the pivot, again just like in quicksort. That splits the array into a first part of elements less than the pivot and a second part of elements bigger than the pivot.

Now we have a couple of cases. In the case which is very unlikely, so we don't really worry about it, we're lucky enough to have guessed the pivot as the ith order statistic we're looking for. That's when the new position j of the pivot element happens to equal i, what we're looking for. Then, of course, we just return it; that's exactly what we wanted. In the general case, the pivot winds up in a position j which is either bigger than the i we're looking for, that's when the pivot is too big, or less than i, that's when the pivot is too small. So if the pivot's too big, if j is bigger than i, then what we're looking for is on the left-hand side, among the elements less than the pivot.
So that's where we recurse. We've thrown out both the pivot and everything to the right of it. That leaves us with an array of j minus 1 elements, and we're still looking for the ith smallest among these j minus 1 smallest elements. And then the final case, this is what we went through in the quiz in the last video, is if we choose a pivot that's smaller than what we're looking for, that's when j is less than i. Then it means we're safe to throw out the pivot and everything less than it, so we recurse on the second part, the elements bigger than the pivot. Having thrown out the j smallest elements of the input array, we recurse on an array of length n minus j, and we're now looking for the (i minus j)th smallest element among those that remain.

So that's randomized selection. Let's discuss why it's linear time on average. The first thought that you might have, and this would be a good thought, is that we should proceed exactly the same way we did in quicksort. You recall that when we analyzed quicksort, we set up indicator random variables X_ij determining whether or not a given pair of elements got compared at any point in the algorithm. Then we realized the total number of comparisons is just the sum over all of these X_ij's, we applied linearity of expectation, and it boiled down to figuring out the probability that a given pair of elements gets compared. You can analyze this randomized selection algorithm in exactly the same way, and it does give you a linear-time bound on average. But it's a little messy; it winds up being not quite as clean as the quicksort analysis. Moreover, because of the special structure of the selection problem, we can proceed in an even more slick way here than we did with quicksort. So again we'll have some constituent random variables, and again we'll apply linearity of expectation, but the definition of those random variables is going to be a little bit different than it was in quicksort.

So first, a preliminary observation, which is that the workhorse for this randomized selection procedure is exactly the same as it was in quicksort: namely, the partition subroutine. Essentially all of the work that gets done outside of the recursive call just partitions the input array around some pivot element, and as we discussed in detail in a separate video, that takes linear time. Now, usually when we say something's linear time, we just use big-O notation.
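Since essentially all the work happens in partition, here is a minimal Python sketch of the whole algorithm as just recapped. The names rselect and partition and the 1-indexed order-statistic convention are my own illustrative choices, not the course's reference code.

```python
import random

def partition(a, l, r):
    # Partition a[l..r] in place around the pivot a[l];
    # return the pivot's final index.
    p = a[l]
    i = l + 1
    for j in range(l + 1, r + 1):
        if a[j] < p:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[l], a[i - 1] = a[i - 1], a[l]
    return i - 1

def rselect(a, i):
    # Return the ith order statistic (1-indexed) of the list a.
    a = list(a)  # work on a copy

    def go(l, r, i):
        if l == r:
            return a[l]
        # Choose a pivot uniformly at random, move it to the front, partition.
        k = random.randint(l, r)
        a[l], a[k] = a[k], a[l]
        j = partition(a, l, r) - l + 1  # pivot's rank within a[l..r]
        if j == i:
            return a[l + j - 1]         # lucky case: pivot is the answer
        elif j > i:
            return go(l, l + j - 2, i)      # recurse on the first part
        else:
            return go(l + j, r, i - j)      # recurse on the second part

    return go(0, len(a) - 1, i)
```

For example, rselect([5, 2, 9, 1, 7], 3) returns 5, the third-smallest element. Note the single recursive call per level, in contrast to quicksort's two.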
I'm gonna go ahead and explicitly use a constant c here for the per-element work done outside the recursive call. That'll make it clear that I'm not hiding anything up my sleeve when we do the rest of the analysis. Now, what I wanna do on this slide is introduce some vocabulary, some notation, which will allow us to cleanly track the progress of this recursive selection algorithm. And by progress I mean the length of the array on which it's currently operating. Remember, we're hoping for a big win over quicksort, 'cuz here we only make one recursive call, not two; we don't have to recurse on both sides of the pivot, just on one of them. So it stands to reason that we can think of the algorithm as making more and more progress as successive recursive calls operate on arrays of smaller and smaller length.

So the notion that will be important for this proof is that of a phase, which quantifies how much progress we've made so far, with higher-numbered phases corresponding to more and more progress. We'll say that the RSelect algorithm, at some point in the middle of its execution, is in phase j if the array that the current recursive call is working on has length between (3/4)^j times n and the smaller number (3/4)^(j+1) times n. For example, think about the case where j equals zero. That says phase-zero recursive calls operate on arrays with size between 75 percent of n and n itself. So certainly the outermost recursive call is going to be in phase zero, because the input array has size n. And then, depending on the choice of the pivot, you may or may not get out of phase zero in the next recursive call. If you choose a good pivot, and you wind up recursing on something that has at most 75 percent of the original elements, you will no longer be in phase zero. If you recurse on something that has more than 75 percent of the input array you started with, then you're still gonna be in phase zero even in the second recursive call. So overall, the phase number j quantifies the number of times we've made 75 percent progress relative to the original input array.

And the other piece of notation that's going to be important is what I'm going to call X_j. For a phase j, X_j simply counts the number of recursive calls in which the randomized selection algorithm is in phase j. So this is gonna be some integer. It could be as small as zero for some of the phases, if you think about it, or it could be larger. So why am I doing this? Why am I making these definitions of phases and of these X_j's?
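Before answering, it may help to see the phase definition computed out: the phase of a recursive call is determined entirely by its current array length. Here's a small illustrative Python check; the function name phase and the example values are mine (floating point is fine for illustration).

```python
def phase(m, n):
    # Phase j of a recursive call on m elements, out of n originally:
    # the unique j with (3/4)**(j+1) * n < m <= (3/4)**j * n.
    j = 0
    while m <= (3/4) ** (j + 1) * n:
        j += 1
    return j

# With n = 100: lengths 100 and 80 are phase 0, length 70 is phase 1,
# and length 10 is phase 8 (since (3/4)**8 * 100 ≈ 10.01 ≥ 10 > (3/4)**9 * 100 ≈ 7.51).
print(phase(100, 100), phase(80, 100), phase(70, 100), phase(10, 100))  # 0 0 1 8
```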
What's the point? Remember, the point was that we wanna be able to cleanly talk about the progress that the randomized selection algorithm makes through its recursion. And what I wanna show you now is that in terms of these X_j's, counting the number of recursive calls in each phase, we can derive a relatively simple upper bound on the number of operations that our algorithm requires. Specifically, the running time of our algorithm can be bounded above by the running time in a given phase, summed over all of the possible phases. So we're gonna start with a big sum over all the phases j. We look at the number of recursive calls that we have to endure in phase j; that's X_j, by definition. And then we look at the work that we do outside of the recursive call in each recursive call during phase j. Now, in a given recursive call, outside of its own recursive call, we do c times m operations, where m is the length of its input array, and during phase j we have an upper bound on the length of the input array: by definition, it's at most (3/4)^j times n. That is, we multiply by this constant c that we inherit from the partition subroutine, and then for the input length we can put in the upper bound of (3/4)^j times n.

So just to review where all of these terms come from: (3/4)^j times n is an upper bound on the array size during phase j, by the definition of a phase. Then, if we multiply that by c, that's the amount of work that we do on each phase-j subproblem. How much work do we do in phase j overall? We just take the work per subproblem, that's what's circled in yellow, and we multiply it by the number of such subproblems we have. And of course, we don't wanna forget any of our subproblems, so we make sure we sum over all the phases j, to ensure that we count the work done in each of the subproblems. Okay? So that's the upshot of this slide: we can upper bound the running time of our randomized algorithm very simply in terms of the phases and the X_j's, the number of subproblems that we have to endure during phase j. This upper bound on our running time is important enough to give it notation; we'll call it star. It will be the starting point of our final derivation when we complete the proof of this theorem.
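Written out, the bound we just derived, what we're calling star, reads:

$$
\text{running time of RSelect} \;\le\; \sum_{j \ge 0} X_j \cdot c \cdot \left(\tfrac{3}{4}\right)^{j} n. \qquad (*)
$$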
Now, don't forget, we're analyzing a randomized algorithm, so the left-hand side of this inequality, the running time of RSelect, is a random variable. It's a different number depending on the outcome of the algorithm's random coin flips: depending on the random pivots it has chosen, you will get different running times. Similarly, the right-hand side of this inequality is also a random variable. That's because the X_j's are random variables: the number of subproblems in phase j depends on which pivots get chosen. So to analyze this, what we care about is the expectations of these quantities, their average values. We're gonna start modestly, and as usual we'll extend our modest accomplishments to much more impressive ones using linearity of expectation. But our first modest goal is just to understand the average value of an X_j, the expected value of X_j. We're gonna do that in two steps. On the next slide, I'm going to argue that to analyze the expectation of X_j, it's sufficient to understand the expectation of a very simple coin-flipping experiment. Then we'll analyze that coin-flipping experiment. Then we'll have the dominoes all set up in a row, and on the final slide we'll knock 'em down and finish the proof.

So let's try to understand the average number of recursive calls we expect to see in a given phase. Again, just so we don't forget, X_j is defined as the number of recursive calls during phase j, where a recursive call is in phase j if and only if the current subarray length lies between (3/4)^(j+1) times n and the larger number (3/4)^j times n. So again, for example, phase zero is just the recursive calls in which the array length is between 75 percent and 100 percent of the original number of elements. So what I wanna do next is point out that a very simple sufficient condition guarantees that we'll proceed from a given phase on to the next phase. So it's a condition guaranteeing termination of the current phase, and it's an event that we've discussed in previous videos: namely, that the pivot that we choose gives a reasonably balanced split, 25-75 or better. So recall how partitioning works: we choose a pivot p, it winds up wherever it winds up, the stuff to the left of it is less than p, and the stuff to the right of it is bigger than p.
So by a 25-75 split or better, what I mean is that each of these parts, the first part and the second part, has at most 75 percent of the elements in the input array; equivalently, both have at least 25 percent. And the key point is that if we wind up choosing a pivot that gives us a split at least that good, the current phase must end. Why must the current phase end? Well, if we get a 25-75 split or better, then no matter which case of the algorithm we wind up in, we're guaranteed to recurse on a subproblem that has at most 75 percent of what we started with. That guarantees that whatever phase we're in now, we're going to be in a strictly bigger phase when we recurse. Now, I want you to remember something that we talked about before, which is that you've got a decent chance, when you pick a random pivot, of getting something that gives you a 25-75 split or better. In fact, the probability is 50 percent. Right? If you have an array that has the integers from 1 to 100 inclusive, anything from 26 to 75 will do the trick. That'll ensure that at least the smallest 25 elements are excluded from the right recursive call, and at least the largest 25 elements are excluded from the left recursive call.

So this is why we can reduce our analysis of the number of recursive calls during a given phase to a simple experiment involving flipping coins. Specifically, the expected number of recursive calls that we're gonna see in a given phase j is no more than the expected number of coin flips in the following experiment. Okay, so you've got a fair coin, 50 percent heads, 50 percent tails. You commit to flipping it until you see a head, and the question is: how many coin flips does it take, up to and including the first head that you see? So the minimum is one coin flip, if you hit a head the first time. If you get a tail and then a head, then it's two. If it's tail, tail, head, it's three, and so on, and you always stop when you hit that first head. So what's the correspondence? Well, think of heads as being: you're in phase j, and you get a good pivot that gives you a 25-75 split. Call that heads. It guarantees that you exit this phase j, just like it guarantees that you get to terminate the coin-flipping experiment. Now, if you get a pivot which doesn't give you a 25-75 split, you may or may not pass to a higher phase, but in the worst case, you don't.
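Before finishing the correspondence, the 50 percent figure is easy to sanity-check empirically. Here's a quick hypothetical simulation; the function name and parameters are mine.

```python
import random

def good_split_fraction(n=100, trials=100_000):
    # A uniformly random pivot gives a 25-75 split or better exactly when
    # its rank lies in the middle half of the array (ranks 26..75 for n = 100):
    # then each side gets at least n/4 and at most 3n/4 of the elements.
    good = 0
    for _ in range(trials):
        rank = random.randint(1, n)  # rank of the randomly chosen pivot
        if n / 4 < rank <= 3 * n / 4:
            good += 1
    return good / trials

print(good_split_fraction())  # ≈ 0.5
```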
You stay stuck in phase j if you get a bad split, and that's like getting a tail in the coin-flipping experiment: you have to try again. This correspondence gives us a very elementary way to think about the progress that our randomized selection algorithm is making. So there's one recursive call in every step of our algorithm, and each time we either choose a good pivot or a bad pivot, and both happen with 50-50 probability. A good pivot means we get a 25-75 split or better. A bad pivot means, by definition, we get a split worse than 25-75. So what have we accomplished? We've reduced the task of upper-bounding the expected number of recursive calls in a phase j to understanding the expected number of times you have to flip a fair coin before you get one head. So on the next slide we'll give the classical and precise answer to this coin-flipping experiment.

So let me use capital N to denote the random variable which we were just talking about, the number of coin flips you need until you see the first head. And it's not very important, but you should know that these random variables have their own name: this would be a geometric random variable with parameter one-half. Now, you can use a few different methods to compute the expected value of a geometric random variable such as this, and brute force using the definition of expectation works fine, as long as you know how to manipulate infinite sums. But for the sake of variety, let me give you a very sneaky proof of what its expectation is. The sneaky approach is to write the expected value of this random variable in terms of itself and then solve for the unknown, solve for the expectation. So let's think about it: how many coin flips do you need? Well, for sure you're gonna need one; that's the best-case scenario. And now two things can happen. Either you get heads, and that has 50 percent probability, and you stop; or you get tails, and that happens with 50 percent probability, and now you start all over again: again you just flip coins until you get the first head. On average, how many flips does that take? Well, by the definition of capital N, you expect a further E[N] coin flips in the case where you get tails and you have to start all over.
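In symbols, that reasoning gives the equation:

$$
E[N] \;=\; 1 \;+\; \tfrac{1}{2} \cdot E[N].
$$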
So the 1 represents the first coin flip; the one-half is the probability that you can't stop, that you have to start all over, the probability of tails; and then, because it's a memoryless process, because when you start anew on the second coin flip having gotten a tail, it's as if you're back at time one all over again, you pay the expectation of N once more. So now we have a trivial equation in terms of the unknown expected value of N, and the unique solution, the unique value that the expected value of capital N could have in light of this equation, is two. So, on average, if you flip a fair coin and stop when you get heads, you're going to see two coin flips.

To make sure you haven't sort of lost the forest for the trees, let me remind you why we were talking about this coin-flipping analysis in the first place. So recall that on the previous slide we argued that X_j, and remember X_j is the number of recursive calls you expect to see in a given phase j, is bounded above by the expected number of coin flips until the first head. So this exact calculation of two for the coin flips gives us an upper bound of two for the number of recursive calls, on average, in any given phase j.

So now that we've got all our ducks lined up in a row, let's wrap up the proof on this final slide. So, inherited from part one of the proof, we have an upper bound on the expected running time of the RSelect algorithm. This is what we were calling star on the earlier slide. In star it looked a little messy, but we had the sum over the phases j, and we had two things that were independent of j, the constant c and the original input length n, so let me just yank the c and the n out front. And then we have this residual sum over the phases j of (3/4)^j, remember that comes from our upper bound on the subproblem size during phase j; and then of course we have to keep track of how many phase-j subproblems we solved, and that, by definition, is X_j. Now, star was written in terms of the random variables X_j. So now we're gonna go ahead and take expectations, and again, I've said this over and over, but don't forget where the expectation comes from: it's over the random pivot choices that our code makes. So the expected running time of the algorithm is at most the expectation of this star quantity.
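That is:

$$
E[\text{running time of RSelect}] \;\le\; E\!\left[\, c\,n \sum_{j \ge 0} \left(\tfrac{3}{4}\right)^{j} X_j \right].
$$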
So, like I said earlier, pretty much every time we do any analysis of a random process, we're gonna wind up using linearity of expectation at some point, and here is where we do it. Linearity of expectation says the expectation of a sum is just the sum of the expectations. So we yank the c and the n outside of the expectation, we yank the sum over phases outside of the expectation, we yank the (3/4)^j outside of the expectation, and then we're just left with the expected value of X_j, the average number of recursive calls we expect to see in phase j. Now, on the previous two slides, we figured out an upper bound on how many recursive calls we expect to see in each phase. So first, by the reduction to the coin-flipping analysis, this is at most the expected number of coin flips N, which, on the previous slide, we argued is exactly two. So bring that two out in front of the sum, since it no longer depends on j, and we get at most 2cn times the sum over the phases j of (3/4)^j.

Now, this kind of sum we have seen previously in the course. It came up when we were analyzing the master method, when we summed our running-time upper bounds over the levels of the recursion tree. And if we weren't in case one, if we were in case two or three, we had geometric sums that were nontrivial; they require a certain formula to evaluate. So let me remind you of that formula here, with three-quarters being the number that's powered up to the j. This sum has value at most one over one minus the number that's getting powered; in this case that's three-quarters, one minus three-quarters is a quarter, take the reciprocal, and you get four. And the upshot is that the expected number of operations that this randomized selection algorithm uses to find the ith order statistic in a given input array is at most eight times c times n, where c is the hidden constant in the linear running time of partition. And so that completes the proof. The input array was arbitrary, and we showed that the expected running time, over the random choices of the algorithm, is linear in n. That is, only a constant factor larger than what is required to read the input. Pretty amazing.
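To collect the whole final derivation in one display, using the geometric sum formula $\sum_{j \ge 0} r^j = \frac{1}{1-r}$ for $0 < r < 1$:

$$
E[\text{running time of RSelect}]
\;\le\; c\,n \sum_{j \ge 0} \left(\tfrac{3}{4}\right)^{j} E[X_j]
\;\le\; 2\,c\,n \sum_{j \ge 0} \left(\tfrac{3}{4}\right)^{j}
\;\le\; 2\,c\,n \cdot \frac{1}{1 - \tfrac{3}{4}}
\;=\; 8\,c\,n.
$$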