In this video I'll explain the mathematical analysis of the randomized linear-time selection algorithm that we studied in the previous video. Specifically, I'm going to prove to you the following guarantee for that algorithm: for every single input array of length n, the running time of this randomized selection algorithm will, on average, be linear. Pretty amazing if you think about it, because that's barely more than the time it takes just to read the input. And in particular, this linear-time algorithm is even faster than sorting. So this shows that selection is a fundamentally easier problem than sorting. You don't need to reduce to sorting; you can solve it directly in O(n) time.

I want to reiterate the same points I made about quicksort. The guarantee is the same: it's a general-purpose subroutine, and we make no assumptions about the data. This theorem holds no matter what the input array is. The expectation, the average in the theorem statement, is only over the coin flips made by the algorithm inside its code, randomness of our own devising.

Before we plunge into the analysis, let me just make sure you remember what the algorithm is. It's like quicksort: we partition around a pivot, except we only recurse once, not twice. So we're given an array of some length n, and we're looking for the ith order statistic, the ith smallest element. The base case is obvious. If you're not in the base case, you choose a pivot p uniformly at random from the input array, just like we did in quicksort, and you partition around the pivot, again just like in quicksort. That splits the array into a first part of elements less than the pivot and a second part of elements bigger than the pivot.

Now we have a couple of cases. In the case which is very unlikely, so we don't really worry about it, we're lucky enough to have guessed the pivot as the ith order statistic we're looking for. That's when the new position j of the pivot element happens to equal i, what we're looking for. Then, of course, we just return it; that's exactly what we wanted. In the general case, the pivot winds up in a position j which is either bigger than the i we're looking for, that's when the pivot is too big, or less than i, that's when the pivot is too small. So if the pivot's too big, if j is bigger than i, then what we're looking for is on the left-hand side, among the elements less than the pivot.
So that's where we recurse. We've thrown out both the pivot and everything to the right of it. That leaves us with an array of j minus 1 elements, and we're still looking for the ith smallest among these j minus 1 smallest elements. And then the final case, this is what we went through in the quiz in the last video, is if we choose a pivot that's smaller than what we're looking for, that's when j is less than i. Then it means we're safe to throw out the pivot and everything less than it, so we recurse on the second part, the elements bigger than the pivot. Having thrown out the j smallest elements of the input array, we recurse on an array of length n minus j, and we're now looking for the (i minus j)th smallest element among those that remain.

So that's randomized selection. Let's discuss why it's linear time on average. The first thought that you might have, and this would be a good thought, is that we should proceed exactly the same way we did in quicksort. You recall that when we analyzed quicksort, we set up indicator random variables X_ij determining whether or not a given pair of elements got compared at any point in the algorithm. Then we realized the total number of comparisons is just the sum over all of these X_ij's, we applied linearity of expectation, and it boiled down to figuring out the probability that a given pair of elements gets compared. You can analyze this randomized selection algorithm in exactly the same way, and it does give you a linear-time bound on average. But it's a little messy; it winds up being not quite as clean as the quicksort analysis. Moreover, because of the special structure of the selection problem, we can proceed in an even more slick way here than we did with quicksort. So again we'll have some constituent random variables, and again we'll apply linearity of expectation, but the definition of those random variables is going to be a little bit different than it was in quicksort.

So first, a preliminary observation, which is that the workhorse for this randomized selection procedure is exactly the same as it was in quicksort: namely, the partition subroutine. Essentially all of the work that gets done outside of the recursive call just partitions the input array around some pivot element, and as we discussed in detail in a separate video, that takes linear time. Now, usually when we say something's linear time, we just use big-O notation.
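Since essentially all the work happens in partition, here is a minimal Python sketch of the whole algorithm as just recapped. The names rselect and partition and the 1-indexed order-statistic convention are my own illustrative choices, not the course's reference code.

```python
import random

def partition(a, l, r):
    # Partition a[l..r] in place around the pivot a[l];
    # return the pivot's final index.
    p = a[l]
    i = l + 1
    for j in range(l + 1, r + 1):
        if a[j] < p:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[l], a[i - 1] = a[i - 1], a[l]
    return i - 1

def rselect(a, i):
    # Return the ith order statistic (1-indexed) of the list a.
    a = list(a)  # work on a copy

    def go(l, r, i):
        if l == r:
            return a[l]
        # Choose a pivot uniformly at random, move it to the front, partition.
        k = random.randint(l, r)
        a[l], a[k] = a[k], a[l]
        j = partition(a, l, r) - l + 1  # pivot's rank within a[l..r]
        if j == i:
            return a[l + j - 1]         # lucky case: pivot is the answer
        elif j > i:
            return go(l, l + j - 2, i)      # recurse on the first part
        else:
            return go(l + j, r, i - j)      # recurse on the second part

    return go(0, len(a) - 1, i)
```

For example, rselect([5, 2, 9, 1, 7], 3) returns 5, the third-smallest element. Note the single recursive call per level, in contrast to quicksort's two.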
I'm gonna go ahead and explicitly use a constant c here for the per-element work done outside the recursive call. That'll make it clear that I'm not hiding anything up my sleeve when we do the rest of the analysis. Now, what I wanna do on this slide is introduce some vocabulary, some notation, which will allow us to cleanly track the progress of this recursive selection algorithm. And by progress I mean the length of the array on which it's currently operating. Remember, we're hoping for a big win over quicksort, 'cuz here we only make one recursive call, not two; we don't have to recurse on both sides of the pivot, just on one of them. So it stands to reason that we can think of the algorithm as making more and more progress as successive recursive calls operate on arrays of smaller and smaller length.

So the notion that will be important for this proof is that of a phase, which quantifies how much progress we've made so far, with higher-numbered phases corresponding to more and more progress. We'll say that the RSelect algorithm, at some point in the middle of its execution, is in phase j if the array that the current recursive call is working on has length between (3/4)^j times n and the smaller number (3/4)^(j+1) times n. For example, think about the case where j equals zero. That says phase-zero recursive calls operate on arrays with size between 75 percent of n and n itself. So certainly the outermost recursive call is going to be in phase zero, because the input array has size n. And then, depending on the choice of the pivot, you may or may not get out of phase zero in the next recursive call. If you choose a good pivot, and you wind up recursing on something that has at most 75 percent of the original elements, you will no longer be in phase zero. If you recurse on something that has more than 75 percent of the input array you started with, then you're still gonna be in phase zero even in the second recursive call. So overall, the phase number j quantifies the number of times we've made 75 percent progress relative to the original input array.

And the other piece of notation that's going to be important is what I'm going to call X_j. For a phase j, X_j simply counts the number of recursive calls in which the randomized selection algorithm is in phase j. So this is gonna be some integer. It could be as small as zero for some of the phases, if you think about it, or it could be larger. So why am I doing this? Why am I making these definitions of phases and of these X_j's?
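Before answering, it may help to see the phase definition computed out: the phase of a recursive call is determined entirely by its current array length. Here's a small illustrative Python check; the function name phase and the example values are mine (floating point is fine for illustration).

```python
def phase(m, n):
    # Phase j of a recursive call on m elements, out of n originally:
    # the unique j with (3/4)**(j+1) * n < m <= (3/4)**j * n.
    j = 0
    while m <= (3/4) ** (j + 1) * n:
        j += 1
    return j

# With n = 100: lengths 100 and 80 are phase 0, length 70 is phase 1,
# and length 10 is phase 8 (since (3/4)**8 * 100 ≈ 10.01 ≥ 10 > (3/4)**9 * 100 ≈ 7.51).
print(phase(100, 100), phase(80, 100), phase(70, 100), phase(10, 100))  # 0 0 1 8
```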
What's the point? Remember, the point was that we wanna be able to cleanly talk about the progress that the randomized selection algorithm makes through its recursion. And what I wanna show you now is that in terms of these X_j's, counting the number of recursive calls in each phase, we can derive a relatively simple upper bound on the number of operations that our algorithm requires. Specifically, the running time of our algorithm can be bounded above by the running time in a given phase, summed over all of the possible phases. So we're gonna start with a big sum over all the phases j. We look at the number of recursive calls that we have to endure in phase j; that's X_j, by definition. And then we look at the work that we do outside of the recursive call in each recursive call during phase j. Now, in a given recursive call, outside of its own recursive call, we do c times m operations, where m is the length of its input array, and during phase j we have an upper bound on the length of the input array: by definition, it's at most (3/4)^j times n. That is, we multiply by this constant c that we inherit from the partition subroutine, and then for the input length we can put in the upper bound of (3/4)^j times n.

So just to review where all of these terms come from: (3/4)^j times n is an upper bound on the array size during phase j, by the definition of a phase. Then, if we multiply that by c, that's the amount of work that we do on each phase-j subproblem. How much work do we do in phase j overall? We just take the work per subproblem, that's what's circled in yellow, and we multiply it by the number of such subproblems we have. And of course, we don't wanna forget any of our subproblems, so we make sure we sum over all the phases j, to ensure that we count the work done in each of the subproblems. Okay? So that's the upshot of this slide: we can upper bound the running time of our randomized algorithm very simply in terms of the phases and the X_j's, the number of subproblems that we have to endure during phase j. This upper bound on our running time is important enough to give it notation; we'll call it star. It will be the starting point of our final derivation when we complete the proof of this theorem.
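Written out, the bound we just derived, what we're calling star, reads:

$$
\text{running time of RSelect} \;\le\; \sum_{j \ge 0} X_j \cdot c \cdot \left(\tfrac{3}{4}\right)^{j} n. \qquad (*)
$$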
Now, don't forget, we're analyzing a randomized algorithm, so the left-hand side of this inequality, the running time of RSelect, is a random variable. It's a different number depending on the outcome of the algorithm's random coin flips: depending on the random pivots it has chosen, you will get different running times. Similarly, the right-hand side of this inequality is also a random variable. That's because the X_j's are random variables: the number of subproblems in phase j depends on which pivots get chosen. So to analyze this, what we care about is the expectations of these quantities, their average values. We're gonna start modestly, and as usual we'll extend our modest accomplishments to much more impressive ones using linearity of expectation. But our first modest goal is just to understand the average value of an X_j, the expected value of X_j. We're gonna do that in two steps. On the next slide, I'm going to argue that to analyze the expectation of X_j, it's sufficient to understand the expectation of a very simple coin-flipping experiment. Then we'll analyze that coin-flipping experiment. Then we'll have the dominoes all set up in a row, and on the final slide we'll knock 'em down and finish the proof.

So let's try to understand the average number of recursive calls we expect to see in a given phase. Again, just so we don't forget, X_j is defined as the number of recursive calls during phase j, where a recursive call is in phase j if and only if the current subarray length lies between (3/4)^(j+1) times n and the larger number (3/4)^j times n. So again, for example, phase zero is just the recursive calls in which the array length is between 75 percent and 100 percent of the original number of elements. So what I wanna do next is point out that a very simple sufficient condition guarantees that we'll proceed from a given phase on to the next phase. So it's a condition guaranteeing termination of the current phase, and it's an event that we've discussed in previous videos: namely, that the pivot that we choose gives a reasonably balanced split, 25-75 or better. So recall how partitioning works: we choose a pivot p, it winds up wherever it winds up, the stuff to the left of it is less than p, and the stuff to the right of it is bigger than p.
So by a 25-75 split or better, what I mean is that each of these parts, the first part and the second part, has at most 75 percent of the elements in the input array; equivalently, both have at least 25 percent. And the key point is that if we wind up choosing a pivot that gives us a split at least that good, the current phase must end. Why must the current phase end? Well, if we get a 25-75 split or better, then no matter which case of the algorithm we wind up in, we're guaranteed to recurse on a subproblem that has at most 75 percent of what we started with. That guarantees that whatever phase we're in now, we're going to be in a strictly bigger phase when we recurse. Now, I want you to remember something that we talked about before, which is that you've got a decent chance, when you pick a random pivot, of getting something that gives you a 25-75 split or better. In fact, the probability is 50 percent. Right? If you have an array that has the integers from 1 to 100 inclusive, anything from 26 to 75 will do the trick. That'll ensure that at least the smallest 25 elements are excluded from the right recursive call, and at least the largest 25 elements are excluded from the left recursive call.

So this is why we can reduce our analysis of the number of recursive calls during a given phase to a simple experiment involving flipping coins. Specifically, the expected number of recursive calls that we're gonna see in a given phase j is no more than the expected number of coin flips in the following experiment. Okay, so you've got a fair coin, 50 percent heads, 50 percent tails. You commit to flipping it until you see a head, and the question is: how many coin flips does it take, up to and including the first head that you see? So the minimum is one coin flip, if you hit a head the first time. If you get a tail and then a head, then it's two. If it's tail, tail, head, it's three, and so on, and you always stop when you hit that first head. So what's the correspondence? Well, think of heads as being: you're in phase j, and you get a good pivot that gives you a 25-75 split. Call that heads. It guarantees that you exit this phase j, just like it guarantees that you get to terminate the coin-flipping experiment. Now, if you get a pivot which doesn't give you a 25-75 split, you may or may not pass to a higher phase, but in the worst case, you don't.
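Before finishing the correspondence, the 50 percent figure is easy to sanity-check empirically. Here's a quick hypothetical simulation; the function name and parameters are mine.

```python
import random

def good_split_fraction(n=100, trials=100_000):
    # A uniformly random pivot gives a 25-75 split or better exactly when
    # its rank lies in the middle half of the array (ranks 26..75 for n = 100):
    # then each side gets at least n/4 and at most 3n/4 of the elements.
    good = 0
    for _ in range(trials):
        rank = random.randint(1, n)  # rank of the randomly chosen pivot
        if n / 4 < rank <= 3 * n / 4:
            good += 1
    return good / trials

print(good_split_fraction())  # ≈ 0.5
```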
You stay stuck in phase j if you get a bad split, and that's like getting a tail in the coin-flipping experiment: you have to try again. This correspondence gives us a very elementary way to think about the progress that our randomized selection algorithm is making. So there's one recursive call in every step of our algorithm, and each time we either choose a good pivot or a bad pivot, and both happen with 50-50 probability. A good pivot means we get a 25-75 split or better. A bad pivot means, by definition, we get a split worse than 25-75. So what have we accomplished? We've reduced the task of upper-bounding the expected number of recursive calls in a phase j to understanding the expected number of times you have to flip a fair coin before you get one head. So on the next slide we'll give the classical and precise answer to this coin-flipping experiment.

So let me use capital N to denote the random variable which we were just talking about, the number of coin flips you need until you see the first head. And it's not very important, but you should know that these random variables have their own name: this would be a geometric random variable with parameter one-half. Now, you can use a few different methods to compute the expected value of a geometric random variable such as this, and brute force using the definition of expectation works fine, as long as you know how to manipulate infinite sums. But for the sake of variety, let me give you a very sneaky proof of what its expectation is. The sneaky approach is to write the expected value of this random variable in terms of itself and then solve for the unknown, solve for the expectation. So let's think about it: how many coin flips do you need? Well, for sure you're gonna need one; that's the best-case scenario. And now two things can happen. Either you get heads, and that has 50 percent probability, and you stop; or you get tails, and that happens with 50 percent probability, and now you start all over again: again you just flip coins until you get the first head. On average, how many flips does that take? Well, by the definition of capital N, you expect a further E[N] coin flips in the case where you get tails and you have to start all over.
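In symbols, that reasoning gives the equation:

$$
E[N] \;=\; 1 \;+\; \tfrac{1}{2} \cdot E[N].
$$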
So the 1 represents the first coin flip; the one-half is the probability that you can't stop, that you have to start all over, the probability of tails; and then, because it's a memoryless process, because when you start anew on the second coin flip having gotten a tail, it's as if you're back at time one all over again, you pay the expectation of N once more. So now we have a trivial equation in terms of the unknown expected value of N, and the unique solution, the unique value that the expected value of capital N could have in light of this equation, is two. So, on average, if you flip a fair coin and stop when you get heads, you're going to see two coin flips.

To make sure you haven't sort of lost the forest for the trees, let me remind you why we were talking about this coin-flipping analysis in the first place. So recall that on the previous slide we argued that X_j, and remember X_j is the number of recursive calls you expect to see in a given phase j, is bounded above by the expected number of coin flips until the first head. So this exact calculation of two for the coin flips gives us an upper bound of two for the number of recursive calls, on average, in any given phase j.

So now that we've got all our ducks lined up in a row, let's wrap up the proof on this final slide. So, inherited from part one of the proof, we have an upper bound on the expected running time of the RSelect algorithm. This is what we were calling star on the earlier slide. In star it looked a little messy, but we had the sum over the phases j, and we had two things that were independent of j, the constant c and the original input length n, so let me just yank the c and the n out front. And then we have this residual sum over the phases j of (3/4)^j, remember that comes from our upper bound on the subproblem size during phase j; and then of course we have to keep track of how many phase-j subproblems we solved, and that, by definition, is X_j. Now, star was written in terms of the random variables X_j. So now we're gonna go ahead and take expectations, and again, I've said this over and over, but don't forget where the expectation comes from: it's over the random pivot choices that our code makes. So the expected running time of the algorithm is at most the expectation of this star quantity.
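That is:

$$
E[\text{running time of RSelect}] \;\le\; E\!\left[\, c\,n \sum_{j \ge 0} \left(\tfrac{3}{4}\right)^{j} X_j \right].
$$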
So, like I said earlier, pretty much every time we do any analysis of a random process, we're gonna wind up using linearity of expectation at some point, and here is where we do it. Linearity of expectation says the expectation of a sum is just the sum of the expectations. So we yank the c and the n outside of the expectation, we yank the sum over phases outside of the expectation, we yank the (3/4)^j outside of the expectation, and then we're just left with the expected value of X_j, the average number of recursive calls we expect to see in phase j. Now, on the previous two slides, we figured out an upper bound on how many recursive calls we expect to see in each phase. So first, by the reduction to the coin-flipping analysis, this is at most the expected number of coin flips N, which, on the previous slide, we argued is exactly two. So bring that two out in front of the sum, since it no longer depends on j, and we get at most 2cn times the sum over the phases j of (3/4)^j.

Now, this kind of sum we have seen previously in the course. It came up when we were analyzing the master method, when we summed our running-time upper bounds over the levels of the recursion tree. And if we weren't in case one, if we were in case two or three, we had geometric sums that were nontrivial; they require a certain formula to evaluate. So let me remind you of that formula here, with three-quarters being the number that's powered up to the j. This sum has value at most one over one minus the number that's getting powered; in this case that's three-quarters, one minus three-quarters is a quarter, take the reciprocal, and you get four. And the upshot is that the expected number of operations that this randomized selection algorithm uses to find the ith order statistic in a given input array is at most eight times c times n, where c is the hidden constant in the linear running time of partition. And so that completes the proof. The input array was arbitrary, and we showed that the expected running time, over the random choices of the algorithm, is linear in n. That is, only a constant factor larger than what is required to read the input. Pretty amazing.
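To collect the whole final derivation in one display, using the geometric sum formula $\sum_{j \ge 0} r^j = \frac{1}{1-r}$ for $0 < r < 1$:

$$
E[\text{running time of RSelect}]
\;\le\; c\,n \sum_{j \ge 0} \left(\tfrac{3}{4}\right)^{j} E[X_j]
\;\le\; 2\,c\,n \sum_{j \ge 0} \left(\tfrac{3}{4}\right)^{j}
\;\le\; 2\,c\,n \cdot \frac{1}{1 - \tfrac{3}{4}}
\;=\; 8\,c\,n.
$$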