1
00:00:00,000 --> 00:00:04,646
So we're almost at the finish line of our
analysis of quick sort. Let me remind you

2
00:00:04,646 --> 00:00:09,181
what we're proving. We're proving that for
the randomized implementation of quick

3
00:00:09,181 --> 00:00:13,604
sort where we always choose the pivot
element to partition around uniformly at

4
00:00:13,604 --> 00:00:17,523
random, we're showing that for every
array, every input of length N, the

5
00:00:17,523 --> 00:00:21,553
average running time of quick sort over
the random choices of pivots is

6
00:00:21,553 --> 00:00:26,032
[inaudible] of N log N. So we've done a
lot of work in the last couple of videos.

7
00:00:26,032 --> 00:00:33,757
Let me just remind you about the stories
so far. In the first video what we did is

8
00:00:33,757 --> 00:00:37,902
we identified the relevant random variable
that we cared about, capital C, the number

9
00:00:37,902 --> 00:00:41,998
of comparisons that Quicksort makes among
the pairs of elements in the input array.

10
00:00:41,998 --> 00:00:45,453
Then we applied the decomposition
approach. We expressed capital C, the

11
00:00:45,453 --> 00:00:49,400
overall number of comparisons, as a sum of
indicator or 0-1 random variables. For

12
00:00:49,400 --> 00:00:53,397
each of those variables XIJ, just counted
the number of comparisons involving the

13
00:00:53,397 --> 00:00:57,493
Ith smallest and Jth smallest entries in
the array, and that's gonna be either zero

14
00:00:57,493 --> 00:01:01,095
or one. Then we applied linearity of
expectation to realize, all we really

15
00:01:01,095 --> 00:01:05,043
needed to understand was the comparison
probabilities for different pairs of

16
00:01:05,043 --> 00:01:08,830
elements. [inaudible]. Second video we
nailed what that comparison probability

17
00:01:08,830 --> 00:01:12,759
is, specifically, for the I smallest and
the J smallest elements in the array, the

18
00:01:12,759 --> 00:01:16,786
probability that quick sort compares them
when you always make random [inaudible]

19
00:01:16,786 --> 00:01:21,982
choices is exactly. Two divided by the
quantity J minus I. Plus one. So putting

20
00:01:21,982 --> 00:01:26,692
that all together, yields the following
expression, governing the average number

21
00:01:26,692 --> 00:01:33,665
of comparisons made by quick sort. One
thing I want you to appreciate is, is in

22
00:01:33,665 --> 00:01:38,235
the last couple of videos, we've been sort
of amazingly exact as algorithmic analysis

23
00:01:38,235 --> 00:01:41,945
goes. Specifically we've done nothing
sloppy whatsoever. We've done no

24
00:01:41,945 --> 00:01:46,300
estimates. The number of comparisons that
quick store makes on average is exactly

25
00:01:46,300 --> 00:01:50,655
this double sum. Now surely we'll do some
inequalities to make our lives a little

26
00:01:50,655 --> 00:01:54,580
bit easier. But up to this point
everything has been completely exact. And

27
00:01:54,580 --> 00:01:58,828
this will actually see why there's small
constants in the, in the, in quick sort.

28
00:01:58,828 --> 00:02:05,136
It's basically going to be this factor
two. Now the next question to ask is, what

29
00:02:05,136 --> 00:02:08,928
are we shooting for? Remember the theorem
we want to prove is that the expected

30
00:02:08,928 --> 00:02:12,672
number of comparisons really the expected
run time is all of N log N, so we're

31
00:02:12,672 --> 00:02:16,368
already done. Well not quite we're gonna
have to be a little bit clever, so if

32
00:02:16,368 --> 00:02:20,304
we're looking at this double sum, and we
ask how big are the sum ends and how many

33
00:02:20,304 --> 00:02:24,288
terms are there? Well the biggest sum ends
we're ever going to see are when I and J

34
00:02:24,288 --> 00:02:28,080
are right next to each other when J is one
bigger than I, and in that case this

35
00:02:28,080 --> 00:02:31,824
fraction is gonna be one half. So the
terms can be as big as one half, how many

36
00:02:31,824 --> 00:02:36,664
terms are there? Well there's a quadratic
number of terms. So it would be very easy

37
00:02:36,664 --> 00:02:40,995
to derive an upper bound that's quadratic
in N, but that's not what we want. We want

38
00:02:40,995 --> 00:02:45,013
one that's N log N. So to drive that,
we're gonna have to be a little bit more

39
00:02:45,013 --> 00:02:49,292
clever about how we evaluate this sum. So,
the idea is, what we're going to do, is to

40
00:02:49,292 --> 00:02:53,362
think about a fixed value of I in this
outermost sum. And then we're gonna ask,

41
00:02:53,362 --> 00:03:00,740
how big could the inner sum be? So let's
fix some value of I, the value of the

42
00:03:00,740 --> 00:03:07,992
index in the outer sum. And then let's
look at the inner sum, where J ranges from

43
00:03:07,992 --> 00:03:13,340
I plus one up to N, and the value of the
sum end is one over the quantity J minus I

44
00:03:13,340 --> 00:03:18,369
plus one. So how big can this be? Well,
let's first understand what the terms

45
00:03:18,369 --> 00:03:23,272
actually are. So J starts at I plus one
and then it ascends to N. And as J gets

46
00:03:23,272 --> 00:03:28,427
bigger the denominator gets bigger. So the
sum ends get smaller. So the biggest sum

47
00:03:28,427 --> 00:03:33,394
end is gonna be the very first one. And J
is as small as possible. Namely I plus

48
00:03:33,394 --> 00:03:38,423
one. When J is I plus one the sum end is
one half. Then J gets incremented in the

49
00:03:38,423 --> 00:03:43,327
sum. And so that's, we're gonna pick up a
one third term followed by one fourth

50
00:03:43,327 --> 00:03:47,856
term, and so on. So there's gonna be, for
every inner sum is gonna have a this form,

51
00:03:47,856 --> 00:03:52,014
one-half plus one-half equals one-fourth.
And then it's gonna sort of run out at

52
00:03:52,014 --> 00:03:56,069
some point, when J equals N. And the
biggest term we're ever going to see is

53
00:03:56,069 --> 00:04:01,078
gonna be a one over N, in the case where I
equals one. So. Let's make our lives

54
00:04:01,078 --> 00:04:07,839
easier by taking this expression we
started with. Star, and instead of having

55
00:04:07,839 --> 00:04:14,945
a double sum, let's just upper bound this
with a single sum. So what are the

56
00:04:14,945 --> 00:04:21,008
ingredients of a single sum? Well, there's
this two, can't forget the two. Then

57
00:04:21,008 --> 00:04:26,727
there's N choices for I, actually, there's
N minus one choices for I, but let's just

58
00:04:26,727 --> 00:04:34,486
be sloppy and say N choices. So that gives
us a factor N. And then how big can an

59
00:04:34,486 --> 00:04:38,432
inner sum be? Well, inner sum is just a
bunch of these terms, one-half plus

60
00:04:38,432 --> 00:04:42,587
one-third and so on. The biggest of those
inner sums is the one occurring when I

61
00:04:42,587 --> 00:04:46,689
equals one, at W, at which point the last
term is one over N. So, we're gonna just

62
00:04:46,689 --> 00:04:50,740
do a change of variable and express the
inner [inaudible], upper bound on each

63
00:04:50,740 --> 00:04:56,214
inner sum as the sum from K equal two to N
of one over K. So that's looking more

64
00:04:56,214 --> 00:05:01,757
manageable just having the single sum
involving this index K, and life's gonna

65
00:05:01,757 --> 00:05:07,299
get really good when we prove the next
claim, which is that this sum cannot be

66
00:05:07,299 --> 00:05:13,198
very big, it's only logarithmic in N, even
though there's a linear number of sum N's,

67
00:05:13,198 --> 00:05:19,131
the overall value of the sum is only
logarithmic. That, of course, is gonna

68
00:05:19,131 --> 00:05:24,012
complete the proof, 'cause that'll give us
an overall bound of two times N times the

69
00:05:24,012 --> 00:05:28,776
natural log on N. So it's an N login bound
with really quite reasonable constants.

70
00:05:28,776 --> 00:05:33,422
So, why is this true? Why is this sum only
logarithmically large? Well, let's do a

71
00:05:33,422 --> 00:05:42,787
proof by a picture. I'm going to write
this sum. In a geometric fashion. So on

72
00:05:42,787 --> 00:05:47,500
the X axis, let me mark off points
corresponding to the positive integers.

73
00:05:47,880 --> 00:05:55,036
And on the Y axis, let me mark off points
corresponding to fractions of the form,

74
00:05:55,036 --> 00:06:02,959
one over K. And what I?m gonna do is gonna
draw a bunch of rectangles. Of decreasing

75
00:06:02,959 --> 00:06:07,876
area, specifically they all have with one,
and the heights are gonna be like one over

76
00:06:07,876 --> 00:06:12,618
K. So the area of this guy's one, the area
of this guy's one half, the area of this

77
00:06:12,618 --> 00:06:19,636
guy's one third, and so on. And now I'm
going to overlay on this picture the graph

78
00:06:19,636 --> 00:06:25,099
of the function, the continuous function,
F of X equals one over X. So notice that

79
00:06:25,099 --> 00:06:30,904
is going to go through these three points.
It's gonna kiss all of these rectangles on

80
00:06:30,904 --> 00:06:37,371
their upper right corners. Now what is it
we're trying to prove? The claim we're

81
00:06:37,371 --> 00:06:42,620
trying to prove is that this sum, one half
plus one third and so on, is upper bounded

82
00:06:42,620 --> 00:06:46,931
by something, so the sum can be just
thought of as the areas in these

83
00:06:46,931 --> 00:06:51,868
rectangles, the one half, the one third
and so on, and we're going to upper bound

84
00:06:51,868 --> 00:06:56,867
it by the area under the blue curve, if
you notice the area under the blue curve

85
00:06:56,867 --> 00:07:02,115
is at least as big as the sum of the areas
of the rectangles because the curve hits

86
00:07:02,115 --> 00:07:08,412
each of these rectangles in its north east
corner. So putting that into mathematics,

87
00:07:08,412 --> 00:07:14,504
the sum from K equal two to N of one over
K. Is met in above by the integral. And

88
00:07:14,504 --> 00:07:20,747
we'll start the area of the curve at one.
And then we need it to go all the way up

89
00:07:20,747 --> 00:07:26,839
to N. Of the function one over X. The X,
so that's the area under the curve. And if

90
00:07:26,839 --> 00:07:33,235
you remember a little bit of calculus the
integral of one over X is the natural log

91
00:07:33,235 --> 00:07:40,251
of X. So this equals the natural log of X.
Evaluated at one. Also known as login

92
00:07:40,251 --> 00:07:49,786
minus log one. And of course log one would
be zero, so that gives us our login. So

93
00:07:49,786 --> 00:07:58,504
that completes the proof of the claim.
That indeed, the sum of these one over K's

94
00:07:58,504 --> 00:08:04,684
is bounded above by the natural log of N,
and that in fact completes the proof of

95
00:08:04,684 --> 00:08:12,791
the theorem. You've got to be the expected
number of comparisons, at most two N times

96
00:08:12,791 --> 00:08:17,790
this sum, which is at most log N. And
altogether, we find that the expected

97
00:08:17,790 --> 00:08:22,490
number of comparisons that quick sort
makes on an arbitrary input of length N.

98
00:08:22,490 --> 00:08:26,994
Is two times N times the natural log of N.
So that would be big o of N, log N, with

99
00:08:26,994 --> 00:08:31,611
quite reasonable constants. Now, this is
just the number of comparisons, but as we

100
00:08:31,611 --> 00:08:36,060
observed earlier, the running time of
Quicksort on average is not much more than

101
00:08:36,060 --> 00:08:40,398
that, the running time is dominated by the
number of comparisons that it makes.

102
00:08:40,398 --> 00:08:44,347
Moreover, as we discussed when we were
talking about the details of the

103
00:08:44,347 --> 00:08:48,518
implementation, it works in place,
essentially no extra storage is necessary.

104
00:08:48,518 --> 00:08:52,633
So that is a complete and mathematically
rigorous explanation of just why

105
00:08:52,633 --> 00:08:55,500
Quicksort. Is so quick.