So now that we've done some work with our thought experiment, understanding exactly what the optimal solution, the maximum-weight independent set, has to look like in a path graph, let's turn that work into a linear-time algorithm.

Let me quickly remind you of the bottom line from the previous video. We argued two things. First of all, if the max-weight independent set of a path graph happens to exclude the rightmost vertex, then it has to in fact be a max-weight independent set of the smaller graph G prime, obtained from G by plucking off the rightmost vertex. If, on the other hand, a max-weight independent set does include the rightmost vertex v sub n, then if you take out v sub n and look at the remaining part of the independent set, that has to be a max-weight independent set of the smaller graph G double prime, defined by plucking the two rightmost vertices off of the original graph G. Ergo, if we happened to know, if a little birdie told us, which of these two cases we were in, we could recursively compute the optimal solution of either G prime or G double prime as appropriate and be done. If we recurse on G prime, we just return the result. If we recurse on G double prime, we augment it by v sub n and return the result. Now, there is no little birdie; we don't know which of the two cases we're in.
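One way to summarize this case analysis is as a single recurrence. The notation here is my own, not used verbatim in the lecture: write W_i for the total weight of a max-weight independent set of the graph induced by the first i vertices, and w_i for the weight of vertex v sub i.

```latex
W_0 = 0, \qquad W_1 = w_1, \qquad
W_i = \max\{\, W_{i-1},\; W_{i-2} + w_i \,\} \quad \text{for } i \ge 2.
```

The first argument of the max is the "exclude the rightmost vertex" case (recurse on G prime); the second is the "include the rightmost vertex" case (recurse on G double prime).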
So we concluded the previous video by proposing to just try both cases. Let's write down what that proposed algorithm would look like before we take a step back and critique it.

So, the good news about this algorithm is that it is correct: it's guaranteed to return the maximum-weight independent set. How would you prove this formally? Well, it would be by induction, in exactly the same way we proceeded with divide-and-conquer algorithms, and for the same reason I'm not going to talk about the details here. If you're interested, I'll leave it as an optional exercise to write this out formally. But intuitively, the inductive hypothesis guarantees the correctness of our recursive calls, that is, of computing the maximum-weight solution in either G prime or G double prime. And then the whole point of the previous video, in effect, was arguing the correctness of the inductive step: given the correct solutions to the sub-problems, we argued what has to be the optimal way to extend them to a solution to the original graph G.

The bad news, on the other hand, is that this algorithm takes exponential time. It's essentially no better than brute-force search. So while the correctness of this kind of recursive algorithm follows the template of divide and conquer pretty much exactly, the running time has blown up to exponential.
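The proposed algorithm itself appears on a slide rather than in the audio. A minimal sketch of the try-both-cases recursion, with hypothetical names of my own choosing and the path graph represented simply by its list of vertex weights, might look like this:

```python
def naive_mwis_value(weights):
    """Total weight of a max-weight independent set of a path graph,
    given as a list of (non-negative) vertex weights.
    Correct, but exponential-time."""
    if not weights:                 # empty graph: empty set, weight 0
        return 0
    if len(weights) == 1:           # single vertex: take it, since weights >= 0
        return weights[0]
    # Case 1: the optimal solution excludes the rightmost vertex -> recurse on G'
    case1 = naive_mwis_value(weights[:-1])
    # Case 2: the optimal solution includes the rightmost vertex -> recurse on G''
    case2 = naive_mwis_value(weights[:-2]) + weights[-1]
    # No little birdie: try both cases and keep the better one
    return max(case1, case2)
```

For example, on the four-vertex path with weights 1, 4, 5, 4, the best independent set takes the second and fourth vertices, for total weight 8, and `naive_mwis_value([1, 4, 5, 4])` returns 8.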
And the reason for that difference is that in our divide-and-conquer algorithms, think of MergeSort as a canonical example, we made a ton of progress before we recursed. Right? We threw out half of the array, 50% of the stuff, before we bothered with any recursion. How much progress are we making in this algorithm? Well, very little. It's positive progress, but very small: we throw out either one or two vertices of a graph with, say, a million vertices before recursing. So we're branching by a factor of two and making very little progress before each branch. That's why we get this exponential running time, rather than something more in the neighborhood of n log n.

So this brings us to the following question. This is an important question; I want you to think about it carefully before you respond. We have this exponential-time algorithm; it makes an exponential number of recursive calls, and each recursive call is handed some graph for which it's responsible for computing the maximum-weight independent set. The question is this: across these exponentially many recursive calls, each of which is passed some graph as a sub-problem, how many distinct, how many fundamentally different, sub-problems are ever considered?
So the answer to this question, and the key that unlocks the crazy efficiency of our dynamic programming implementation, is B. Despite the fact that there's an exponential number of recursive calls, we only ever solve a linear number of distinct sub-problems. In fact, we can explicitly say what the different sub-problems are that could get solved throughout the recursion. What happens before you recurse? Well, you pluck vertices off the right end of the graph you were given. Maybe you pluck off one vertex, that's in the first recursive call; or, in the second recursive call, you pluck off two vertices, but again from the right. So an invariant maintained throughout the recursion is that the sub-problem you're handed was obtained by successively plucking off vertices from the right end of the graph. So however you got to where you are, whatever the sequence of recursive calls led to where you are now, you are guaranteed to be handed a prefix of the graph: the graph induced by the first i vertices, where i is some number between zero and n. Therefore, there's only a linear number of sub-problems you ever have to worry about: the prefixes of the original input graph.
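To make this concrete, here is a hypothetical instrumentation of the recursion, again with names of my own. Since every sub-problem is a prefix, it can be identified by its length i, and the recursion on i calls itself on i-1 and i-2. Counting total calls versus distinct sub-problem sizes exhibits exponentially many calls but only n+1 distinct sub-problems:

```python
def count_calls(n):
    """For a path with n vertices, count the total number of recursive
    calls made by the naive algorithm, and the number of distinct
    sub-problems seen.  A sub-problem is a prefix, identified by its length i."""
    total = 0
    seen = set()

    def recurse(i):
        nonlocal total
        total += 1
        seen.add(i)
        if i >= 2:              # base cases i = 0, 1 make no further calls
            recurse(i - 1)      # exclude the rightmost vertex: recurse on G'
            recurse(i - 2)      # include the rightmost vertex: recurse on G''

    recurse(n)
    return total, len(seen)
```

For instance, `count_calls(20)` reports 21,891 recursive calls (the call count satisfies the Fibonacci-like recurrence T(i) = 1 + T(i-1) + T(i-2)) but only 21 distinct sub-problems, the prefixes of lengths 0 through 20.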
From this, we conclude that the exponential running time of the previous algorithm arises solely from the spectacular redundancy of solving exactly the same sub-problems from scratch, over and over and over again. So this observation offers up the promise of a linear-time implementation of this algorithm. After all, there's no need to solve a sub-problem more than once; once you've solved it, you know the answer. So an obvious way to speed up this algorithm, to speed it up dramatically, is to simply cache the result of a sub-problem the first time you solve it. Then you can look it up in some array, in constant time, at any point later in the algorithm. There is a word for this, I won't really use it in this class, but just so that you know what it is: it's called memoization.

So in case this is a little vague, what I mean is you would have some global array, indexed one to n, or maybe zero to n, with the semantics that the ith entry of this array is the value of the solution to the ith sub-problem. Now, when you do a recursive call and you're handed the first i vertices of the graph, and remember, again, we know that the sub-problem has to look like the first i vertices of the graph for some i, you check the array. If it's already been filled in, if you already know the answer, great.
You just return it in constant time; you don't bother to re-solve it. If this is the first time you've ever seen this sub-problem, then you recurse and solve it, as we suggested on the previous slide.

So with this simple memoization fix, this algorithm is in fact linear time. The easiest way to see that, and actually a better implementation, is to move away from this recursive, top-down paradigm and instead implement the solution in a bottom-up way: solving sub-problems in a principled way from the smallest one up to the original one, the biggest. So, a little more precisely, let me use the notation G sub i to denote the subgraph of the original graph consisting of the first i vertices. We're again going to have an array with the same semantics as in the recursive version: the ith entry denotes the value of the solution to the ith sub-problem, that is, of the max-weight independent set of G sub i, and the plan is to populate it bottom-up, that is, from left to right. We'll handle the edge cases of the first two entries of this array specially. G sub zero is the empty graph, so there's no independent set; let's define the max weight of an independent set of the empty graph to be zero.
And in the graph G sub one, where the only vertex is v sub one, the max-weight independent set is obviously that one vertex; remember, weights are non-negative. So our main for loop just builds up solutions to all of the sub-problems in a systematic way, going from the smallest graph, G sub two, up to the biggest graph, the original one, G sub n. And when we consider the graph G sub i, how do we figure out what the max-weight independent set of that graph is? Well, now we use, completely directly, the work that we put in in the previous video. The previous video told us what an optimal solution has to look like, and it has to have one of two forms. We know, we proved, that either the max-weight independent set of G sub i excludes the last vertex v sub i, and is then merely an optimal solution of the graph G sub i minus one; or, if it's not that, there is nothing else it can be other than including the last vertex v sub i and, beyond that, being a max-weight independent set of the graph G sub i minus two. We know it's one of those two things; we just don't know which. So we do brute-force search among the two possibilities, and that gives us the optimal solution to the ith sub-problem. Crucially, when we need to do this brute-force search for the ith sub-problem, we already know,
we've already computed, the optimal solutions to the smaller sub-problems that are relevant. Those can be looked up in constant time, and that's what makes each iteration of this for loop run in constant time.

So we've done a fair amount of work to get to this point. But after seeing that the greedy algorithm design paradigm failed us, the divide-and-conquer algorithm design paradigm was inadequate, and brute-force search is too slow, with this, as we'll see, dynamic programming algorithm, we now have effectively a one-line solution to the max-weight independent set problem in path graphs. Pretty cool. What's the running time? Well, this is probably the easiest running-time analysis we've ever had to do. Obviously, it's linear time: constant time per iteration of the for loop. Why is the algorithm correct? Well, it's the same as our recursive algorithm; it makes exactly the same decisions. The only difference is that it doesn't bother with the spectacular redundancy of re-solving sub-problems it's already solved. Again, if you wanted to prove it from scratch, it would just be a straightforward induction: as in our divide-and-conquer algorithms, the recursive calls, here the smaller sub-problems, are correct by the inductive hypothesis.
The inductive step is justified by the case analysis of the previous video.
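The bottom-up algorithm described above can be sketched as follows. This is a minimal sketch with names of my own, with the array A playing the role of the global array of sub-problem values:

```python
def mwis_value(weights):
    """Total weight of a max-weight independent set of a path graph,
    computed bottom-up in linear time.  weights[i-1] is the
    (non-negative) weight of vertex v_i."""
    n = len(weights)
    A = [0] * (n + 1)               # A[i] = value of the MWIS of G_i (first i vertices)
    # Edge cases: A[0] = 0 (empty graph); A[1] = weight of the single vertex v_1
    if n >= 1:
        A[1] = weights[0]
    for i in range(2, n + 1):
        # Case 1: v_i excluded -> optimal solution of G_{i-1}
        # Case 2: v_i included -> w_i plus optimal solution of G_{i-2}
        A[i] = max(A[i - 1], A[i - 2] + weights[i - 1])
    return A[n]
```

Each iteration of the for loop does one comparison and two constant-time array lookups, which is where the linear running time comes from. On the weights 1, 4, 5, 4 it returns 8, agreeing with the exponential-time recursion, since both make exactly the same case decisions.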