1
00:00:00,012 --> 00:00:04,344
Let's see how this substructure lemma 
naturally leads to a recursive search 

2
00:00:04,344 --> 00:00:06,661
algorithm. 
The algorithm is recursive. 

3
00:00:06,661 --> 00:00:10,967
I'm not going to bother to specify the 
base cases, which is when you're given a 

4
00:00:10,967 --> 00:00:14,731
graph, with a most 1 vertex. 
Or whether you're given a value of k, 

5
00:00:14,731 --> 00:00:17,664
equal to 0 or 1. 
So let's assume that the graph has a 

6
00:00:17,664 --> 00:00:20,802
least 2 vertices, and that k is at least 
equal to 2. 

7
00:00:20,802 --> 00:00:25,335
I wouldn't be surprised if this algorithm 
reminds you, of some of the recursive 

8
00:00:25,335 --> 00:00:29,655
implementations we discussed for various 
dynamic programming algorithms. 

9
00:00:29,655 --> 00:00:33,861
And in the same way those algorithms 
exhibited exponential running time, 

10
00:00:33,861 --> 00:00:38,252
without memoization, we're going to see 
an exponential running time here. 

11
00:00:38,252 --> 00:00:41,817
For all of our dynamic programming 
algorithms, we were able to cleverly 

12
00:00:41,817 --> 00:00:45,901
organize sub-problems from smallest to 
largest, thereby eliminating redundantly 

13
00:00:45,901 --> 00:00:49,898
solving the same sub-problems over and 
over again, yielding polynomial run time 

14
00:00:49,898 --> 00:00:52,092
bounds. 
Here, we're going to be stuck with the 

15
00:00:52,092 --> 00:00:56,040
exponential running time bound. 
Obviously not surprising given Given that 

16
00:00:56,040 --> 00:00:59,691
it's an NP complete problem, but still, 
if you really want to understand this 

17
00:00:59,691 --> 00:01:02,903
deeply, take this recursive search 
algorithm.Try to come up with a 

18
00:01:02,903 --> 00:01:06,762
polynomial time version,try to come up 
with a small set of sub problems, ordered 

19
00:01:06,762 --> 00:01:10,157
from smallest to largest that you can 
solve systematically.You will be 

20
00:01:10,157 --> 00:01:13,941
inevitably stymied in those attempts, but 
you'll better appreciate what's non 

21
00:01:13,941 --> 00:01:16,342
trivial about the recursive solution 
here. 

22
00:01:16,342 --> 00:01:20,248
In step 1 of the alrogithm, we pick some 
arbitray edge, u, comma v. 

23
00:01:20,248 --> 00:01:24,410
Why is it useful to focus on an edge? 
Well, by the definition of a vertex 

24
00:01:24,410 --> 00:01:28,291
cover, it has to include either u or v, 
so that's our initial clue. 

25
00:01:28,291 --> 00:01:32,929
We're going to proceed optimistically. 
We're looking for a vertex cover of size 

26
00:01:32,929 --> 00:01:37,016
K, so let's assume one exists. 
What does the sub structure limit tells 

27
00:01:37,016 --> 00:01:40,242
us? It says. 
That if such a solution exists, then 

28
00:01:40,242 --> 00:01:45,128
necessarily either gu or gv or both 
themselves have small vertex covers, of 

29
00:01:45,128 --> 00:01:48,790
size only k - 1. 
Moreover, it's a simple matter to extend 

30
00:01:48,790 --> 00:01:53,535
such vertex covers to a small 1 for g, 
you just augment them by the missing 

31
00:01:53,535 --> 00:01:56,399
vertex. 
So let's first adopt as a working 

32
00:01:56,399 --> 00:02:00,878
hypothesis that g sub u has a smaller 
disk number of size only k-1. 

33
00:02:00,878 --> 00:02:05,965
Let's recursively try to find it. 
If our recursive search comes back with a 

34
00:02:05,965 --> 00:02:10,292
solution of vertex number of size k-1 for 
g-u, we're good to go. 

35
00:02:10,292 --> 00:02:15,505
We just augment that by the vertex u and 
that's our vertex number of size k for 

36
00:02:15,505 --> 00:02:21,072
the original graph of G If that recursive 
call fails to find the small vertex cover 

37
00:02:21,072 --> 00:02:25,822
will you say, oh well then it must be 
that v is the key to getting a small 

38
00:02:25,822 --> 00:02:29,122
vertex cover. 
So we do exactly the same thing, we 

39
00:02:29,122 --> 00:02:33,038
recursively search g sub v for a vertex 
cover of size K - 1. 

40
00:02:33,038 --> 00:02:38,677
If the second recursive call also fails 
to find a small vertex cover, one of size 

41
00:02:38,677 --> 00:02:44,104
only k - 1, then by the substructure 
claim, the other direction, we know that 

42
00:02:44,104 --> 00:02:48,142
the original graph G cannot have a vertex 
cover of size k. 

43
00:02:48,142 --> 00:02:51,422
If it did, the substructure limit tells 
us. 

44
00:02:51,422 --> 00:02:54,486
One of the 2 recursive calls would have 
succeeded. 

45
00:02:54,486 --> 00:02:59,191
Since they both failed, we conclude 
correctly that the original graph did not 

46
00:02:59,191 --> 00:03:03,402
have a small vertex cover. 
The correctness analysis is straight for, 

47
00:03:03,402 --> 00:03:06,508
forward, formally you would proceed by 
induction. 

48
00:03:06,508 --> 00:03:10,962
So the induction hypothesis would then 
guarantee the correctness of the 2 

49
00:03:10,962 --> 00:03:15,675
recursive calls, so on the smaller, our 
sub problem is GU and GV the recursive 

50
00:03:15,675 --> 00:03:20,472
calls correctly detect whether or not 
there are vertex covers of size K-1. 

51
00:03:20,472 --> 00:03:24,927
The substructure limit guarantees that we 
correctly compile the results of the two 

52
00:03:24,927 --> 00:03:27,689
recursive calls into the output of our 
algorithms. 

53
00:03:27,689 --> 00:03:31,132
So if one of the two recursive calls 
succeeds, we're good to go. 

54
00:03:31,132 --> 00:03:35,560
We just output the vertex cover of size K 
as desired, and this is the crucial part, 

55
00:03:35,560 --> 00:03:39,827
if both of the recursive calls. 
Fail then by sub-structure liner we know, 

56
00:03:39,827 --> 00:03:43,557
we correctly report fail. 
We know there is not a vertex cover of 

57
00:03:43,557 --> 00:03:48,232
size k or less in the original graph G. 
So lets analyse the running time using an 

58
00:03:48,232 --> 00:03:51,627
Adhoc analysis. 
Lets think about the number of workers of 

59
00:03:51,627 --> 00:03:55,657
cost that there can be over the entire 
trajectory of the algorithm or 

60
00:03:55,657 --> 00:04:00,467
equivalently the number of nodes in the 
recursion tree generated by the algorithm 

61
00:04:00,467 --> 00:04:04,702
and then we'll come up with the bound on 
how much work is done on a given. 

62
00:04:04,702 --> 00:04:09,347
recursive of call, not counting the work 
done by its recursive subcalls. 

63
00:04:09,347 --> 00:04:13,543
And now, the cool part, and this is 
really where the rule of a small value of 

64
00:04:13,543 --> 00:04:17,922
k comes into play is that, each time we 
recurse, we subtract 1 from the target 

65
00:04:17,922 --> 00:04:20,490
size of a vertex cover that we're looking 
for. 

66
00:04:20,490 --> 00:04:23,524
Remember, if we are given, as input, the 
parameter k. 

67
00:04:23,524 --> 00:04:27,772
Each of our recursive calls involve. 
The parameter value of K minus 1. 

68
00:04:27,772 --> 00:04:32,792
So what this does is limit the recursion 
depth or the depth of the tree generated 

69
00:04:32,792 --> 00:04:37,512
by this algorithm to be at most K. 
Initially you start with K, each time you 

70
00:04:37,512 --> 00:04:42,442
recurse you decrement it by 1 and you're 
going to hit base cases by the time K is 

71
00:04:42,442 --> 00:04:45,962
around 1 or 0. 
Putting those two observations together, 

72
00:04:45,962 --> 00:04:49,642
branching factor banded by two, recursion 
depth bounded by K. 

73
00:04:49,642 --> 00:04:53,917
The number of recursive calls in total, 
over the entire life time of this 

74
00:04:53,917 --> 00:04:56,687
algorithm, is going to be bounded by two 
to the K. 

75
00:04:56,687 --> 00:05:01,427
Moreover, if you inspect the pseudo code, 
you see that not a whole lot of actions 

76
00:05:01,427 --> 00:05:04,147
going on, other than the recursive sub 
calls. 

77
00:05:04,147 --> 00:05:08,152
So even with a very sloppy 
implementation, where you have construct 

78
00:05:08,152 --> 00:05:11,047
a G sub U, or a G sub V from the input 
graph g. 

79
00:05:11,047 --> 00:05:15,552
Even if you do that crudely, you're going 
to get a linear time bound big o of m 

80
00:05:15,552 --> 00:05:19,982
work, per recursive call, not counting 
the work that gets done later by the 

81
00:05:19,982 --> 00:05:23,467
recursive sub-calls. 
So, that means an upward bound on the 

82
00:05:23,467 --> 00:05:27,562
total amount of work done by this 
algorithm is the number of recursive 

83
00:05:27,562 --> 00:05:30,702
calls, 2 ^ k, times the work per call. 
M 2^K*M. 

84
00:05:30,702 --> 00:05:37,182
Now this running time you'll of course 
notice still exponential in K, but, of 

85
00:05:37,182 --> 00:05:42,637
course it has to be exponential unless we 
intend on proving P=NP. 

86
00:05:42,637 --> 00:05:47,212
Don't forget the vertex cover problem is 
NP complete. 

87
00:05:47,212 --> 00:05:51,338
So what have we accomplished? We had a 
trivial exponential time algorithm by 

88
00:05:51,338 --> 00:05:54,992
brute force search earlier. 
Why did we go to this rigamorole to come 

89
00:05:54,992 --> 00:05:58,596
up with the second algorithm with this 
exponential run time bound. 

90
00:05:58,596 --> 00:06:02,663
Well, this is a qualitatively better 
running time bound than we have before 

91
00:06:02,663 --> 00:06:06,317
with naive brute force search. 
Here's 2 ways to see why this running 

92
00:06:06,317 --> 00:06:11,052
time bound is qualitatively superior So 
first, let's just inspect this running 

93
00:06:11,052 --> 00:06:15,783
time from a mathematical perspective and 
let's ask how large can we take K while 

94
00:06:15,783 --> 00:06:20,332
still having a polynomial time algorithm. 
Recall that our naive brute force 

95
00:06:20,332 --> 00:06:24,840
searched algorithm where you just try 
each sub set of K vertices random time 

96
00:06:24,840 --> 00:06:28,449
theta of n to the k. 
And so that's polynomials Time if k is 

97
00:06:28,449 --> 00:06:33,862
constant, and it's super-polynomial time 
if k is bigger than a constant. 

98
00:06:33,862 --> 00:06:39,309
For this running time bound 2 ^ k * m, we 
can let k grow as large as log of the 

99
00:06:39,309 --> 00:06:44,307
graph size and still have a polynomial 
Algorithm, So now we have the sig way 

100
00:06:44,307 --> 00:06:48,562
from these mathematical balance to 
thinking through actually implementing 

101
00:06:48,562 --> 00:06:51,180
these algorithms running them on actual 
data. 

102
00:06:51,180 --> 00:06:55,412
So for the naive per search you have got 
this running time of of n to the k so 

103
00:06:55,412 --> 00:06:59,346
even for a relatively small graph you 
know lets say n is may be fifty or 

104
00:06:59,346 --> 00:07:03,674
hundred you are not going to be able, 
You're not going to be able to run this 

105
00:07:03,674 --> 00:07:08,258
brute for-, naive brute force search 
algorithm except for extremely small 

106
00:07:08,258 --> 00:07:10,920
values of k. 
3, 4, maybe 5 if you're lucky. 

107
00:07:10,920 --> 00:07:15,322
So naive brute force search has very 
limited range of applicability. 

108
00:07:15,322 --> 00:07:19,684
So by contrast our smartest risk 
algorithm can accommodate a noticeably 

109
00:07:19,684 --> 00:07:23,384
larger range of k's. 
So remember the running time here is 2 to 

110
00:07:23,384 --> 00:07:27,996
the k * the graph size, and so even for 
values of k up to 20, for many networks 

111
00:07:27,996 --> 00:07:32,525
of interest you're going to be able to 
implement and run this algorithm in a 

112
00:07:32,525 --> 00:07:34,087
reasonable amount of time.