1 00:00:00,012 --> 00:00:04,344 Let's see how this substructure lemma naturally leads to a recursive search 2 00:00:04,344 --> 00:00:06,661 algorithm. The algorithm is recursive. 3 00:00:06,661 --> 00:00:10,967 I'm not going to bother to specify the base cases, which is when you're given a 4 00:00:10,967 --> 00:00:14,731 graph, with a most 1 vertex. Or whether you're given a value of k, 5 00:00:14,731 --> 00:00:17,664 equal to 0 or 1. So let's assume that the graph has a 6 00:00:17,664 --> 00:00:20,802 least 2 vertices, and that k is at least equal to 2. 7 00:00:20,802 --> 00:00:25,335 I wouldn't be surprised if this algorithm reminds you, of some of the recursive 8 00:00:25,335 --> 00:00:29,655 implementations we discussed for various dynamic programming algorithms. 9 00:00:29,655 --> 00:00:33,861 And in the same way those algorithms exhibited exponential running time, 10 00:00:33,861 --> 00:00:38,252 without memoization, we're going to see an exponential running time here. 11 00:00:38,252 --> 00:00:41,817 For all of our dynamic programming algorithms, we were able to cleverly 12 00:00:41,817 --> 00:00:45,901 organize sub-problems from smallest to largest, thereby eliminating redundantly 13 00:00:45,901 --> 00:00:49,898 solving the same sub-problems over and over again, yielding polynomial run time 14 00:00:49,898 --> 00:00:52,092 bounds. Here, we're going to be stuck with the 15 00:00:52,092 --> 00:00:56,040 exponential running time bound. Obviously not surprising given Given that 16 00:00:56,040 --> 00:00:59,691 it's an NP complete problem, but still, if you really want to understand this 17 00:00:59,691 --> 00:01:02,903 deeply, take this recursive search algorithm.Try to come up with a 18 00:01:02,903 --> 00:01:06,762 polynomial time version,try to come up with a small set of sub problems, ordered 19 00:01:06,762 --> 00:01:10,157 from smallest to largest that you can solve systematically.You will be 20 00:01:10,157 --> 00:01:13,941 inevitably stymied in those attempts, but you'll better appreciate what's non 21 00:01:13,941 --> 00:01:16,342 trivial about the recursive solution here. 22 00:01:16,342 --> 00:01:20,248 In step 1 of the alrogithm, we pick some arbitray edge, u, comma v. 23 00:01:20,248 --> 00:01:24,410 Why is it useful to focus on an edge? Well, by the definition of a vertex 24 00:01:24,410 --> 00:01:28,291 cover, it has to include either u or v, so that's our initial clue. 25 00:01:28,291 --> 00:01:32,929 We're going to proceed optimistically. We're looking for a vertex cover of size 26 00:01:32,929 --> 00:01:37,016 K, so let's assume one exists. What does the sub structure limit tells 27 00:01:37,016 --> 00:01:40,242 us? It says. That if such a solution exists, then 28 00:01:40,242 --> 00:01:45,128 necessarily either gu or gv or both themselves have small vertex covers, of 29 00:01:45,128 --> 00:01:48,790 size only k - 1. Moreover, it's a simple matter to extend 30 00:01:48,790 --> 00:01:53,535 such vertex covers to a small 1 for g, you just augment them by the missing 31 00:01:53,535 --> 00:01:56,399 vertex. So let's first adopt as a working 32 00:01:56,399 --> 00:02:00,878 hypothesis that g sub u has a smaller disk number of size only k-1. 33 00:02:00,878 --> 00:02:05,965 Let's recursively try to find it. If our recursive search comes back with a 34 00:02:05,965 --> 00:02:10,292 solution of vertex number of size k-1 for g-u, we're good to go. 35 00:02:10,292 --> 00:02:15,505 We just augment that by the vertex u and that's our vertex number of size k for 36 00:02:15,505 --> 00:02:21,072 the original graph of G If that recursive call fails to find the small vertex cover 37 00:02:21,072 --> 00:02:25,822 will you say, oh well then it must be that v is the key to getting a small 38 00:02:25,822 --> 00:02:29,122 vertex cover. So we do exactly the same thing, we 39 00:02:29,122 --> 00:02:33,038 recursively search g sub v for a vertex cover of size K - 1. 40 00:02:33,038 --> 00:02:38,677 If the second recursive call also fails to find a small vertex cover, one of size 41 00:02:38,677 --> 00:02:44,104 only k - 1, then by the substructure claim, the other direction, we know that 42 00:02:44,104 --> 00:02:48,142 the original graph G cannot have a vertex cover of size k. 43 00:02:48,142 --> 00:02:51,422 If it did, the substructure limit tells us. 44 00:02:51,422 --> 00:02:54,486 One of the 2 recursive calls would have succeeded. 45 00:02:54,486 --> 00:02:59,191 Since they both failed, we conclude correctly that the original graph did not 46 00:02:59,191 --> 00:03:03,402 have a small vertex cover. The correctness analysis is straight for, 47 00:03:03,402 --> 00:03:06,508 forward, formally you would proceed by induction. 48 00:03:06,508 --> 00:03:10,962 So the induction hypothesis would then guarantee the correctness of the 2 49 00:03:10,962 --> 00:03:15,675 recursive calls, so on the smaller, our sub problem is GU and GV the recursive 50 00:03:15,675 --> 00:03:20,472 calls correctly detect whether or not there are vertex covers of size K-1. 51 00:03:20,472 --> 00:03:24,927 The substructure limit guarantees that we correctly compile the results of the two 52 00:03:24,927 --> 00:03:27,689 recursive calls into the output of our algorithms. 53 00:03:27,689 --> 00:03:31,132 So if one of the two recursive calls succeeds, we're good to go. 54 00:03:31,132 --> 00:03:35,560 We just output the vertex cover of size K as desired, and this is the crucial part, 55 00:03:35,560 --> 00:03:39,827 if both of the recursive calls. Fail then by sub-structure liner we know, 56 00:03:39,827 --> 00:03:43,557 we correctly report fail. We know there is not a vertex cover of 57 00:03:43,557 --> 00:03:48,232 size k or less in the original graph G. So lets analyse the running time using an 58 00:03:48,232 --> 00:03:51,627 Adhoc analysis. Lets think about the number of workers of 59 00:03:51,627 --> 00:03:55,657 cost that there can be over the entire trajectory of the algorithm or 60 00:03:55,657 --> 00:04:00,467 equivalently the number of nodes in the recursion tree generated by the algorithm 61 00:04:00,467 --> 00:04:04,702 and then we'll come up with the bound on how much work is done on a given. 62 00:04:04,702 --> 00:04:09,347 recursive of call, not counting the work done by its recursive subcalls. 63 00:04:09,347 --> 00:04:13,543 And now, the cool part, and this is really where the rule of a small value of 64 00:04:13,543 --> 00:04:17,922 k comes into play is that, each time we recurse, we subtract 1 from the target 65 00:04:17,922 --> 00:04:20,490 size of a vertex cover that we're looking for. 66 00:04:20,490 --> 00:04:23,524 Remember, if we are given, as input, the parameter k. 67 00:04:23,524 --> 00:04:27,772 Each of our recursive calls involve. The parameter value of K minus 1. 68 00:04:27,772 --> 00:04:32,792 So what this does is limit the recursion depth or the depth of the tree generated 69 00:04:32,792 --> 00:04:37,512 by this algorithm to be at most K. Initially you start with K, each time you 70 00:04:37,512 --> 00:04:42,442 recurse you decrement it by 1 and you're going to hit base cases by the time K is 71 00:04:42,442 --> 00:04:45,962 around 1 or 0. Putting those two observations together, 72 00:04:45,962 --> 00:04:49,642 branching factor banded by two, recursion depth bounded by K. 73 00:04:49,642 --> 00:04:53,917 The number of recursive calls in total, over the entire life time of this 74 00:04:53,917 --> 00:04:56,687 algorithm, is going to be bounded by two to the K. 75 00:04:56,687 --> 00:05:01,427 Moreover, if you inspect the pseudo code, you see that not a whole lot of actions 76 00:05:01,427 --> 00:05:04,147 going on, other than the recursive sub calls. 77 00:05:04,147 --> 00:05:08,152 So even with a very sloppy implementation, where you have construct 78 00:05:08,152 --> 00:05:11,047 a G sub U, or a G sub V from the input graph g. 79 00:05:11,047 --> 00:05:15,552 Even if you do that crudely, you're going to get a linear time bound big o of m 80 00:05:15,552 --> 00:05:19,982 work, per recursive call, not counting the work that gets done later by the 81 00:05:19,982 --> 00:05:23,467 recursive sub-calls. So, that means an upward bound on the 82 00:05:23,467 --> 00:05:27,562 total amount of work done by this algorithm is the number of recursive 83 00:05:27,562 --> 00:05:30,702 calls, 2 ^ k, times the work per call. M 2^K*M. 84 00:05:30,702 --> 00:05:37,182 Now this running time you'll of course notice still exponential in K, but, of 85 00:05:37,182 --> 00:05:42,637 course it has to be exponential unless we intend on proving P=NP. 86 00:05:42,637 --> 00:05:47,212 Don't forget the vertex cover problem is NP complete. 87 00:05:47,212 --> 00:05:51,338 So what have we accomplished? We had a trivial exponential time algorithm by 88 00:05:51,338 --> 00:05:54,992 brute force search earlier. Why did we go to this rigamorole to come 89 00:05:54,992 --> 00:05:58,596 up with the second algorithm with this exponential run time bound. 90 00:05:58,596 --> 00:06:02,663 Well, this is a qualitatively better running time bound than we have before 91 00:06:02,663 --> 00:06:06,317 with naive brute force search. Here's 2 ways to see why this running 92 00:06:06,317 --> 00:06:11,052 time bound is qualitatively superior So first, let's just inspect this running 93 00:06:11,052 --> 00:06:15,783 time from a mathematical perspective and let's ask how large can we take K while 94 00:06:15,783 --> 00:06:20,332 still having a polynomial time algorithm. Recall that our naive brute force 95 00:06:20,332 --> 00:06:24,840 searched algorithm where you just try each sub set of K vertices random time 96 00:06:24,840 --> 00:06:28,449 theta of n to the k. And so that's polynomials Time if k is 97 00:06:28,449 --> 00:06:33,862 constant, and it's super-polynomial time if k is bigger than a constant. 98 00:06:33,862 --> 00:06:39,309 For this running time bound 2 ^ k * m, we can let k grow as large as log of the 99 00:06:39,309 --> 00:06:44,307 graph size and still have a polynomial Algorithm, So now we have the sig way 100 00:06:44,307 --> 00:06:48,562 from these mathematical balance to thinking through actually implementing 101 00:06:48,562 --> 00:06:51,180 these algorithms running them on actual data. 102 00:06:51,180 --> 00:06:55,412 So for the naive per search you have got this running time of of n to the k so 103 00:06:55,412 --> 00:06:59,346 even for a relatively small graph you know lets say n is may be fifty or 104 00:06:59,346 --> 00:07:03,674 hundred you are not going to be able, You're not going to be able to run this 105 00:07:03,674 --> 00:07:08,258 brute for-, naive brute force search algorithm except for extremely small 106 00:07:08,258 --> 00:07:10,920 values of k. 3, 4, maybe 5 if you're lucky. 107 00:07:10,920 --> 00:07:15,322 So naive brute force search has very limited range of applicability. 108 00:07:15,322 --> 00:07:19,684 So by contrast our smartest risk algorithm can accommodate a noticeably 109 00:07:19,684 --> 00:07:23,384 larger range of k's. So remember the running time here is 2 to 110 00:07:23,384 --> 00:07:27,996 the k * the graph size, and so even for values of k up to 20, for many networks 111 00:07:27,996 --> 00:07:32,525 of interest you're going to be able to implement and run this algorithm in a 112 00:07:32,525 --> 00:07:34,087 reasonable amount of time.