1 00:00:00,012 --> 00:00:04,313 So, for those of you out there that fancy yourself computer scientists, I hope you 2 00:00:04,313 --> 00:00:08,019 feel some pride seeing this definition. I mean, how cool is it that our 3 00:00:08,019 --> 00:00:11,960 discipline came up with such a great concept of an NP-complete problem? The 4 00:00:11,960 --> 00:00:15,630 fact that there's a universal problem that simultaneously encodes all 5 00:00:15,630 --> 00:00:20,386 computational problems in which you can efficiently recognize a solution. 6 00:00:20,386 --> 00:00:24,858 There remains, however, a niggling issue. In general, when you're confronted with a 7 00:00:24,858 --> 00:00:29,034 mathematical definition, like the definition of NP-completeness that I just 8 00:00:29,034 --> 00:00:33,164 gave you, you should demand two things. The first thing you should demand is an 9 00:00:33,164 --> 00:00:37,134 explanation of why you should care. That is, if the definition is met, if 10 00:00:37,134 --> 00:00:39,846 satisfied, what are the interesting consequences. 11 00:00:39,846 --> 00:00:43,997 I'd like to think I've given you a very satisfying answer of that question for 12 00:00:43,997 --> 00:00:47,769 the definition of NP-completeness. I've argued that if a problem is 13 00:00:47,769 --> 00:00:52,056 NP-complete, then that's strong evidence of computational interactability. In the 14 00:00:52,056 --> 00:00:55,955 sense that polynomial-time algorithm for this NP-complete problem, if one 15 00:00:55,955 --> 00:01:00,167 hypothetically existed, that would automatically solve efficiently thousands 16 00:01:00,167 --> 00:01:03,056 of fundamental computational problems, anything for which you can efficiently 17 00:01:03,056 --> 00:01:09,142 recognize solution. But, the second thing you should demand 18 00:01:09,142 --> 00:01:13,593 from somebody who offers you a mathematical definition is examples. 19 00:01:13,593 --> 00:01:17,396 Do things that I care about actually meet this definition? 20 00:01:17,396 --> 00:01:21,042 And NP-completeness, I haven't shown you anything. 21 00:01:21,042 --> 00:01:25,504 And indeed, when you look at this interpretation, a single problem that 22 00:01:25,504 --> 00:01:30,670 simultaneously is encoding all problems with efficiently recognizable solutions. 23 00:01:30,670 --> 00:01:35,743 You might well wonder, could such objects ever exist? And the reason the theory of 24 00:01:35,743 --> 00:01:40,674 NP-completeness is so powerful, and over the past 40 years has been exported from 25 00:01:40,674 --> 00:01:45,538 computer science to all kinds of other disciplines is that, that question also 26 00:01:45,538 --> 00:01:50,207 has an incredibly satisfying answer. It turns out, tons of problems are not 27 00:01:50,207 --> 00:01:53,232 merely in NP. They don't merely have officially 28 00:01:53,232 --> 00:01:57,307 recognizable solution. Thousands and thousands of problems are 29 00:01:57,307 --> 00:02:00,952 in fact, NP-complete as hard as any other problems in NP. 30 00:02:00,952 --> 00:02:06,795 So, both the definition of NP-completeness and the facts that 31 00:02:06,795 --> 00:02:12,439 amazingly there exist NP-complete problems, that's due to independently 32 00:02:12,439 --> 00:02:16,649 Steve Cook and Leonid Levin. So, Cook and Levin came up with their 33 00:02:16,649 --> 00:02:21,358 largely similar theories independently. Cook was at that time as he is today, at 34 00:02:21,358 --> 00:02:25,337 the University of Toronto. Leonid Levin was behind the Iron Curtain, 35 00:02:25,337 --> 00:02:29,580 so he was working in the Soviet Union. So, it took awhile before his results 36 00:02:29,580 --> 00:02:32,866 were known in the West. These days, Levin is a professor at 37 00:02:32,866 --> 00:02:36,512 Boston University. Both Cook and Levin proved not just the 38 00:02:36,512 --> 00:02:41,247 basic existence result, but also gave some hints that problems that people 39 00:02:41,247 --> 00:02:43,617 really care about are also going to be NP-complete. 40 00:02:43,617 --> 00:02:48,622 For example, some constraint satisfaction problems like 3 snaps. 41 00:02:48,622 --> 00:02:53,645 But the vast scope of NP-completeness, the sheer breadth of problems that would 42 00:02:53,645 --> 00:02:58,254 eventually be proven NP-complete, was first made clear in a 1972 paper by 43 00:02:58,254 --> 00:03:01,496 Richard Karp. In that paper, he showed 21 different 44 00:03:01,496 --> 00:03:05,936 problems are NP-complete, including the traveling salesman problem, 45 00:03:05,936 --> 00:03:10,940 and various problems that many different communities had been stuck on for 46 00:03:10,940 --> 00:03:13,245 decades. And now, became clear that 47 00:03:13,245 --> 00:03:18,156 NP-completeness was the fundamental obstacle preventing progress in efficient 48 00:03:18,156 --> 00:03:25,637 algorithm across many, many domains. So, another thing that's amazing about 49 00:03:25,637 --> 00:03:31,305 NP-completeness, and a big reason for why it's been so successfully exported from, 50 00:03:31,305 --> 00:03:36,740 first of all, theoretical computer science to computer science more broadly, 51 00:03:36,740 --> 00:03:42,672 and then beyond into engineering and the other sciences, is that it's quite easy 52 00:03:42,672 --> 00:03:47,132 to stand on the shoulders of these giants and prove that new problems, problems 53 00:03:47,132 --> 00:03:49,329 that you care about, are also NP-complete. 54 00:03:49,329 --> 00:03:52,620 So, imagine that there is some computational problem pi that you really 55 00:03:52,620 --> 00:03:55,389 care about. This is just crucial to the project that 56 00:03:55,389 --> 00:03:59,783 you're working on, and you're stuck. You've been trying for weeks to solve it, 57 00:03:59,783 --> 00:04:01,980 throwing everything in your tool box added. 58 00:04:01,980 --> 00:04:05,818 You tried greedy algorithms, divide and conquer, dynamic programming, 59 00:04:05,818 --> 00:04:10,021 randomization. You've thrown every data structure in the book out of hash tables, 60 00:04:10,021 --> 00:04:12,020 heap search trees. Nothing works. 61 00:04:12,020 --> 00:04:14,546 You can't come up with an efficient algorithm. 62 00:04:14,546 --> 00:04:18,992 At that point, you should contemplate the possibility that the issue is not your 63 00:04:18,992 --> 00:04:22,932 own lack of cleverness or ingenuity. The issue is not having few, too few 64 00:04:22,932 --> 00:04:26,806 tools in your programmer toolbox. Perhaps, the issue is intractability of 65 00:04:26,806 --> 00:04:29,812 the computational problem that you're trying to solve. 66 00:04:29,812 --> 00:04:34,001 So, when you reach this point of exasperation, it's time to consider 67 00:04:34,001 --> 00:04:39,093 applying the following 2-step recipe, for establishing that the problem pi is 68 00:04:39,093 --> 00:04:42,034 NP-complete. Of course, the problem doesn't go, go 69 00:04:42,034 --> 00:04:46,618 away just because you prove it's NP-complete, but you should approach it 70 00:04:46,618 --> 00:04:51,290 using a different algorithmic strategy. And we'll discuss some of the most 71 00:04:51,290 --> 00:04:56,527 popular such strategies of approaching NP-complete problems in the rest of this 72 00:04:56,527 --> 00:04:59,722 course. So, let me state the 2-step recipe just 73 00:04:59,722 --> 00:05:03,697 at a, at a very high level. So, the first thing you need to do is 74 00:05:03,697 --> 00:05:07,332 settle on a suitable choice of an NP-complete problem, pi prime. 75 00:05:07,332 --> 00:05:11,937 The second thing you need to do is you need to show that pi prime reduces to the 76 00:05:11,937 --> 00:05:16,067 problem you care about pi. That shows that your problem is at least 77 00:05:16,067 --> 00:05:21,232 as hard as this NP-complete problem, in the sense that the NP-complete problem 78 00:05:21,232 --> 00:05:24,309 reduces to yours. And therefore, your problem, assuming 79 00:05:24,309 --> 00:05:28,268 it's an NP, is NP-complete as well. So clearly, the devil is in the details 80 00:05:28,268 --> 00:05:30,799 of successfully executing this 2-step recipe. 81 00:05:30,799 --> 00:05:35,130 And you might well be wondering, well, how on Earth do I know which NP-complete 82 00:05:35,130 --> 00:05:38,667 problem pi prime I should use? And then secondly, how am I going to come 83 00:05:38,667 --> 00:05:42,716 up with this reduction from this NP-complete problem pi prime, to my own 84 00:05:42,716 --> 00:05:45,385 problem pi? But, don't be intimidated by either of 85 00:05:45,385 --> 00:05:48,218 these two steps. With just a little bit of practice, you 86 00:05:48,218 --> 00:05:52,045 can actually get quite good at both of these steps and execute this recipe 87 00:05:52,045 --> 00:05:56,670 successfully in a lot of different cases. So, one thing that should make the first 88 00:05:56,670 --> 00:06:01,475 step less intimidating is there are some excellent lists of NP-complete problems. 89 00:06:01,475 --> 00:06:05,448 In particular, simple ones that tend to be useful in devising your own 90 00:06:05,448 --> 00:06:08,450 reductions. And the canonical such list is the book 91 00:06:08,450 --> 00:06:11,811 by Garey and Johnson called Computers and Intractability. 92 00:06:11,811 --> 00:06:14,749 It's from 1979, but it's still incredibly useful. 93 00:06:14,749 --> 00:06:18,922 I don't think I know of another Computer Science book that's more than 30 years 94 00:06:18,922 --> 00:06:22,534 old which is as useful as this one. So, there still remains the question of 95 00:06:22,534 --> 00:06:26,199 how you actually come up with one of these reductions from a known NP-complete 96 00:06:26,199 --> 00:06:29,252 problem pi prime to the problem you actually you care about, pi. 97 00:06:29,252 --> 00:06:31,491 But, don't be intimidated by this step either. 98 00:06:31,491 --> 00:06:34,900 So, first of all, as an algorithms person, you should be thinking about 99 00:06:34,900 --> 00:06:38,195 reductions all the time, anyways. You should be a very natural note of 100 00:06:38,195 --> 00:06:40,878 thought for you. For example, when we first started 101 00:06:40,878 --> 00:06:45,005 talking about all pair of shortest paths, we quickly observed that it reduces to 102 00:06:45,005 --> 00:06:48,821 the single shortest path problem. So, that mentality of, of solving one 103 00:06:48,821 --> 00:06:52,568 problem using another is equally useful in the design of NP-completeness 104 00:06:52,568 --> 00:06:55,085 reductions. And also, there's a lot of resources 105 00:06:55,085 --> 00:06:58,093 available to gain facility with NP-completeness reductions. 106 00:06:58,093 --> 00:07:00,425 You can look into various algorithms textbooks. 107 00:07:00,425 --> 00:07:04,203 Generally, they have lots of examples. The Garey & Johnson book is a good one. 108 00:07:04,203 --> 00:07:07,675 There are some online courses that study NP-completeness in more depth. 109 00:07:07,675 --> 00:07:11,536 Those resources will give you a lot of examples of NP-completeness reductions. 110 00:07:11,536 --> 00:07:15,445 It'll offer some tips on how to come up with them yourself and, most importantly, 111 00:07:15,445 --> 00:07:19,129 practice will make perfect. So, I strongly encourage you to take 112 00:07:19,129 --> 00:07:23,034 advantage of those resources. I think you'll be pleased to have 113 00:07:23,034 --> 00:07:27,690 NP-completeness as part of your toolbox. Certainly, nobody wins if you spend weeks 114 00:07:27,690 --> 00:07:31,542 or months of your life in inadvertently trying to prove that P=NP.