1
00:00:00,012 --> 00:00:04,313
So, for those of you out there that fancy 
yourself computer scientists, I hope you 

2
00:00:04,313 --> 00:00:08,019
feel some pride seeing this definition. 
I mean, how cool is it that our 

3
00:00:08,019 --> 00:00:11,960
discipline came up with such a great 
concept of an NP-complete problem? The 

4
00:00:11,960 --> 00:00:15,630
fact that there's a universal problem 
that simultaneously encodes all 

5
00:00:15,630 --> 00:00:20,386
computational problems in which you can 
efficiently recognize a solution. 

6
00:00:20,386 --> 00:00:24,858
There remains, however, a niggling issue. 
In general, when you're confronted with a 

7
00:00:24,858 --> 00:00:29,034
mathematical definition, like the 
definition of NP-completeness that I just 

8
00:00:29,034 --> 00:00:33,164
gave you, you should demand two things. 
The first thing you should demand is an 

9
00:00:33,164 --> 00:00:37,134
explanation of why you should care. 
That is, if the definition is met, if 

10
00:00:37,134 --> 00:00:39,846
satisfied, what are the interesting 
consequences. 

11
00:00:39,846 --> 00:00:43,997
I'd like to think I've given you a very 
satisfying answer of that question for 

12
00:00:43,997 --> 00:00:47,769
the definition of NP-completeness. 
I've argued that if a problem is 

13
00:00:47,769 --> 00:00:52,056
NP-complete, then that's strong evidence 
of computational interactability. In the 

14
00:00:52,056 --> 00:00:55,955
sense that polynomial-time algorithm for 
this NP-complete problem, if one 

15
00:00:55,955 --> 00:01:00,167
hypothetically existed, that would 
automatically solve efficiently thousands 

16
00:01:00,167 --> 00:01:03,056
of fundamental computational problems, 
anything for which you can efficiently 

17
00:01:03,056 --> 00:01:09,142
recognize solution. 
But, the second thing you should demand 

18
00:01:09,142 --> 00:01:13,593
from somebody who offers you a 
mathematical definition is examples. 

19
00:01:13,593 --> 00:01:17,396
Do things that I care about actually meet 
this definition? 

20
00:01:17,396 --> 00:01:21,042
And NP-completeness, I haven't shown you 
anything. 

21
00:01:21,042 --> 00:01:25,504
And indeed, when you look at this 
interpretation, a single problem that 

22
00:01:25,504 --> 00:01:30,670
simultaneously is encoding all problems 
with efficiently recognizable solutions. 

23
00:01:30,670 --> 00:01:35,743
You might well wonder, could such objects 
ever exist? And the reason the theory of 

24
00:01:35,743 --> 00:01:40,674
NP-completeness is so powerful, and over 
the past 40 years has been exported from 

25
00:01:40,674 --> 00:01:45,538
computer science to all kinds of other 
disciplines is that, that question also 

26
00:01:45,538 --> 00:01:50,207
has an incredibly satisfying answer. 
It turns out, tons of problems are not 

27
00:01:50,207 --> 00:01:53,232
merely in NP. 
They don't merely have officially 

28
00:01:53,232 --> 00:01:57,307
recognizable solution. 
Thousands and thousands of problems are 

29
00:01:57,307 --> 00:02:00,952
in fact, NP-complete as hard as any other 
problems in NP. 

30
00:02:00,952 --> 00:02:06,795
So, both the definition of 
NP-completeness and the facts that 

31
00:02:06,795 --> 00:02:12,439
amazingly there exist NP-complete 
problems, that's due to independently 

32
00:02:12,439 --> 00:02:16,649
Steve Cook and Leonid Levin. 
So, Cook and Levin came up with their 

33
00:02:16,649 --> 00:02:21,358
largely similar theories independently. 
Cook was at that time as he is today, at 

34
00:02:21,358 --> 00:02:25,337
the University of Toronto. 
Leonid Levin was behind the Iron Curtain, 

35
00:02:25,337 --> 00:02:29,580
so he was working in the Soviet Union. 
So, it took awhile before his results 

36
00:02:29,580 --> 00:02:32,866
were known in the West. 
These days, Levin is a professor at 

37
00:02:32,866 --> 00:02:36,512
Boston University. 
Both Cook and Levin proved not just the 

38
00:02:36,512 --> 00:02:41,247
basic existence result, but also gave 
some hints that problems that people 

39
00:02:41,247 --> 00:02:43,617
really care about are also going to be 
NP-complete. 

40
00:02:43,617 --> 00:02:48,622
For example, some constraint satisfaction 
problems like 3 snaps. 

41
00:02:48,622 --> 00:02:53,645
But the vast scope of NP-completeness, 
the sheer breadth of problems that would 

42
00:02:53,645 --> 00:02:58,254
eventually be proven NP-complete, was 
first made clear in a 1972 paper by 

43
00:02:58,254 --> 00:03:01,496
Richard Karp. 
In that paper, he showed 21 different 

44
00:03:01,496 --> 00:03:05,936
problems are NP-complete, 
including the traveling salesman problem, 

45
00:03:05,936 --> 00:03:10,940
and various problems that many different 
communities had been stuck on for 

46
00:03:10,940 --> 00:03:13,245
decades. 
And now, became clear that 

47
00:03:13,245 --> 00:03:18,156
NP-completeness was the fundamental 
obstacle preventing progress in efficient 

48
00:03:18,156 --> 00:03:25,637
algorithm across many, many domains. 
So, another thing that's amazing about 

49
00:03:25,637 --> 00:03:31,305
NP-completeness, and a big reason for why 
it's been so successfully exported from, 

50
00:03:31,305 --> 00:03:36,740
first of all, theoretical computer 
science to computer science more broadly, 

51
00:03:36,740 --> 00:03:42,672
and then beyond into engineering and the 
other sciences, is that it's quite easy 

52
00:03:42,672 --> 00:03:47,132
to stand on the shoulders of these giants 
and prove that new problems, problems 

53
00:03:47,132 --> 00:03:49,329
that you care about, are also 
NP-complete. 

54
00:03:49,329 --> 00:03:52,620
So, imagine that there is some 
computational problem pi that you really 

55
00:03:52,620 --> 00:03:55,389
care about. 
This is just crucial to the project that 

56
00:03:55,389 --> 00:03:59,783
you're working on, and you're stuck. 
You've been trying for weeks to solve it, 

57
00:03:59,783 --> 00:04:01,980
throwing everything in your tool box 
added. 

58
00:04:01,980 --> 00:04:05,818
You tried greedy algorithms, divide and 
conquer, dynamic programming, 

59
00:04:05,818 --> 00:04:10,021
randomization. You've thrown every data 
structure in the book out of hash tables, 

60
00:04:10,021 --> 00:04:12,020
heap search trees. 
Nothing works. 

61
00:04:12,020 --> 00:04:14,546
You can't come up with an efficient 
algorithm. 

62
00:04:14,546 --> 00:04:18,992
At that point, you should contemplate the 
possibility that the issue is not your 

63
00:04:18,992 --> 00:04:22,932
own lack of cleverness or ingenuity. 
The issue is not having few, too few 

64
00:04:22,932 --> 00:04:26,806
tools in your programmer toolbox. 
Perhaps, the issue is intractability of 

65
00:04:26,806 --> 00:04:29,812
the computational problem that you're 
trying to solve. 

66
00:04:29,812 --> 00:04:34,001
So, when you reach this point of 
exasperation, it's time to consider 

67
00:04:34,001 --> 00:04:39,093
applying the following 2-step recipe, for 
establishing that the problem pi is 

68
00:04:39,093 --> 00:04:42,034
NP-complete. 
Of course, the problem doesn't go, go 

69
00:04:42,034 --> 00:04:46,618
away just because you prove it's 
NP-complete, but you should approach it 

70
00:04:46,618 --> 00:04:51,290
using a different algorithmic strategy. 
And we'll discuss some of the most 

71
00:04:51,290 --> 00:04:56,527
popular such strategies of approaching 
NP-complete problems in the rest of this 

72
00:04:56,527 --> 00:04:59,722
course. 
So, let me state the 2-step recipe just 

73
00:04:59,722 --> 00:05:03,697
at a, at a very high level. 
So, the first thing you need to do is 

74
00:05:03,697 --> 00:05:07,332
settle on a suitable choice of an 
NP-complete problem, pi prime. 

75
00:05:07,332 --> 00:05:11,937
The second thing you need to do is you 
need to show that pi prime reduces to the 

76
00:05:11,937 --> 00:05:16,067
problem you care about pi. 
That shows that your problem is at least 

77
00:05:16,067 --> 00:05:21,232
as hard as this NP-complete problem, 
in the sense that the NP-complete problem 

78
00:05:21,232 --> 00:05:24,309
reduces to yours. 
And therefore, your problem, assuming 

79
00:05:24,309 --> 00:05:28,268
it's an NP, is NP-complete as well. 
So clearly, the devil is in the details 

80
00:05:28,268 --> 00:05:30,799
of successfully executing this 2-step 
recipe. 

81
00:05:30,799 --> 00:05:35,130
And you might well be wondering, well, 
how on Earth do I know which NP-complete 

82
00:05:35,130 --> 00:05:38,667
problem pi prime I should use? 
And then secondly, how am I going to come 

83
00:05:38,667 --> 00:05:42,716
up with this reduction from this 
NP-complete problem pi prime, to my own 

84
00:05:42,716 --> 00:05:45,385
problem pi? 
But, don't be intimidated by either of 

85
00:05:45,385 --> 00:05:48,218
these two steps. 
With just a little bit of practice, you 

86
00:05:48,218 --> 00:05:52,045
can actually get quite good at both of 
these steps and execute this recipe 

87
00:05:52,045 --> 00:05:56,670
successfully in a lot of different cases. 
So, one thing that should make the first 

88
00:05:56,670 --> 00:06:01,475
step less intimidating is there are some 
excellent lists of NP-complete problems. 

89
00:06:01,475 --> 00:06:05,448
In particular, simple ones that tend to 
be useful in devising your own 

90
00:06:05,448 --> 00:06:08,450
reductions. 
And the canonical such list is the book 

91
00:06:08,450 --> 00:06:11,811
by Garey and Johnson called Computers and 
Intractability. 

92
00:06:11,811 --> 00:06:14,749
It's from 1979, but it's still incredibly 
useful. 

93
00:06:14,749 --> 00:06:18,922
I don't think I know of another Computer 
Science book that's more than 30 years 

94
00:06:18,922 --> 00:06:22,534
old which is as useful as this one. 
So, there still remains the question of 

95
00:06:22,534 --> 00:06:26,199
how you actually come up with one of 
these reductions from a known NP-complete 

96
00:06:26,199 --> 00:06:29,252
problem pi prime to the problem you 
actually you care about, pi. 

97
00:06:29,252 --> 00:06:31,491
But, don't be intimidated by this step 
either. 

98
00:06:31,491 --> 00:06:34,900
So, first of all, as an algorithms 
person, you should be thinking about 

99
00:06:34,900 --> 00:06:38,195
reductions all the time, anyways. 
You should be a very natural note of 

100
00:06:38,195 --> 00:06:40,878
thought for you. 
For example, when we first started 

101
00:06:40,878 --> 00:06:45,005
talking about all pair of shortest paths, 
we quickly observed that it reduces to 

102
00:06:45,005 --> 00:06:48,821
the single shortest path problem. 
So, that mentality of, of solving one 

103
00:06:48,821 --> 00:06:52,568
problem using another is equally useful 
in the design of NP-completeness 

104
00:06:52,568 --> 00:06:55,085
reductions. 
And also, there's a lot of resources 

105
00:06:55,085 --> 00:06:58,093
available to gain facility with 
NP-completeness reductions. 

106
00:06:58,093 --> 00:07:00,425
You can look into various algorithms 
textbooks. 

107
00:07:00,425 --> 00:07:04,203
Generally, they have lots of examples. 
The Garey & Johnson book is a good one. 

108
00:07:04,203 --> 00:07:07,675
There are some online courses that study 
NP-completeness in more depth. 

109
00:07:07,675 --> 00:07:11,536
Those resources will give you a lot of 
examples of NP-completeness reductions. 

110
00:07:11,536 --> 00:07:15,445
It'll offer some tips on how to come up 
with them yourself and, most importantly, 

111
00:07:15,445 --> 00:07:19,129
practice will make perfect. 
So, I strongly encourage you to take 

112
00:07:19,129 --> 00:07:23,034
advantage of those resources. 
I think you'll be pleased to have 

113
00:07:23,034 --> 00:07:27,690
NP-completeness as part of your toolbox. 
Certainly, nobody wins if you spend weeks 

114
00:07:27,690 --> 00:07:31,542
or months of your life in inadvertently 
trying to prove that P=NP.