1
00:00:00,000 --> 00:00:08,137
. Today, we're going to talk about
intractability in our last lecture. this

2
00:00:08,137 --> 00:00:13,203
is an appropriate topic to close the
course on because it really lays out for

3
00:00:13,203 --> 00:00:18,530
us some of the big challenges of the
future. most people are taking this course

4
00:00:18,530 --> 00:00:23,077
because they're going to want to be
involved with algorithms, algorithms in

5
00:00:23,077 --> 00:00:30,857
the future. and this topic really helps us
lay out what's ahead. As an introduction,

6
00:00:30,857 --> 00:00:36,762
we'll review some very, very basic
questions about computation. Some of you

7
00:00:36,762 --> 00:00:43,127
may be familiar with this subject area
others may not be at all. So, it will just

8
00:00:43,127 --> 00:00:49,569
be a brief review in either case. But some
of the concepts are relatively easy to

9
00:00:49,569 --> 00:00:55,628
understand and so, and they lay a good
background for starting with computation

10
00:00:55,858 --> 00:01:01,150
with intractability. We need to think
about computation itself first. There are

11
00:01:01,760 --> 00:01:07,279
these kinds of questions that came up
actually about the same time or even

12
00:01:07,279 --> 00:01:13,288
before computers actually were invented or
came into use. Now, what is a general

13
00:01:13,288 --> 00:01:19,222
purpose computer? What does that mean? are
there limits on the power of digital

14
00:01:19,222 --> 00:01:25,231
computers? That's another interesting
question. Are there limits on the power of

15
00:01:25,231 --> 00:01:31,244
any machine that we can build? this kind
of question with the outgrowth of similar

16
00:01:31,629 --> 00:01:38,096
questions on mathematics that was posed by
David Hilbert in the beginning of the

17
00:01:38,096 --> 00:01:44,023
twentieth century. and there were some
very profound thinking going around

18
00:01:44,023 --> 00:01:50,951
Hilbert's questions, and later, around the
questions on computers that initially were

19
00:01:50,951 --> 00:01:57,264
related, but then took on a new life of
their own. To take a look at the idea of

20
00:01:57,495 --> 00:02:02,895
what these mathematicians were thinking
about in the middle of the twentieth

21
00:02:02,903 --> 00:02:08,555
century that we call the simple model of
computation that we looked at when we were

22
00:02:08,555 --> 00:02:15,758
working with the KnuthMorrisPratt
algorithm. so what we did was we imagined

23
00:02:15,758 --> 00:02:22,134
a, a machine, an abstract machine called a
discrete finite automaton or an FSA, a

24
00:02:22,134 --> 00:02:27,421
finite state automaton. that is a very
simple machine where we imagine that it's

25
00:02:27,421 --> 00:02:32,771
got a tape that has the input. that's just
a long strip that's divided into cells.

26
00:02:32,771 --> 00:02:37,676
It's got a finite alph, alphabet of
symbols and it had a tape head that could

27
00:02:37,676 --> 00:02:43,090
read one symbol from one cell of the, of
the tape. And then, based on what it saw,

28
00:02:43,090 --> 00:02:48,042
it would move to a different state. It
would move one cell at a time, read up all

29
00:02:48,042 --> 00:02:52,881
the tape, all the input, and if it got to
the right state, we'd say it'd accept it.

30
00:02:52,881 --> 00:02:58,325
That's what we did with KnuthMorrisPratt
was we imagined this machine and then we

31
00:02:58,325 --> 00:03:03,044
built a program that now constructed a
table which just said what to do, which

32
00:03:03,044 --> 00:03:08,428
state to go to for each symbol on the each
possible symbol on the tape for each

33
00:03:08,428 --> 00:03:14,234
state. and then that was a way that we got
this problem solved. in, a natural

34
00:03:14,234 --> 00:03:19,612
question arises when mathematicians and
theoretical computer scientists first

35
00:03:19,612 --> 00:03:25,057
started taking a look at abstract models
of computation like this is, can you make

36
00:03:25,057 --> 00:03:30,170
them more powerful, is there a more
powerful model of computation than DFA?

37
00:03:30,170 --> 00:03:35,102
Well, it didn't, it didn't take long to
realize, of course the, there's a more

38
00:03:35,102 --> 00:03:40,824
powerful model. And actually, the progress
was almost incremental. You can think of

39
00:03:40,824 --> 00:03:46,677
it as being almost incremental where you
just add a little bit of power to the DFA

40
00:03:46,874 --> 00:03:52,930
that we can show, that makes that more
powerful but, but what's the limit? and

41
00:03:52,930 --> 00:04:01,379
actually the, the limit was proposed very
early on by Turing in here at Princeton in

42
00:04:01,379 --> 00:04:09,208
the 1930s. it's a universal model of
computation where just like the DFA, we

43
00:04:09,208 --> 00:04:16,506
imagine a machine that has a finite set of
states. but instead of just reading the

44
00:04:16,506 --> 00:04:23,750
tape, it can write on the tape as well and
it can move in both directions. so, it, it

45
00:04:23,750 --> 00:04:30,039
can store intermediate results on the
tape. but, other than that, it's very

46
00:04:30,039 --> 00:04:36,647
similar to what we imagined for the FSA.
And now the question is, is there a more

47
00:04:36,647 --> 00:04:42,742
powerful model of computation than this
one? and the remarkable fact is that

48
00:04:42,742 --> 00:04:48,074
Turing answered this question in the
1930s. Actually, he did the work just

49
00:04:48,074 --> 00:04:54,147
before he got here to Princeton, and then
finished the paper. He said, no, there's

50
00:04:54,147 --> 00:04:59,924
no more powerful model of computation. In
fact, many of hailed this as the most

51
00:04:59,924 --> 00:05:05,892
important scientific result of the
twentieth century. because it, it tells us

52
00:05:06,127 --> 00:05:12,412
really wha t we need to know about the
power of the machines that we build. so

53
00:05:12,412 --> 00:05:19,403
with Church, who was Turing's adviser here
at Princeton, the we have the thesis that

54
00:05:19,638 --> 00:05:26,079
there's no more powerful machine. Turing
machines can compute any function that can

55
00:05:26,079 --> 00:05:32,216
be computed by a physically harnessable
process of the natural world. Now, this is

56
00:05:32,216 --> 00:05:37,050
not a mathematical theorem because it's a
statement about the natural world. It's,

57
00:05:37,050 --> 00:05:43,225
it's not something that you can prove from
basic principles. but it is something that

58
00:05:43,225 --> 00:05:48,964
you can falsify, you can run an experiment
that shows it not to be true. And the

59
00:05:48,964 --> 00:05:54,195
whole idea behind this thesis and, and
behind Turing's proof about abstract

60
00:05:54,195 --> 00:05:59,644
models of computation is that and we've
seen it many, many times. So, we can use

61
00:05:59,644 --> 00:06:05,383
simulation to prove model equivalence. So,
we wrote a Java program to simulate a

62
00:06:05,383 --> 00:06:11,050
discrete finite automaton, and that means
that Java's at least as powerful as that.

63
00:06:11,303 --> 00:06:18,249
you can write people have shown you can
write Turing machines that simulate the

64
00:06:18,249 --> 00:06:23,669
operation of, of store program computers,
like the one's that we use. Or if you use

65
00:06:23,669 --> 00:06:28,716
an iPhone and you're running an app that's
written for the Android or vice versa,

66
00:06:28,716 --> 00:06:33,825
that simulation that proves models that
are equivalent, demonstrates that models

67
00:06:33,825 --> 00:06:38,062
are equivalent. And that's been an
extremely effective way for us to

68
00:06:38,062 --> 00:06:42,299
understand that different computers have
similar capabilities. That, you know,

69
00:06:42,860 --> 00:06:48,115
you're not going to compute more by build,
building some new machine. so these had

70
00:06:48,115 --> 00:06:54,521
profound implications. And, it's, it's
really amazing that Turing came up with

71
00:06:54,521 --> 00:07:01,229
this so long ago really before computers
came into use at all. and it means that we

72
00:07:01,229 --> 00:07:07,032
don't have to try to invent a more
powerful machine, or, or language and it

73
00:07:07,032 --> 00:07:13,513
really enables rigorous study of the
computation that we can expect to do in

74
00:07:13,513 --> 00:07:20,392
the natural world. so that's the basis for
the theory of computation. And it's, it's

75
00:07:20,392 --> 00:07:26,958
related to the study of algorithms. but,
because it gives us a simple and universal

76
00:07:26,958 --> 00:07:33,247
model of computation. But we haven't
talked about efficiency. how long does it

77
00:07:33,247 --> 00:07:40,623
take to simulate an Android to s imulate
an iPhone and so forth? So, what about

78
00:07:40,903 --> 00:07:46,844
cost? what about time? That's been our
focus when developing algorithms. and what

79
00:07:46,844 --> 00:07:53,743
we want to think about next is how to deal
with that cost. but first, I just want to

80
00:07:53,743 --> 00:07:58,944
lay out the evidence that we have for the
Church-Turing thesis. That is, it's been

81
00:07:58,944 --> 00:08:04,205
eight, eight decades and nobody's found a
machine that can compete more than we've

82
00:08:04,205 --> 00:08:09,107
been computing with the machines that we
have. And the other thing that was really

83
00:08:09,107 --> 00:08:14,369
convincing to the mathematicians in the
mid-20th century is that people tried all

84
00:08:14,369 --> 00:08:20,660
different types of models of computation.
Church was working on something called the

85
00:08:20,660 --> 00:08:26,351
un-typed lambda calculus, which actually
is the basis for modern programming

86
00:08:26,351 --> 00:08:31,294
languages. And there are other
mathematicians working on recursive

87
00:08:31,294 --> 00:08:37,136
function theory and grammars and other
things. And eventually, essentially, they

88
00:08:37,136 --> 00:08:43,202
were all able to use these languages to
express the formalism of another one, and

89
00:08:43,202 --> 00:08:49,128
then therefore, prove them to be all
equivalent. And that comes in through to

90
00:08:49,128 --> 00:08:54,772
the modern day with all programming
languages. You can write a simulator or a

91
00:08:54,772 --> 00:09:00,910
compiler for one programming language and
another, or a different random access

92
00:09:00,910 --> 00:09:06,554
machines, simple machines, like cellular
automaton. and even computing using

93
00:09:06,554 --> 00:09:12,621
biological operations on DNA. or quantum
computers that you may have heard about.

94
00:09:12,833 --> 00:09:18,547
all of these things have turned out to be
equivalent. So, there's a, a lot of

95
00:09:18,759 --> 00:09:26,543
confidence that the Church-Turing thesis
holds. but what about efficiency? So, this

96
00:09:26,543 --> 00:09:32,455
is a similar starting point for the study
of algorithms. What algorithms are useful

97
00:09:32,455 --> 00:09:38,560
in practice? so in order to know what the,
what useful we're going to consider time

98
00:09:38,560 --> 00:09:44,669
and we're going to say that it's an
algorithm that it's not just that it's in

99
00:09:44,669 --> 00:09:50,995
theory, would, would complete the problem
is that it actually would solve the

100
00:09:50,995 --> 00:09:57,570
problem in the time that we have
available. and just to set a dividing

101
00:09:57,570 --> 00:10:04,488
line, it a, a, a clear bright line it, it,
we're going to say is that we'll consider

102
00:10:04,488 --> 00:10:10,001
it to be useful in practice or efficient.
If it's running time is guaran teed to be

103
00:10:10,001 --> 00:10:15,716
bounded by some polynomial in the size of
the input for all inputs. and for the

104
00:10:15,716 --> 00:10:21,566
purpose of this theory, we don't even care
of what the exponent is, could be n to the

105
00:10:21,566 --> 00:10:27,348
100th. now you might argue that's not a
very useful dividing line but it is a

106
00:10:27,348 --> 00:10:32,794
useful starting point. and as you'll see,
even the starting point has told us a

107
00:10:32,794 --> 00:10:41,068
great deal about the kinds of algorithms
useful in practice. and, this was, this

108
00:10:41,068 --> 00:10:49,356
work was started in the 50s.' also, here
at Princeton. and many, many people looked

109
00:10:49,356 --> 00:10:56,235
at models for trying to understand what,
what this meant. the dividing line is

110
00:10:56,235 --> 00:11:03,113
between polynomial as, as I've said, and
exponential. Exponential is a two the, the

111
00:11:03,113 --> 00:11:09,909
n-th power of some constant of greater
than one to the n-th power. And we'll talk

112
00:11:09,909 --> 00:11:18,960
about that in just a second. so sorting is
useful. It, so anytime it's bounded by a

113
00:11:18,960 --> 00:11:26,882
polynomial or, but here's another one
which is finding the best traveling

114
00:11:26,882 --> 00:11:33,307
salesperson tour on endpoints. if you're
going to try all possible tours that

115
00:11:33,307 --> 00:11:38,574
algorithm is not useful in practice
because n factorial is not bounded by a

116
00:11:38,574 --> 00:11:44,257
polynomial in n. Its running time actually
grows like n to the n-th. so, this is a

117
00:11:44,257 --> 00:11:49,940
useful starting point for our theory
because it's very broad and robust. and if

118
00:11:49,940 --> 00:11:55,554
we can put some algorithms in the class
with sorting and other algorithms in the

119
00:11:55,554 --> 00:12:01,010
class with traveling salesperson using
brute, brute force search, that's going to

120
00:12:01,010 --> 00:12:05,738
be useful. It'll tell us which algorithms
we want to use. We definitely can use the,

121
00:12:05,738 --> 00:12:11,240
think about using the first but we can't
think about using the second. and, and

122
00:12:11,456 --> 00:12:17,216
actually, unlike turing machines, where we
wouldn't actually kinda play turing

123
00:12:17,216 --> 00:12:23,265
machine to solve a practical problem. in
this theory, it's often the case that the

124
00:12:23,265 --> 00:12:29,169
polynomial time algorithms will scale to
help us solve large problems because the

125
00:12:29,169 --> 00:12:35,120
constants usually tend to be small. but
the, the key is the idea of exponential

126
00:12:35,120 --> 00:12:41,070
growth and I just want to be sure to
convince everybody that it's not about,

127
00:12:41,070 --> 00:12:46,390
when it comes to exponential growth, it's
not about the technology. That is a

128
00:12:46,390 --> 00:12:51,920
dividing line th at no matter how fast
your computer is, you can't be running an

129
00:12:51,920 --> 00:12:57,590
exponential algorithm and expect to be
solving a large problem, in just a thought

130
00:12:57,590 --> 00:13:03,050
experiment to reconvince everybody of
that. So let's suppose that, imagine that

131
00:13:03,050 --> 00:13:09,254
we had a huge, giant, parallel computing
devices. This thing is so big that it's

132
00:13:09,254 --> 00:13:15,605
got as many processes, processors as there
are electrons in the universe. So, we have

133
00:13:15,605 --> 00:13:21,650
a supercomputer on every electron and
we'll put the most powerful super computer

134
00:13:21,650 --> 00:13:27,474
we can imagine today on every electron in
the universe. and then, we'll also run

135
00:13:27,474 --> 00:13:33,371
each processor for the lifetime, estimated
lifetime of the universe. so, how big a

136
00:13:33,371 --> 00:13:39,400
problem can we solve? Well, it's estimated
there's ten to the 79th electrons in the

137
00:13:39,400 --> 00:13:43,994
universe and supercomputers. Nowadays,
maybe can do ten to the thirteenth

138
00:13:44,003 --> 00:13:48,851
instructions per second. and you can
quibble about these numbers and make it

139
00:13:48,851 --> 00:13:53,638
ten to the twentieth if you want. it's
estimated the age of the universe in

140
00:13:53,638 --> 00:13:58,363
seconds is ten to the seventeenth.
And if you don't like that, make it ten to

141
00:13:58,363 --> 00:14:01,984
the thirtieth.
You multiply all those numbers together

142
00:14:02,169 --> 00:14:07,398
you're going to get a, a relatively small
number to say a thousand factorial. ten to

143
00:14:07,398 --> 00:14:13,044
the seventeenth times ten to the
thirteenth times ten to the 79th is only

144
00:14:19,230 --> 00:14:21,868
ten to the 109th or something.
Whereas a thousand factorial is way bigger

145
00:14:21,868 --> 00:14:24,271
than ten to the 1000th. It's more like
1000 to the 1000th, alright? So, there's

146
00:14:24,271 --> 00:14:29,138
no way that you can imagine solving a
thousand city TSP problems using that

147
00:14:29,138 --> 00:14:34,084
brute force algorithm. You can let your
computer run but it's not going to finish.

148
00:14:34,254 --> 00:14:38,745
technology is out of the picture when we
have exponential growth. So, that's why

149
00:14:38,745 --> 00:14:43,236
it's so important to know algorithms are
going to require exponential time and

150
00:14:43,236 --> 00:14:49,272
which ones aren't. in this theory it would
seem. let's at least get that

151
00:14:49,272 --> 00:14:54,867
classification done. so which problems can
we solve in practice? We're just going to

152
00:14:55,286 --> 00:15:01,287
say that the polynomial time algorithms
are the ones that we can solve in practice

153
00:15:01,566 --> 00:15:06,940
that guaranteed polynomial time
performance. so that brings us right to

154
00:15:06,940 --> 00:15:11,615
the question, which problems have
polynomial time algorithms? and

155
00:15:11,824 --> 00:15:18,314
unfortunately, it's not so easy to know
that. get stuck right away on trying to do

156
00:15:18,314 --> 00:15:24,662
that classification, and that's what
today's lecture is all about. so, this is

157
00:15:24,662 --> 00:15:30,495
a similar bird's eye view to when we
introduced the classifying algorithms when

158
00:15:30,495 --> 00:15:36,037
we're talking about reduction. And
reduction's a very important tool in this

159
00:15:36,037 --> 00:15:41,652
theory as well. So, what we're going to
say is a problem is intractable if you

160
00:15:41,652 --> 00:15:47,631
can't guarantee that you can solve it in
polynomial time. so what we want to do is

161
00:15:47,631 --> 00:15:53,433
prove that the problem's intractable.
there have been a couple of problems that

162
00:15:53,433 --> 00:15:59,019
have been proven to require exponential
time. but they are a bit, artificial.

163
00:15:59,019 --> 00:16:04,467
well, so here's one. Given a constant size
program, does it halt, and in most case

164
00:16:04,467 --> 00:16:10,261
that's or here's another one. Given an N
by N checkers board, can the first player

165
00:16:10,261 --> 00:16:15,640
force a win if you have the force capture
rule? Well, N by N checkers board for

166
00:16:15,640 --> 00:16:21,019
large N is again, that's a, certainly a
toward problem. But for the many real

167
00:16:21,019 --> 00:16:26,096
problems that we actually would write
this, like to solve, unfortunately,

168
00:16:26,096 --> 00:16:32,008
there's been very few successes in proving
that a problem is intractable. So even

169
00:16:32,008 --> 00:16:37,580
though it would seem that you would be
able to be able to separate problems that

170
00:16:37,580 --> 00:16:42,676
can guarantee problems polynomial times
solutions from problems that require

171
00:16:42,676 --> 00:16:48,180
exponential time for some input, there's
been very little success in proving that.

172
00:16:48,378 --> 00:16:53,400
but that's an introduction in a motivation
for the theory of intractability.