1
00:00:00,000 --> 00:00:05,761
Your about to embark on the study of the
branch of computer science theory know as

2
00:00:05,761 --> 00:00:11,176
automata theory or language theory. We're
going to follow fairly closely a one

3
00:00:11,176 --> 00:00:16,730
quarter course I gave at Stanford a few
years ago. It's traditional number CS154.

4
00:00:16,730 --> 00:00:21,567
This theory plays an important role in
computer science. I'm aware that many

5
00:00:21,567 --> 00:00:26,894
students do see the importance to the
mathematical approach to CS, the feeling

6
00:00:26,894 --> 00:00:31,290
is just let me near a keyboard and let me
code, It's quite common. In this

7
00:00:31,290 --> 00:00:35,587
introductory video I'll try to give some
reasons why you should learn the material

8
00:00:35,587 --> 00:00:40,284
contained in the course. A number of years
ago Stanford took a survey of its

9
00:00:40,284 --> 00:00:45,267
graduates five years after they got their
undergraduate degrees. What they wanted to

10
00:00:45,267 --> 00:00:50,132
find out was whether Stanford was teaching
stuff that they actually used in their

11
00:00:50,132 --> 00:00:54,225
jobs. So computer science graduates
naturally sided out in conducting

12
00:00:54,225 --> 00:00:59,068
programming courses as the one they used
the most, No surprise here. Next came our

13
00:00:59,068 --> 00:01:04,032
sophomore level course, covering basic
data structures and algorithms and system

14
00:01:04,032 --> 00:01:09,058
software, Again no surprise. But after the
required courses, we found the CS154, the

15
00:01:09,058 --> 00:01:13,649
automata course on which this set of
videos is based was second after the

16
00:01:13,649 --> 00:01:18,675
database course. It was cited by three
times as many as cited the introductory AI

17
00:01:18,675 --> 00:01:23,390
course, for example, So I want, in the
next few minutes, to try to explain what

18
00:01:23,390 --> 00:01:28,478
it was about automata theory that impacted
what our former students were doing in

19
00:01:28,478 --> 00:01:33,285
their professional lives. So, let's see
some of the ideas we are going to learn

20
00:01:33,285 --> 00:01:38,893
about in the course and how they appear in
practice. One very commonly used idea is

21
00:01:38,893 --> 00:01:43,955
the regular expression, A simple notation
for describing many of the patterns that

22
00:01:43,955 --> 00:01:48,956
arise naturally in practice. Many pieces
of software that you might find yourself

23
00:01:48,956 --> 00:01:53,771
working on in the future need a simple
input language to describe patterns. So

24
00:01:53,771 --> 00:02:00,271
you might well find yourself implementing
some form of regular expression. For

25
00:02:00,271 --> 00:02:05,563
example, many UNIX text processing
commands use a variety of regular

26
00:02:05,563 --> 00:02:11,945
expressions. This expression describes a
line of text that has a letter A followed

27
00:02:11,945 --> 00:02:17,837
by any number of characters followed by
the letter B. For a more modern example,

28
00:02:17,837 --> 00:02:23,684
the XML document mark-up language invites
us to describe the structure of documents,

29
00:02:23,684 --> 00:02:29,259
by a DTD, or document type definition. The
DTD language consists of description

30
00:02:29,259 --> 00:02:34,630
development, such as the example given
here. A person element consists of a name

31
00:02:34,630 --> 00:02:40,069
element, followed by an address element,
followed by any number of child elements.

32
00:02:40,069 --> 00:02:45,372
Finite automata are another topic we'll
see early on, They are in fact the way

33
00:02:45,372 --> 00:02:50,618
regular expression based languages are
implemented. They also have been used for

34
00:02:50,618 --> 00:02:55,217
decades to model electronic circuits, and
in particular to help design good

35
00:02:55,217 --> 00:02:59,694
circuits. They have also been used to
model protocols, and we'll give some

36
00:02:59,694 --> 00:03:04,232
examples later in this course. Especially
[inaudible] underlie the body of

37
00:03:04,232 --> 00:03:09,383
techniques known as model checking, Which
has been used to verify the correctness of

38
00:03:09,383 --> 00:03:14,105
both communication protocols and large
electronic circuits. Another important

39
00:03:14,105 --> 00:03:18,870
aspect of the course is the context free
grammar. These are used to put a tree

40
00:03:18,870 --> 00:03:23,241
structure on strings, typically text
strings, according to some recursive

41
00:03:23,241 --> 00:03:34,832
rules. They are an essential for
describing the syntax of programming

42
00:03:34,839 --> 00:03:35,961
languages and are a part of every
compiler. They also play an important role

43
00:03:35,961 --> 00:03:37,332
in describing the syntax of natural
languages and are used in software that

44
00:03:37,332 --> 00:03:42,900
does machine translation and other natural
language processing. And the DTD as we

45
00:03:42,900 --> 00:03:47,120
mentioned in connection with regular
expressions are really context free

46
00:03:47,120 --> 00:03:52,557
grammars. It is the single rules of the
DTD that look like regular expressions. .

47
00:03:52,564 --> 00:03:56,741
The topics we just mentioned are
essentially tools for doing simple but

48
00:03:56,741 --> 00:04:01,092
important things but there's a second
broad theme in this course. There are

49
00:04:01,092 --> 00:04:05,791
fundamental limitations on our ability to
compute. A computer scientist should be

50
00:04:05,791 --> 00:04:10,026
aware of these limitations because only
then can you avoid spending time

51
00:04:10,026 --> 00:04:15,131
attempting something that is impossible.
Mm-hm. One limitation is undecidability

52
00:04:15,131 --> 00:04:19,657
there are problems that cannot be solved
by computation. For example you might

53
00:04:19,657 --> 00:04:25,120
imagine you could, could right a compiler
that would refuse To compile programs that

54
00:04:25,120 --> 00:04:31,199
printed out dirty words. Even assuming you
had a precise definition of what words

55
00:04:31,199 --> 00:04:36,048
were dirty. You can't do this. We're going
to prove that there is no way to tell

56
00:04:36,048 --> 00:04:40,802
whether a program will ever print a
particular word Or even if it will ever

57
00:04:40,802 --> 00:04:46,242
print anything at all. And we also need to
know about the classic problems called

58
00:04:46,242 --> 00:04:51,079
intractable. These are colloquially
problems that we can solve but whose

59
00:04:51,079 --> 00:04:56,654
solution takes time that is exponential in
the input size. These problems generally

60
00:04:56,654 --> 00:05:01,397
need To be finessed in some way such as by
approximating the solution. The reality of

61
00:05:01,397 --> 00:05:05,914
the theory of intractability is quite a
bit different from the colloquial version.

62
00:05:05,914 --> 00:05:10,376
But while the undecidable problems have
been proven not to have any solution, for

63
00:05:10,376 --> 00:05:14,508
the attractable problems we have very
strong evidence that they require an

64
00:05:14,508 --> 00:05:18,860
exponential time but no proof. We'll
explain all this when we get to the theory

65
00:05:18,860 --> 00:05:23,181
of NP completeness is the culmination of
this course. So another thing you will

66
00:05:23,181 --> 00:05:27,763
take away fomr this course is the ability
to navigate the space of problems that you

67
00:05:27,763 --> 00:05:31,968
might encounter in a life of creative
software construction. You will learn to

68
00:05:31,968 --> 00:05:36,389
learn how to determine that a problem is
undecidable, and how to determine that it

69
00:05:36,389 --> 00:05:40,701
is intractable. That lets you avoid the
problem altogether in the first case, and

70
00:05:40,701 --> 00:05:45,800
to modify your approach in the second
case. There's several less concrete

71
00:05:45,800 --> 00:05:51,156
benefits to this course. First you will
improve your skills at proving fact,

72
00:05:51,156 --> 00:05:56,231
especially inductive proofs of which we'll
do several in great detail. Now I'm not

73
00:05:56,231 --> 00:06:01,059
one of the people who thinks formal proofs
of programs will ever be a serious

74
00:06:01,059 --> 00:06:06,196
software methodology, but as you construct
code you should have a sense of why what

75
00:06:06,196 --> 00:06:11,531
you're doing works, the way it is suppose
to. Often the trickiest parts of a program

76
00:06:11,531 --> 00:06:15,502
deal with trees, graphs, or other
recursive structures. Understanding

77
00:06:15,502 --> 00:06:19,770
inductive proofs let you at least
formulate a reason why you think your

78
00:06:19,770 --> 00:06:24,452
method works, even if you don't try to dot
the I's in a formal proof. We're also

79
00:06:24,452 --> 00:06:29,194
going to learn about a number of important
abstractions: finite automata, regular

80
00:06:29,194 --> 00:06:33,936
expressions, context-free grammars, and
varieties of pushdown automata and Turing

81
00:06:33,936 --> 00:06:39,479
machines. Some of the essential parts of
this course are proving equivalences among

82
00:06:39,479 --> 00:06:44,962
the models, that is, any example of one
model can be simulated by some instance of

83
00:06:44,962 --> 00:06:49,967
another model. The process of simulation
across the models is essentially the same

84
00:06:49,967 --> 00:06:54,680
as the modern approach to programming and
layered abstractions where you write

85
00:06:54,680 --> 00:06:58,558
programs at one layer, using the
primitives of the layer below. At

86
00:06:58,558 --> 00:07:03,137
Stanford, I found that a number of people
taking the automata course were not

87
00:07:03,137 --> 00:07:07,776
computer scientists at all, but were
mathematicians [inaudible] by inclination.

88
00:07:07,776 --> 00:07:12,653
That's cool. Welcome to those of you out
there. I probably won't do things quite as

89
00:07:12,653 --> 00:07:16,875
formally as you would like, but more
formally then the typical computer

90
00:07:16,875 --> 00:07:21,573
scientist likes. However, be warned that
some in the past have found the subject

91
00:07:21,573 --> 00:07:26,391
sufficiently interesting, that they saw
the light, and made major contributions to

92
00:07:26,391 --> 00:07:31,030
computer software. A case in point is Ken
Thompson, the fellow who gave us Unix.

93
00:07:31,030 --> 00:07:35,436
Before doing Unix, Ken developed an
interest in regular expressions and saw

94
00:07:35,436 --> 00:07:40,195
that they were an important part of the
text, text editor. He worked out efficient

95
00:07:40,195 --> 00:07:45,012
ways to compile regular expressions into
programs that could process text. And his

96
00:07:45,012 --> 00:07:49,956
algorithms are an important part of what
we teach about the subject today. It

97
00:07:49,956 --> 00:07:54,862
should be no surprise that regular
expressions form an integral part of so

98
00:07:54,862 --> 00:07:59,769
many Unix commands, and that these
commands give more flexibility and power

99
00:07:59,769 --> 00:08:04,872
to Unix users than did those of earlier
operating systems. Another interesting

100
00:08:04,872 --> 00:08:09,452
case is Jim Gray. Jim, before his
mysterious disappearance, gave us many

101
00:08:09,452 --> 00:08:14,031
important ideas in the database field,
including two phase locking for

102
00:08:14,031 --> 00:08:19,172
concurrency control for which he won the
Touring Award. But I knew Jim when he was

103
00:08:19,172 --> 00:08:23,857
a student at Berkeley and I was on leave
there for a brief period. He did a thesis

104
00:08:23,857 --> 00:08:28,484
on two way pushed on automata. We're not
going to talk about them. They turned out

105
00:08:28,484 --> 00:08:33,055
to have been back water of the theory and
I am sure Jim would agree but they're

106
00:08:33,055 --> 00:08:37,854
related to ordinary pushed down automata
which we will talk about in connection to

107
00:08:37,854 --> 00:08:42,966
context regrammars. What is interesting is
that Greg told me quite a bit later that

108
00:08:42,966 --> 00:08:48,000
he decided to become a computer scientist
because automata theory intrigued him.

109
00:08:48,000 --> 00:08:53,096
Only later did he switch into database
systems, and I believe that the experience

110
00:08:53,096 --> 00:08:58,067
he got the aloof fumbling with automata
make him more capable as a designer of

111
00:08:58,067 --> 00:09:03,289
several very innovative systems. So here's
a summary of what will be covered in the

112
00:09:03,289 --> 00:09:08,260
course. We'll start off with what I called
regular languages. A language is just a

113
00:09:08,260 --> 00:09:12,913
set of strings, for example the set of
character strings that are valid java

114
00:09:12,913 --> 00:09:17,627
programs. The regular languages are
exactly those sets of strings that can be

115
00:09:17,627 --> 00:09:22,427
des, described by finite [inaudible] or
regular expressions. This discussion will

116
00:09:22,427 --> 00:09:27,005
also introduce the important concept of
nondetermism. Machines that can magically

117
00:09:27,005 --> 00:09:31,075
turn into many machines that each do
something independently but with a

118
00:09:31,075 --> 00:09:35,258
coordinated effect. This model of
computation is fanciful, to say the least,

119
00:09:35,258 --> 00:09:39,893
but we'll see it plays a really important
role in several places, including design

120
00:09:39,893 --> 00:09:44,495
of algorithms and in understanding the
theory of intractable problems. We're then

121
00:09:44,495 --> 00:09:49,308
going to turn to properties of the regular
languages. These properties include the

122
00:09:49,308 --> 00:09:54,121
ability to answer certain questions about
finite automata and regular expressions

123
00:09:54,121 --> 00:09:58,582
that we cannot decide about programs in
general. An example would be to tell

124
00:09:58,582 --> 00:10:03,160
whether a device makes an output in
response to even one input. You can't tell

125
00:10:03,160 --> 00:10:07,973
for programs in general, but you can for
finite automata. It is distractibility Our

126
00:10:07,973 --> 00:10:12,610
ability to understand very simple formula
sums, like finite automata or regular

127
00:10:12,610 --> 00:10:17,694
expressions do, that make them so very
valuable when they can be used. We're also

128
00:10:17,694 --> 00:10:22,630
gonna talk closure properties of regular
languages for example the union or

129
00:10:22,630 --> 00:10:27,696
intersection of two regular languages is
also a regular language. The next big

130
00:10:27,696 --> 00:10:33,022
topic will be context free languages. This
is a somewhat larger class of languages

131
00:10:33,022 --> 00:10:38,023
that the regular languages. And they
enable us to do things that you can't do

132
00:10:38,023 --> 00:10:42,821
with regular languages such as mash
balance parentheses or XML tags. We'll

133
00:10:42,821 --> 00:10:47,784
talk about two ways to describe such
languages. First by context-free grammars,

134
00:10:47,784 --> 00:10:52,810
A recursive system, system for generating
strings. And then by pushdown automata,

135
00:10:52,810 --> 00:10:57,469
Which are, are a generalization of the
finite automata that we'll use for regular

136
00:10:57,469 --> 00:11:01,667
expressions. We'll then repeat our
examination of decision properties and

137
00:11:01,667 --> 00:11:06,269
closure properties for this larger class
of languages. Many of the things we can

138
00:11:06,269 --> 00:11:10,698
tell about regular languages that we
cannot tell in general, we can also tell

139
00:11:10,698 --> 00:11:14,897
about context free languages. But there
are unfortunately some exceptions.

140
00:11:14,897 --> 00:11:19,441
Similarly, many of the operations under
which regular languages are closed, also

141
00:11:19,441 --> 00:11:24,294
yield a context free language when applied
to context free languages. But again we

142
00:11:24,294 --> 00:11:29,870
lose some operations for example context
free languages that close in reunion but

143
00:11:29,870 --> 00:11:35,038
not intersection. Next we take up the
largest class of languages that we can

144
00:11:35,038 --> 00:11:40,070
reasonable say can be dealt with by a
computer and recursively enumerable

145
00:11:40,070 --> 00:11:44,452
languages. We also look at the smaller but
important subset of the recursively

146
00:11:44,452 --> 00:11:48,808
enumerable languages, called the recursive
languages. These are the languages for

147
00:11:48,808 --> 00:11:52,619
which there is an algorithm to tell
whether or not a string is in the

148
00:11:52,619 --> 00:11:57,610
language. We introduce the Turing machine,
an automaton in the spirit of the first

149
00:11:57,610 --> 00:12:02,422
kinds of automata we meet, finite and
pushdown automata. The Turing machine is,

150
00:12:02,422 --> 00:12:07,485
however, much more powerful than either of
these. In fact, it is as powerful as any

151
00:12:07,485 --> 00:12:12,422
model that has ever been thought of to
answer the question, what can we compute?

152
00:12:12,422 --> 00:12:17,797
The payoff of the study of Turing machines
is that we can answer the question of what

153
00:12:17,797 --> 00:12:22,985
can be decided by computation, or what can
be computed by any means at all. We shall

154
00:12:22,985 --> 00:12:27,610
develop tools for showing that certain
problems cannot be decided. That is

155
00:12:27,610 --> 00:12:32,547
answered by computer. The example we
already mentioned is a program print dirty

156
00:12:32,547 --> 00:12:37,547
word should give you an idea of what we
can achieve. Finally regard to cover the

157
00:12:37,547 --> 00:12:42,172
theory of NP completeness, the critical
question is what can we do with an

158
00:12:42,172 --> 00:12:47,110
algorithm that runs in time that is some
polynomial and a link of the input. It

159
00:12:47,110 --> 00:12:52,110
would be lovely if we could distinguish
between problems as best algorithm took

160
00:12:52,110 --> 00:12:57,235
say in cubic time from those that can be
solved say in squared time. But no theory

161
00:12:57,235 --> 00:13:02,172
has ever been devised to be that precise
for general problems. Although the best

162
00:13:02,172 --> 00:13:07,012
running time for some problems is null,
fortunately we can learn a lot by dividing

163
00:13:07,012 --> 00:13:11,616
problems that can be solved in some
polynomial amount of time from those that

164
00:13:11,616 --> 00:13:16,516
apparently require exponential time. We'll
give you the tools to do this, and Turing

165
00:13:16,516 --> 00:13:21,356
machines are the tool you need to build
this theory as well as the theory of what

166
00:13:21,356 --> 00:13:25,901
can be computed at all. This course
follows the third edition of the textbook

167
00:13:25,901 --> 00:13:30,564
I wrote with John Hopcroft and Rajeev
Motwani. You do not need to buy this book.

168
00:13:30,564 --> 00:13:35,390
The homeworks and exams will all be based
on what you can learn by observing The

169
00:13:35,390 --> 00:13:40,578
slides and listening to my commentary on
the slides. Okay. The following is

170
00:13:40,578 --> 00:13:46,326
optional, and depends on what we decide to
do about, the page references and the

171
00:13:46,326 --> 00:13:51,514
hallmarks. You will incidentally find as
you do the hallmarks that there are

172
00:13:51,514 --> 00:13:56,701
explanations when you make a mistake.
These explanations refer to pages of

173
00:13:56,701 --> 00:14:02,520
sections in this book. But you do not need
the book since the explanations like the

174
00:14:02,520 --> 00:14:08,479
course material itself, are designed to be
self contained without your having to read

175
00:14:08,479 --> 00:14:13,824
the book. I ask one thing however, please
do not download a free copy from a file

176
00:14:13,824 --> 00:14:19,004
sharing site, or delete it if you have
already done so. This book was published

177
00:14:19,004 --> 00:14:24,449
by Addison Wesley under a contract that
dates back to 1967. At that time there was

178
00:14:24,449 --> 00:14:29,828
no internet, and it wasn't all that easy
to produce books. Addison Wesley invested

179
00:14:29,828 --> 00:14:35,473
a good deal in the production of this book
and its predecessors, and they deserve to

180
00:14:35,473 --> 00:14:40,387
recover their costs. It is stealing to
download a copyrighted work without

181
00:14:40,387 --> 00:14:45,320
paying. There's no two ways to look at it.
Admittedly the world has changed, and

182
00:14:45,320 --> 00:14:50,398
recently I've started to put books on the
Web for free download from my own Stanford

183
00:14:50,398 --> 00:14:54,998
Webpages. You are welcome to what is
freely given, but please do not take what

184
00:14:54,998 --> 00:14:56,313
is not offered gratis.