You're about to embark on the study of the branch of computer science theory known as automata theory, or language theory. We're going to follow fairly closely a one-quarter course I gave at Stanford a few years ago; its traditional number is CS154. This theory plays an important role in computer science. I'm aware that many students don't see the importance of the mathematical approach to CS; the feeling "just let me near a keyboard and let me code" is quite common. In this introductory video I'll try to give some reasons why you should learn the material contained in the course.

A number of years ago, Stanford took a survey of its graduates five years after they got their undergraduate degrees. What they wanted to find out was whether Stanford was teaching stuff that they actually used in their jobs. Computer science graduates naturally cited the introductory programming courses as the ones they used the most; no surprise there. Next came our sophomore-level courses covering basic data structures, algorithms, and system software; again, no surprise. But after the required courses, we found that CS154, the automata course on which this set of videos is based, was second, after the database course. It was cited by three times as many graduates as cited the introductory AI course, for example. So I want, in the next few minutes, to try to explain what it was about automata theory that impacted what our former students were doing in their professional lives.

So, let's see some of the ideas we are going to learn about in the course and how they appear in practice. One very commonly used idea is the regular expression, a simple notation for describing many of the patterns that arise naturally in practice. Many pieces of software that you might find yourself working on in the future need a simple input language to describe patterns, so you might well find yourself implementing some form of regular expression. For example, many UNIX text-processing commands use a variety of regular expressions. This expression describes a line of text that has the letter A, followed by any number of characters, followed by the letter B. For a more modern example, the XML document markup language invites us to describe the structure of documents by a DTD, or document type definition. The DTD language consists of descriptions of elements, such as the example given here: a person element consists of a name element, followed by an address element, followed by any number of child elements.

Finite automata are another topic we'll see early on; they are in fact the way regular-expression-based languages are implemented. They have also been used for decades to model electronic circuits, and in particular to help design good circuits. They have also been used to model protocols, and we'll give some examples later in this course. Finite automata especially underlie the body of techniques known as model checking, which has been used to verify the correctness of both communication protocols and large electronic circuits.

Another important aspect of the course is the context-free grammar. These are used to put a tree structure on strings, typically text strings, according to some recursive rules. They are essential for describing the syntax of programming languages and are a part of every compiler. They also play an important role in describing the syntax of natural languages and are used in software that does machine translation and other natural language processing.
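To make those two pattern examples concrete, here is a small sketch in Python using the standard re module. The specific test strings, and the idea of flattening a DTD content model into a string of element names, are my own illustration, not part of any particular tool.

```python
import re

# The UNIX-style pattern from the example: a letter A, then any
# number of characters, then a letter B.
line_pattern = re.compile(r"A.*B")
print(bool(line_pattern.search("xxAyyyBzz")))  # True
print(bool(line_pattern.search("xxByyyAzz")))  # False: no B after the A

# A DTD content model like (name, address, child*) is itself a regular
# expression, just over element names instead of characters. Here we
# check a flattened sequence of child-element names against that model.
person_model = re.compile(r"name address( child)*")
print(bool(person_model.fullmatch("name address child child")))  # True
print(bool(person_model.fullmatch("address name")))              # False
```

Notice that the second half really is the first idea again: the DTD rule for person is just a regular expression whose "alphabet" is element names.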
And the DTDs, as we mentioned in connection with regular expressions, are really context-free grammars; it is the individual rules of a DTD that look like regular expressions.

The topics we just mentioned are essentially tools for doing simple but important things, but there is a second broad theme in this course: there are fundamental limitations on our ability to compute. A computer scientist should be aware of these limitations, because only then can you avoid spending time attempting something that is impossible. One limitation is undecidability: there are problems that cannot be solved by computation. For example, you might imagine you could write a compiler that would refuse to compile programs that printed out dirty words, even assuming you had a precise definition of which words were dirty. You can't do this. We're going to prove that there is no way to tell whether a program will ever print a particular word, or even whether it will ever print anything at all.

We also need to know about the class of problems called intractable. These are, colloquially, problems that we can solve, but whose solution takes time that is exponential in the input size. These problems generally need to be finessed in some way, such as by approximating the solution. The reality of the theory of intractability is quite a bit different from the colloquial version: while the undecidable problems have been proven not to have any solution, for the intractable problems we have very strong evidence that they require exponential time, but no proof. We'll explain all this when we get to the theory of NP-completeness, which is the culmination of this course.

Another thing you will take away from this course is the ability to navigate the space of problems that you might encounter in a life of creative software construction. You will learn how to determine that a problem is undecidable, and how to determine that it is intractable. That lets you avoid the problem altogether in the first case, and modify your approach in the second case.

There are several less concrete benefits to this course. First, you will improve your skills at proving facts, especially inductive proofs, of which we'll do several in great detail. Now, I'm not one of the people who thinks formal proofs of programs will ever be a serious software methodology, but as you construct code you should have a sense of why what you're doing works the way it is supposed to. Often the trickiest parts of a program deal with trees, graphs, or other recursive structures. Understanding inductive proofs lets you at least formulate a reason why you think your method works, even if you don't try to dot the i's in a formal proof.

We're also going to learn about a number of important abstractions: finite automata, regular expressions, context-free grammars, and varieties of pushdown automata and Turing machines. Some of the essential parts of this course are proving equivalences among the models, that is, showing that any example of one model can be simulated by some instance of another model. The process of simulation across the models is essentially the same as the modern approach to programming with layered abstractions, where you write programs at one layer using the primitives of the layer below.

At Stanford, I found that a number of people taking the automata course were not computer scientists at all, but were mathematicians, at least by inclination. That's cool; welcome to those of you out there.
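To preview the flavor of the undecidability argument (we'll do it properly later in the course), here is a sketch in Python of the classic reduction from the halting problem. The function names would_print and halts are hypothetical, introduced only to show why such a decider cannot exist; nothing here is meant to run as a real tool.

```python
# Sketch: why "does this program ever print the word W?" is undecidable.
# Assume, for contradiction, that `would_print` could be implemented.

def would_print(source: str, word: str) -> bool:
    """Hypothetical decider: True iff running `source` ever prints `word`."""
    raise NotImplementedError("we will prove no such decider can exist")

def halts(source: str) -> bool:
    """If `would_print` existed, we could decide the halting problem."""
    # Wrap the program so it prints a marker word exactly when it finishes.
    # (Assume `source` itself never prints the marker; one can always
    # rewrite its output to guarantee this.)
    wrapper = source + '\nprint("MARKER")\n'
    # The wrapper prints "MARKER" if and only if `source` halts, so a
    # decider for "prints a word" would decide halting -- impossible.
    return would_print(wrapper, "MARKER")
```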
I probably won't do things quite as formally as you would like, but more formally than the typical computer scientist likes. However, be warned that some in the past have found the subject sufficiently interesting that they saw the light and made major contributions to computer software.

A case in point is Ken Thompson, the fellow who gave us Unix. Before doing Unix, Ken developed an interest in regular expressions and saw that they were an important part of a text editor. He worked out efficient ways to compile regular expressions into programs that could process text, and his algorithms are an important part of what we teach about the subject today. It should be no surprise that regular expressions form an integral part of so many Unix commands, and that these commands give more flexibility and power to Unix users than did those of earlier operating systems.

Another interesting case is Jim Gray. Jim, before his mysterious disappearance, gave us many important ideas in the database field, including two-phase locking for concurrency control, for which he won the Turing Award. But I knew Jim when he was a student at Berkeley and I was on leave there for a brief period. He did a thesis on two-way pushdown automata. We're not going to talk about them; they turned out to have been a backwater of the theory, and I am sure Jim would agree, but they're related to ordinary pushdown automata, which we will talk about in connection with context-free grammars. What is interesting is that Jim told me quite a bit later that he decided to become a computer scientist because automata theory intrigued him. Only later did he switch into database systems, and I believe that the experience he got fooling around with automata made him more capable as a designer of several very innovative systems.

So here's a summary of what will be covered in the course. We'll start off with what I call the regular languages. A language is just a set of strings, for example the set of character strings that are valid Java programs. The regular languages are exactly those sets of strings that can be described by finite automata or regular expressions. This discussion will also introduce the important concept of nondeterminism: machines that can magically turn into many machines that each do something independently, but with a coordinated effect. This model of computation is fanciful, to say the least, but we'll see that it plays a really important role in several places, including the design of algorithms and the understanding of the theory of intractable problems.

We're then going to turn to properties of the regular languages. These properties include the ability to answer certain questions about finite automata and regular expressions that we cannot decide about programs in general. An example would be to tell whether a device makes an output in response to even one input. You can't tell for programs in general, but you can for finite automata. It is this decidability, our ability to understand very simple formalisms like finite automata and regular expressions, that makes them so very valuable when they can be used. We're also going to talk about closure properties of the regular languages; for example, the union or intersection of two regular languages is also a regular language.

The next big topic will be the context-free languages. This is a somewhat larger class of languages than the regular languages, and they enable us to do things that you can't do with regular languages, such as match balanced parentheses or XML tags.
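As a small taste of why such questions are decidable for finite automata, here is a minimal sketch in Python. The encoding of the DFA as a dict mapping (state, symbol) pairs to states is my own choice for illustration; the point is that "does the automaton accept any string at all?" is just graph reachability from the start state to an accepting state.

```python
from collections import deque

def accepts_some_string(start, accepting, transitions):
    """Decide non-emptiness for a DFA: is any accepting state reachable
    from the start state? `transitions` maps (state, symbol) -> state."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in accepting:
            return True  # some input string drives the DFA here
        for (src, _symbol), dst in transitions.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return False

# A two-state DFA over {0, 1} that accepts strings ending in 1.
trans = {("A", "0"): "A", ("A", "1"): "B",
         ("B", "0"): "A", ("B", "1"): "B"}
print(accepts_some_string("A", {"B"}, trans))  # True
```

No such terminating procedure can exist for programs in general; for finite automata it is a few lines of breadth-first search.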
We'll talk about two ways to describe such languages: first by context-free grammars, a recursive system for generating strings, and then by pushdown automata, which are a generalization of the finite automata that we use for regular languages. We'll then repeat our examination of decision properties and closure properties for this larger class of languages. Many of the things we can tell about regular languages that we cannot tell in general, we can also tell about context-free languages, but there are unfortunately some exceptions. Similarly, many of the operations under which the regular languages are closed also yield a context-free language when applied to context-free languages. But again we lose some operations; for example, the context-free languages are closed under union but not intersection.

Next we take up the largest class of languages that we can reasonably say can be dealt with by a computer: the recursively enumerable languages. We also look at a smaller but important subset of the recursively enumerable languages, called the recursive languages. These are the languages for which there is an algorithm to tell whether or not a string is in the language. We introduce the Turing machine, an automaton in the spirit of the first kinds of automata we meet, finite and pushdown automata. The Turing machine is, however, much more powerful than either of these. In fact, it is as powerful as any model that has ever been thought of to answer the question: what can we compute?

The payoff of the study of Turing machines is that we can answer the question of what can be decided by computation, or what can be computed by any means at all. We shall develop tools for showing that certain problems cannot be decided, that is, answered by computer. The example we already mentioned, whether a program prints a dirty word, should give you an idea of what we can achieve.

Finally, we're going to cover the theory of NP-completeness. The critical question is what we can do with an algorithm that runs in time that is some polynomial in the length of the input. It would be lovely if we could distinguish between problems whose best algorithm takes, say, cubic time and those that can be solved in, say, quadratic time. But no theory has ever been devised to be that precise for general problems, although the best running times for some particular problems are known. Fortunately, we can learn a lot by dividing the problems that can be solved in some polynomial amount of time from those that apparently require exponential time. We'll give you the tools to do this, and Turing machines are the tool you need to build this theory, as well as the theory of what can be computed at all.

This course follows the third edition of the textbook I wrote with John Hopcroft and Rajeev Motwani. You do not need to buy this book; the homeworks and exams will all be based on what you can learn by observing the slides and listening to my commentary on them. You will, incidentally, find as you do the homeworks that there are explanations when you make a mistake. These explanations refer to pages or sections in this book, but you do not need the book, since the explanations, like the course material itself, are designed to be self-contained without your having to read the book. I ask one thing, however: please do not download a free copy from a file-sharing site, or delete it if you have already done so.
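As a hint of how a grammar turns into a recognizer, here is a minimal sketch in Python for the balanced-parentheses grammar S -> ( S ) S | epsilon. It is only an illustration of the idea that the grammar's recursion becomes the program's recursion, not the general parsing machinery the course will develop.

```python
def match_S(s, i=0):
    """Try to derive a prefix of s starting at position i from the
    grammar S -> ( S ) S | epsilon. Returns the position just past
    the derived prefix."""
    if i < len(s) and s[i] == "(":      # try production S -> ( S ) S
        j = match_S(s, i + 1)           # derive the inner S
        if j < len(s) and s[j] == ")":
            return match_S(s, j + 1)    # derive the trailing S
        return i                        # no ')': fall back to epsilon
    return i                            # production S -> epsilon

def balanced(s):
    return match_S(s) == len(s)         # the whole string must derive from S

print(balanced("(()())"))  # True
print(balanced("(()"))     # False
```

No regular expression can do this check, because matching parentheses requires unbounded counting; the grammar's recursion handles it naturally.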
This book was published by Addison-Wesley under a contract that dates back to 1967. At that time there was no Internet, and it wasn't all that easy to produce books. Addison-Wesley invested a good deal in the production of this book and its predecessors, and they deserve to recover their costs. It is stealing to download a copyrighted work without paying; there are no two ways to look at it. Admittedly the world has changed, and recently I've started to put books on the web for free download from my own Stanford web pages. You are welcome to what is freely given, but please do not take what is not offered gratis.