We're gonna find how to express the ideas
behind finite automata more formally.
We'll see how to represent finite automata
either as graphs or tables, They are two
representations of the same idea. And
we're going to do a few proofs in
considerable detail and learn some useful
proof techniques, Especially, inductive
proofs. An alphabet is simply a finite set
of symbols. There are many examples. Two
standard alphabets we see frequently are
the ASCI and Unicode alphabets. An
alphabet we'll use frequently is the
binary alphabet consisting of only the
symbols zero and one. We'll also sometimes
use alphabets of letters like ABC. And in
the tennis example we use the alphabet of
outcomes consisting of the letters s and
o. Each protocol has a set of signals. And
the set of all signals used by a protocol
is also an alphabet. Strings are the usual
data type you've seen in languages like
[inaudible] or Java. That is, a string is
a sequence of symbols chosen from some
alphabet sigma. The strings over sigma are
those strings whose symbols are each
members of sigma. For example 011 Is the
string of over alphabet zero, one [sound],
[sound]. It is also a string over
alphabet, zero, one, and two. It just
happens not to have any twos. Unlike
programming languages, which typically put
quotes around strings, we're not going to
do that. And even though strings are lists
of symbols, we're not going to separate
the symbols by commas or other separators.
We'll just write down the strings like
this. We use the expression Sigma star for
the set of all strings over the alphabet
Sigma. The length of the string is the
number of positions in that string and a
special string is the empty string which
has length zero. You represent it by
epsilon. For example, the language zero
one star consists of all strings that
could be made from the symbols zero and
one. For example, epsilon, the empty
string, is in this language. There are two
strings of length one, the strings zero
and one. And there are four strings of
length two, and so on. Notice that we do
not make a distinction between a symbol
like zero and the string of length one
that consists of single instance of that
symbol, the string zero. We trust that the
context will make clear which we need.
Programming languages do make this
distinction, So in c for example we'd put
single quotes around the zero. If we meant
the symbol or character zero and we put
double quotes around it, if we Intended
the string, consisting of only the zero.
Languages are sets of strings and they can
be finite sets or infinite sets. The only
limitation we place on languages is that
there is some alphabet that is finite set
of symbols from which all the strings in
one language are composed. Here's an
example of a language L, were going to
meet several times. Its alphabet is 01 and
it contains all strings of zeros and ones
that do not have consecutive ones. So the
empty string is there, And both strings of
length one. Of the strings of length two,
all are there except one, one because that
obviously has two consecutive ones. Five
of the eight possible strings of length
three are there, as are eight of the
sixteen strings of length four. Here's a
little puzzle for you to think about. For
length zero through four we see there are
one, two, three, five, and eight strings.
Do you recognize the sequence, and can you
extend it to strings, five, six and so on?
And in particular, can you prove that your
prediction is correct? Now we'll give the
formal notation for deterministic finite
automata. There is a finite set of states,
and we usually use q as the set of states.
Don't ask me why it's q, it just is. And
there is a finite input alphabet. The
traditional name for the input alphabet is
sigma, again, for no known reason. There
is a transition function which we'll
discuss on the next slide. It is the guts
of the automaton since it tells us how the
automaton moves from state to state in
response to inputs. The traditional symbol
for the transition function is delta. One
of the states in q is the star state and
we designated q subzero typically. And
some of the states are designated final
states, or synonymously accepting states,
Final states are typically Denoted F. The
thing that makes the automaton work is the
transition function, This function,
usually denoted by delta, takes two
arguments, a state q, and an input symbol
a. It gives you back the state that the
automaton goes to when it is in state q,
and the next input symbol to arrive is a.
The function delta is total. That is, it
has a value for every state and symbol.
There are examples of automata where we
really don't want to continue in certain
situations. For example, our tennis
automaton did not have transitions out of
the [inaudible]. The states, where one
player or the other has won the game. To
fix up such situations we have to
introduce a dead state, A dead state is a
state that is not accepting. And it has a
transition to itself on every input
symbol. Once you get to a dead state, you
cannot leave it and you cannot ever
accept. So death is a pretty good
description of what is going on. We're
going to use the abbreviation DFA for
deterministic final automaton. The
deterministic means that there's unique
transition for every state and input
symbol. We are going to meet
non-deterministic automatons soon and
there, it's possible to transition too
many states from one state on one input.
Now, back to the matter of a dead state,
here's our tennis example. And you can
notice that the two accepting states do
not actually have any transitions out. So
we have a dead state and all the missing
transitions go to that state. The
transitions from the dead state are to
itself on all possible input symbols. Like
the tennis automaton, we can represent any
finite automaton by a graph. The nodes of
the graph are the states of the automaton.
The arcs of the transitions, there is an
arc from state or node p to state or node
q, labeled by all the input symbols a,
such that delta of p and a is q. For
example, This arch, would be present if
Delta of p and a and Delta of p and b were
both q. You represent the start state by
adding an arrow labeled start into that
state. And we indicate final states by
double circles. This is an interesting
example of an automaton that processes
text. The goal is to recognize that the
string read so far ends in ING. The start
state represents the condition where we
have made no progress towards seeing ING.
If in that state we next see I, we've made
some progress, so we go to the state that
says, I was the last symbol seen.
Otherwise we stay in the start state.
Because what we've seen so far is no help
in seeing ING. Now, from the saw I state.
If we next see an "n", then we have made
more progress. We go to: state saw "n". If
we see another "i", and state saw "i",
then we have not made progress, but
neither have we lost ground. We may be
bringing a word like "skiing", with a
double "i", thus the transition from state
saw "i" and "i" [inaudible] is to itself
and any symbol other than "i" and "n", we
go back from saw "i" to start state and
state: saw "in" If we next see a "G", then
we win. We have just seen ING as the last
three symbols. So we go to the accepting
state saw ING. On the other hand, if from
state saw IN, we next see an I, then the
pattern IN is broken, but a new pattern
beginning with I has started. So we go
into the, saw I state. On any input other
than IOG including n, we have no progress
at all, so go back to the start state.
Finally, from state saw ING, we can only
go state saw I if the next input is I, and
otherwise we must go to the start state.
Here's an automaton that represents the
simplest possible protocol for sending
data. The program is in one of two states,
Ready and Sending. It starts in the ready
state. Eventually it gets some signals
that some data has been loaded into its
buffer and at that point it enters the
sending state, where it does what is
necessary to transmit the content of the
buffer. The receiver will send back an ads
signal when the contents are received, in
which case we are ready to return to the
ready state. However if the receiver is
down, we may instead get a local timeout
signal that warns us something is wrong
and the buffer must be re-transmitted.
This automaton is not complete. There are
no final states because the automaton is
designed to run forever without rendering
a decision. Also, there are missing
transitions. It is okay to ignore act or
timeout signals if you're in the ready
state, staying in the ready state.
However, a data signal in the sending
state is an indication of an error, so we
might want to go to an error state. And
the error state is a dead state, so we
need transitions to itself on any of the
three inputs. A typical question that is
asked about protocols is whether you can
get to an error state. We could for
example make the error state the final.
And then ask if the language of the
automaton is empty. As we shall see there
is an algorithm to tell whether the
language of the [inaudible] of automaton
is empty, A question we could not ask. The
programs in general, in this example
however, it is easy really easy to see
that there is a path in the graph from the
start state to the final state so its
language is not empty and errors are
possible. For our running example were
going to use [inaudible], this language is
the set of binary strings that do not
contain two consecutive ones. State a is
where the [inaudible] would be whenever
the input string seems so far is good,
that is it contains no consecutive ones.
And also in state a, it is not ending on
one. Surely the state should be the start
state when no input is received. There are
no consecutive ones, and moreover, the
input so far does not end in a one. We get
to state b when the input is good, that
is, no two consecutive ones, but the last
symbol seen is a one. Notice the only way
to get to b is to be an a, and then to get
input one. C Is actually a dead state, You
are there whenever two consecutive ones
have been received. We arrive at c for the
first time from b. Which you recall, means
the previous input was one. And in state
b, we receive a second one. Once in state
c, we stay there, because once a string
has 1-1, you can never undo that fact, no
matter how many zeroes you see. We can
also represent automata by tables. Here's
our example automaton for strings without
consecutive ones shown as a transition
graph in the corner. And we also see a
representation of the same automaton as a
table. The rows each correspond to one of
the states. And the columns correspond to
the input samples. We indicate the final
stage by putting a star next to their
name. And the start take Is indicated by
an arrow. The entries of the table are the
values of the transition function applied
to the state that is the row and the
symbol that is the column. For example
this entry Is c because it is in the row
for the state b. And the column for input
symbol one. And we know that delta of b
and one is c. There are two important
conventions we're going to use. They let
us know the types of things without
declaring types. In particular, we can
distinguish between strings and
characters. We use lower case letters at
the end of the alphabet, w, x, y, and Z,
And sometimes U or V to represent strings.
When you see one of these letters, you
need to think, aha, that's a string. And
lower case letters near the beginning of
the alphabet will represent single
symbols. Again, we'll try to remind you at
first. But you have to think single symbol
when you see one of these letters. The
extended transition function delta is a
function that takes a state q and a string
w of any length, including zero. And tells
us where the automaton gets to. It follows
a path in the transition diagram, from q,
where the arcs are labeled by each of the
symbols of w in order. That is, we look
for the unique path from q, whose label's
forms w. In some material, you will see a
hat over the delta to remind you that it
is the standard version. However, as we
shall see, the standard delta agrees with
the given delta when the string w is of
length one, that is, it is a single
symbol. Thus, there is not really a need
to distinguish the standard and original
deltas. The definition of the extended
transition function delta is an induction
on the length of the string to which this
function is applied. For the basis, delta
of q and epsilon is q. That is, if you are
in state q and no input symbols arrive,
then you stay in state q. For the
induction, suppose the input string is WA.
By our convention, w is a string of some
length, possibly zero, and a is a single
symbol. The inductive rules says, that we
first see where you get to from state q,
on string of inputs w. That would be Delta
of QW. And then you see where you get from
that state, Whatever it is on the last
input a. In this example 001 is split into
w into 01 and a which is one. Then, we
have to compute delta of b and 01, and see
where it goes to on input one. Now, 01 is
broken into a string w that is just the
star, zero, and then a, which is one. So
we need to compute delta of b, zero, and
then see where we get to on input 1.'Kay,
But delta of b and zero is a. See, right
here, Thus, And we can replace delta of
being zero by the a, And now we have to
say what is the value of delta of a and
one. Well, that's b. Okay, So that's a b,
And we replace that b there. And then
finally delta, b, and one that's c. So,
that's our answer. Delta of b and zero
one, one is the state c. As I mentioned,
the extended delta sometimes is given a
hat. However we really don't need to
distinguish the two deltas because they
agree on single symbols, that is, if we
want the extended delta for a state q in a
string consisting of one symbol a,
formerly we treat that string as the empty
string followed by the symbol a. Then we
have to compute delta hat of q and epsilon
but we know about the basis rule that this
is q itself. Now if delta hat q and a is a
same state as delta of q and a. In fact, I
really cheated on the previous slide. I
needed delta hat of b and zero and just
went to the table and looked it up as if
it were delta of b and zero. Now we see
they are in fact the same so it was no
harm, no foul. There are many different
kinds of automaton. We have seen only a
bit of deterministic final automatons so
far but there are many others. But no
matter what kind of automaton its job is
to, is to define some language. We'll use
L of a to denote the language defined by
automaton a, For a deterministic find in
automaton. The language it defines is the
set of strings that may take the start
state to a final state. Formally, the
language of a DFA is the set of strings w.
Such that, the extended delta applied to
the start state, and this w, is a final
state. For example, consider our running
example of the automaton that accepts
strings without consecutive ones. And
consider the input, one zero one. It
should accept the string because it
obviously does not have consecutive ones.
The start state is a. And if we follow the
path labeled 101, we first go to b. Then
from b we follow the arch with a zero that
leads us back to a. The theory put in
another one so that gets us back to b
again. Since the path from a labeled one
zero one gets us to the final state, one
zero one is indeed accepted by the
automaton. Now this expression is called
the set former. It describes sets, in
particular while use it to describe
languages. In this example, it's
describing the languages of the automaton
that have served as our running example.
The second formula starts with a curly
brace. And then an expression of the
things we want to put in the set. In this
case the expression simply says that the
set consists of some strings w. We know
they are strings because of our
convention. The vertical bar can be read
such that. And then there follows a
description of what must be true for
something to be a member of the set. In
our example we say that the string
consists of zeros and ones and does not
have two consecutive ones. Many important
facts about automators that we might want
to prove are of the form that two sets,
typically languages of set of strings, are
really the same. We are going to prove
that the DFA we've been playing with
accepts the language I claim it does. The
set of binary strings without consecutive
1's, So one set is the language of the DFA
from our example. And the second set is
the language consisting of all strings of
zeros and ones without two consecutive
ones. I'm going to spend a good deal of
time proving this simple result because it
will give you all the details of how one
proves things about languages. In the
future I'm not going to be so focused on
proofs, but I think it is important that
everyone go through one of these proofs
and all its gory detail. To prove two
sets, s and t are equal, we generally need
to prove two things. That each is
contained in the other. That is we start
by assuming w is a member of one, say s.
And we use that fact to prove w is in the
other t. Then, we start over and we assume
w as in t, and we use that to prove w is
also in s. In what follows, we take s to
be the language of the DFA we have been
playing with, and t to be the set of
binary strings without consecutive ones.
First, we're going to prove that if w is
accepted by this automaton, the one shown
in the slide. Then w has no one, one's.
Okay, and the proof is in the induction on
the link of w. It turns out that if we
simply try to prove the statement on the
slide we fail. And I'll point out in a few
slides what goes wrong. A common trick for
inductive proofs is to make a more
detailed statement here than you really
want, because it makes the inductive proof
work. Here, we need to distinguish whether
an accepted string gets you to state a or
state b. Because we need to know whether
or not the string ends in one, Even though
at the conclusion, no 1-1s is true in both
states a and b. The inductive hypothesis
will be in two parts. Part one says that
if w gets you to state a, then not only is
w good, in the sense that it doesn't have
consecutive ones. Part two says that if w
gets you to state b, then w is still good,
but it must end in one. Remember, the
induction is on the length of w, so the
basis case is when w is the empty string,
the only string of length zero. We're
going to use bars around a string to
denote its length. And now, let's prove
part one of the basis. Delta of a in the
empty string is indeed equal to a. But the
conclusion is true since the empty string
does not have consecutive ones. For item
two things are a little trickier. It is
false that delta of a and the empty string
is b. But it's also false that the empty
string ends in one. However an important
principal of logic is that the statement
false implies false is true. That is
whenever the if portion of the statement
is false it doesn't matter what whether
the then portion is true or not, the
statement as a whole is true. Thus a
statement like, "If I am superman, then I
wear red undershorts." is a true statement
simply because I am not superman. You
don't have to concern yourself with the
color of my undershorts. The mathematical
term for an if then statement that is true
because the if part is false is that the
statement holds vacuously. Now let's
address the inductor step. We assume that
w is a string of [inaudible] at least one.
And we assume that the inductive
hypothesis the statements one and two from
the previous slide. About strings that get
the automaton to states a or b, is true
for strings shorter than w. Let's let w be
xa, where by our convention a is the last
symbol of w, and x is all the symbols,
possibly none, up to but not including the
last symbol of w. Since x is shorter than
w, we assume the inductive hypothesis for
x. We're going to prove both statements,
one and two, for w. Which, remember, we
broke up as a shorter string x followed by
the last symbol a. Let's start with
condition one, that is if delta of a and w
equals a then w is good. It has no
consecutive ones, and it does not end in
one. How can you get to a by reading
string x followed by symbol A? Well, look
at the diagram. The only transitions into
a are on input zero. That's this and that.
So symbol a must be zero. And immediately
that's a [inaudible] that w does not end
in one. Furthermore these transitions to a
are only from a and b. Thus x must get to
a or b. Now regardless of whether x gets
to a or b, we can conclude, using the
inductive hypothesis, that x is good. It
has no consecutive ones. This w which is x
followed by zero also has no consecutive
ones and it surely does not end in one.
Now for part two, If delta of a and w
equals b, then w is good and ends in one.
There's only one way to get to b. You have
to be in state a, and the input has to be
one. Thus, if w equals x, followed by a,
then x gets us to state a, and the symbol
a is a one. We can therefore conclude that
w ends in one. Apply the inductive
hypothesis to x and we conclude that x not
only has no consecutive ones, but doesn't
end in one. Any occurrence of one, one,
and w would either have to be at the end
or lie completely within x. We know it
doesn't lie within x, by the inductive
hypothesis. We know one, one cannot be at
the end, because x does not end in one.
And a is only a single one. We can clue
that the w's does not have consecutive
ones. Notice that if we don't use this
more complicated inductive hypothesis
where we distinguish between a and b
according to whether the string, x ends in
one that we cannot make the inductive
proof. If we know only that x gets us to a
or b then we might think that x gets us to
state a it ends at one, in which case w,
which would be XA, would have two
consecutive 1s. But we're not done, we
still have to prove that t is contained in
s, that is, if s is a good string with no
consecutive 1s that it is accepted by the
automaton. It is helpful the restate what
we need to prove in its contra-positive
form, which is logically equivalent to the
original. The contra positive of an if
then statement, say if x then y, is if not
y, then not x. We can see why this is an
equivalent statement, since if y is false,
it couldn't be that x is true. Because
whenever x is true, y is true. In this
case, x is the statement that w is a good
string with no one, ones. And y is the
statement that w is accepted by the
automaton. The contra positive is that w
is not accepted by the automaton, and w is
not a good string, that is, it contains
eleven as a substring. A deterministic
automaton has exactly one transition from
each state on each input symbol. Thus any
string w gets the automaton from the start
state to exactly one state. So the only
way w could not be accepted is if it gets
this automaton to state c. Notice that the
only way to get the automaton to c is for
some string x to get it to b and then for
an input one to follow. Once in c you stay
in c, so anything can follow the x and
one, We'll call that y. That is Any string
w that gets the automaton to state must
[inaudible] the form x1y where x gets to
be. We already observed that the only way
to get to b is by a string that ends in
one, since the only transition into b is
on one. Thus, x must be the form Z1 for
sum z. Thus, w, is really, z, one, one, y,
and, we can conclude that, if delta of a
and, w, is c, then, w, is bad. That is, it
contains two consecutive ones. Now, we
introduce a class of languages, called the
regular languages. These are the languages
that have a deterministic [inaudible]
accepting them. That means that the
language is, is exactly the set of strings
accepted by this automata. Soon we shall
see that there are several other ways to
describe the regular languages including
by regular expressions and
non-deterministic automata. While many
common languages are regular, there are
also many that are not. Intuitively,
finite automata cannot count beyond a
fixed number. Thus they cannot do things
like check whether they have seen the same
number of zeros as ones on their input. Or
check that parentheses are balanced in an
arithmetic expression. For those tasks we
need more powerful mechanisms such as
context-free grammars, which we will meet
soon enough. Here is an example of a
language that is simple to understand, but
is not a regular language. To understand
what this notation is saying we first need
to know that and exponent I on a symbol is
shorthand for the string consisting of I
copies of that symbol. Thus the zero to
the fourth means the string 0000. Thus, we
read this set [inaudible] as zero to the
n, one to the n, such that n is equal to
or greater than one, or more [inaudible].
The set of strings consisting of n zeroes
followed by n ones for sum n equal to or
greater than one. The strings of the
language l1 are thus 01, 0011, and so on.
Here's another example of a non regular
language, The set of strings that are
balance parentheses. The strings of
balanced parenthesis are those that can
appear in an arithmetic expression for
example the string zero of left paren,
right paren can appear in an expression
like (a+b)c. Or this string, can appear in
an expression like (a+b)(c+d). Examples of
strings that are not balanced are the
right paren followed by a left paren. This
is not balanced because no prefix of a
balanced string can have more left parens
than right parens. Obviously there is no
arithmetic expression that has this
sequence of parentheses as its sole set of
parentheses. Also, three lefts followed by
two rights. It's not balanced, because it
has more left per ends than right.
However, Regular languages are common. For
example in each language there is a format
for floating point numbers. And this
format can be quite complicated with
optional E's or decimal points and strings
of digits, that could be empty. But in all
programming languages I know about, the
set of strings that represent some
floating-point number is a regular
language. This is a very interesting case
illustrating what finite memory means. We
want to know if a binary number is
divisible by 23. We're going to read the
bits high order first. But if we have only
finite memory, how can we remember exactly
what sequence of bits has been read, since
the sequence can grow very long? But there
is a trick. We don't really need to
remember everything about the pitch red.
It is sufficient to remember what the
remainder is when divided by 23. Thus
we'll have 23 states, zero through 22.
These correspond to the 23 possible
remainders when an integer is divided by
23. The start state is zero because we
interpret the empty string as representing
the number zero. That may be a bit of an
assumption but nothing else makes sense
and treating it as zero makes everything
work out right. State zero is also the
only final state since we want inputs that
leave a remainder of zero when divided by
23. We're going to assume that the things
are working right after reading a binary
string value. That is w takes state zero
to the state that is the correct remainder
when w was divided by 23. I'm using the
C-style percent operator to denote the
remainder. You can read the percent as
modulo or mod, which is just another way
of saying the remainder when divided by.
The transition from each state I on input
zero, is to the state that is the
remainder of 2i/23. To see why this works,
we know that i=23a+b, for some integer a
and for some integer b in the range zero
to 22. That is, b equals i mod 23. Then 2i
is 45a+2b, since 46 is divisible by 23, 2i
by 23 is just 2b by 23. That is, when we
divide by 23 we can just get rid of the
46a. We know it will leave you a remainder
of zero. Thus, we can take the state b,
which is the remainder. Double it,
subtract 23 if it is 23 or more. And that
gives us the new remainder, and therefore,
the new state. We never need to know what
a is, and we never need to know I exactly.
For the same reason, when one arrives at
the input, we can go from state I to the
state that is the remainder of 2I+1 when
divided by 23. For example, twice fifteen
is 30. So from state fifteen we go to the
state that is the remainder of 30/23 or
state seven when the input is zero. From
state eleven on input one, we go to the
state that is the remainder of twice
(eleven+1)/23, that's state zero. So
whenever a string gets you to stay at
eleven, that is, the string leaves a
remainder of eleven when divided by 23.
That string, followed by a one, must be
divisible by 23, and therefore, must be
accepted. Interestingly in other regular
languages is a set of all binary strings
that if we read the strings in backwards
that is low [inaudible] byte first the
former binary entity that is divisible by
23. And there is nothing special about 23,
it could be any number in both this
example and the previous one. So for
example 0110100 is in this language
because when we reverse it we get 00101110
which has the value 46 in decimal and 46
is of course divisible by 23. It is really
tricky to design a DFA to accept this
reverse language. However there is a
theorem, which we shall see soon. It says
a language is, if a language is regular
then its reversal were to get to reversing
each of its strings is also a regular
language. The proof of this theorem will
let us construct the DFA for the reverse
language from the DFA we just saw.