To finish off our discussion of context-free languages, let's look at the properties of this class of languages. Remember, there are two kinds of properties we find useful. One is decision properties, where we tell something about a language or languages in the class, for example, whether the language represented by a grammar or a PDA is empty. The other is closure properties, where we prove that some operation such as union, applied to languages in the class, results in another language in the class.

First, let's remember that when we talk about a decision property for a language class, we're talking about algorithms that take a representation for a language in the class and produce an answer. For the regular languages, we used regular expressions and deterministic finite automata as the representations. Here, for context-free languages, we use the context-free grammar or the pushdown automaton as the representation, whichever makes life easier. And when we use the PDA, we can let acceptance be by final state or empty stack, again, whichever makes life easiest.

Here are some questions about context-free languages for which algorithms exist. Given a representation for a context-free language L and a string w, we can determine whether or not w is in L. We can tell whether a representation for a context-free language L generates the empty language. And we can tell whether the language L is finite or infinite.

Unfortunately, many of the things we can decide about regular languages we cannot decide for context-free languages. For example, we were able to tell whether two regular languages, say represented by DFAs, were the same. We can't tell whether two context-free grammars or two PDAs define the same language. Another thing we cannot tell is whether two context-free languages are disjoint. We say two sets are disjoint if their intersection is empty, that is, they have no members in common. We didn't address the question of disjointness for regular languages, but you can tell whether two regular languages are disjoint: we showed regular languages are closed under intersection using the product-automaton trick, and we also showed that you can tell whether the language of an automaton is empty. These two ideas together give you an algorithm to test whether two regular languages are disjoint. At this point we have no way to prove that no algorithm for a task exists. That is the job of the theory of Turing machines and decidability, and it will be the next big topic that we address.

Now, the emptiness test for context-free languages has already been given, in essence. We showed how to eliminate useless symbols, those that do not participate in any derivation of a terminal string. Take any context-free grammar and see whether or not the start symbol is useless. If so, the language of the grammar is empty, and if not, then not.

The membership test for DFAs was really simple: just simulate the DFA on the string. We can also test whether a terminal string w is generated by a grammar G, but the test is remarkably hard by comparison. We'll assume G is a grammar in Chomsky Normal Form. If not, we know how to convert the given grammar to CNF, so let's do that as the first step. There's a small matter that when we convert to CNF, we lose the ability to generate the empty string. But if w, the string we want to test, is epsilon, then there is another approach entirely: apply the algorithm we learned for finding the nullable symbols, those that derive epsilon, and see whether the start symbol is one of them. That lets us test membership of the empty string in the language of the context-free grammar.
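As a minimal sketch of these two tests, assume a grammar represented as a Python dictionary from each variable to a list of production bodies (this representation is mine, not anything from the lecture); both tests are then small fixed-point computations:

```python
# Sketch only: a body is a tuple of symbols (variables and/or terminals),
# and () represents an epsilon body.

def generating_symbols(grammar, terminals):
    """Symbols that derive some terminal string."""
    gen = set(terminals)                      # every terminal derives itself
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            # If some body consists entirely of generating symbols, head generates.
            if head not in gen and any(all(s in gen for s in b) for b in bodies):
                gen.add(head)
                changed = True
    return gen

def language_is_empty(grammar, terminals, start):
    """Emptiness test: L(G) is empty iff the start symbol generates nothing."""
    return start not in generating_symbols(grammar, terminals)

def nullable_symbols(grammar):
    """Variables that derive epsilon; an epsilon body () qualifies vacuously."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True
    return nullable
```

Testing whether epsilon is in L(G) is then just asking whether the start symbol is in nullable_symbols(grammar).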
We're going to give an algorithm called CYK. The initials stand for the three people who independently invented the idea: John Cocke, Daniel Younger, and Tadao Kasami. This algorithm is a great example of a dynamic-programming algorithm and worth seeing for that reason alone. The running time of the algorithm is on the order of n cubed, where n is the length of the input string w. Incidentally, there is another algorithm, due to Jay Earley, that also runs in time order n cubed, but on an unambiguous grammar Earley's algorithm is faster: it is order n squared. However, CYK is much simpler and, as I said, worth studying because it is a model for many useful dynamic-programming algorithms, so that is the one we're going to learn.

Here's how the CYK algorithm works. Start with an input string of length n, and let ai be the symbol in the i-th position. We're going to construct a triangular array with short sides each of length n. Each entry of the array is a set of variables of the grammar. The set Xij, which will be in position (i, j), where i is less than or equal to j, is intended to be the set of variables A that derive the substring of the input starting in position i and ending in position j.

We'll use an induction to fill the table; the induction is on the length of the string derived, which is j - i + 1. So we start by computing the entries Xii, each of which is the set of variables that derive the one-symbol string ai. From these we can find the Xi,i+1's, each of which is the set of variables that derive the string ai followed by ai+1. Then we move to the Xi,i+2's, which are the sets of variables that derive the length-three strings ai ai+1 ai+2, and so on. Finally, after we have computed the one set X1n, which represents the entire input, we ask whether S is in that set. If so, then S derives the entire input and the string w is in the language; otherwise not.

For the basis, we know that the only way to derive a string of length one in a CNF grammar is to use a production whose body is a single terminal. So for each i, we set Xii to the set of variables A such that A -> ai is a production.

For the induction, where j is strictly greater than i, we compute Xij from X's representing two shorter substrings of ai through aj. The first is the substring from ai to ak, for some k less than j, and the second is ak+1 through aj. Both of these strings are of length less than j - i + 1, so we have already computed the sets Xik and Xk+1,j. In order for a variable A to derive ai through aj, there must be some production, say A -> BC, where B derives ai through ak and C derives the rest. That is, for each k between i and j - 1, we look for some B in Xik and some C in Xk+1,j such that BC is the body of an A-production. If, for any k, B, and C, we find such a production, we add A to Xij.
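Here is a sketch of the whole algorithm in Python. The grammar representation is mine, not the lecture's: a map from each terminal to the variables that derive it in one step, and a map from each pair of variables to the variables having that pair as a production body. The usage lines at the bottom encode the example grammar we are about to work through.

```python
def cyk(w, unit_prods, pair_prods, start):
    """Return (is w in L(G)?, the triangular table X) for a CNF grammar."""
    n = len(w)
    # X[i][j] is the set of variables deriving w[i..j]; indices are 0-based here.
    X = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):                        # basis: substrings of length one
        X[i][i] = set(unit_prods.get(w[i], ()))
    for length in range(2, n + 1):            # induction: longer substrings
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):             # split into w[i..k] and w[k+1..j]
                for B in X[i][k]:
                    for C in X[k + 1][j]:
                        X[i][j] |= pair_prods.get((B, C), set())
    return start in X[0][n - 1], X

# The example grammar below: S -> AB, A -> BC | a, B -> AC | b, C -> a | b.
unit = {'a': {'A', 'C'}, 'b': {'B', 'C'}}
pairs = {('A', 'B'): {'S'}, ('B', 'C'): {'A'}, ('A', 'C'): {'B'}}
print(cyk('ababa', unit, pairs, 'S')[0])      # False: ababa is not in the language
```

The three nested loops over the length, i, and k are what give the order-n-cubed running time.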
We're going to do an example of the CYK algorithm. Here's the CNF grammar we'll use: S -> AB, A -> BC | a, B -> AC | b, C -> a | b. The input string w will be ababa. It has length five, so we are going to compute the sets Xij for i less than or equal to j, with i at least one and j at most five; that is a triangular array.

For the basis, let's see which variables have a production whose body is a. These variables are A and C. Thus, if w has the symbol a in position i, then Xii will be {A, C}. We see a in positions one, three, and five of w; that explains X11, X33, and X55. For the terminal b, we see body b in productions for B and C. Thus, if i is a position of w that holds b, Xii will be {B, C}; that explains positions two and four.

Now we need to compute the four entries in the row above; these are the sets of variables that derive substrings of length two. Here's one example: X12, the set of variables that derive the string in the first two positions of w, that is, ab. When j is i+1, k can only be i; that is, the only way to derive a string of length two in a CNF grammar is to use a production whose body is two variables, each of which derives one terminal. So the reason S is in X12 is that A is in X11, B is in X22, and S -> AB is a production. And the reason B is in X12 is that A is in X11, C is in X22, and B -> AC is a production. Notice that C can't be in X12, because C derives only strings of length one. However, A can derive longer strings and yet is not in X12. The reason is that the only production A has with a body of two variables is A -> BC. In order for A to be in X12, we would need to find B in X11 and C in X22. The latter holds, but B is not in X11. Here are the other three sets in this row, corresponding to the other substrings of length two; we'll leave it to you to verify that they are correct.

Now let's compute X13. A string of length three can be broken in two different ways: either a string of length one followed by a string of length two, or vice versa. In the first case, k = 1, and we must combine X11 with X23. X11 has A and C, while X23 has only A. The only bodies we can form from these are AA and CA, but no variable has a production with either of these as its body. Thus k = 1 gives us nothing. We must also consider k = 2, where we combine X12 with X33. Now there are four possible bodies: B or S followed by A or C. Of these, only BC is the body of a production, and its head is A. Thus X13 turns out to be just the set containing A.

Here are the other two X's for substrings of length three. There's a pattern to computing these, corresponding to the choices of k that we may make for each. We start at the bottom of the column; for X24, that would be X22. We start going down the diagonal to the right; for X24, that would be X34. So we pair X22 with X34 to see what production bodies we can form. We then move up the column, in this case to X23, and down the diagonal, in this case to X44, and we pair these two sets to again see what variables we can form from a member of the first followed by a member of the second.

Here's how we compute X14. Starting at the bottom of the column, that is X11, and the top of its diagonal, that is X24, we pair these to see if we can form any production bodies. In this case, we can combine A from X11 and B from X24 to put S, the head of the production with body AB, into X14. Now we move up the column to X12 and down the diagonal to X34. But pairing B or S with B or S doesn't give us any production bodies. So we proceed up the column to X13 and down the diagonal to X44, and we pair A with B or C. AC is a body, and it justifies putting B into X14.
Here are the last two entries in the triangular table. X25 turns out to be the set containing A; we'll let you check that one. And X15 is also {A}: you get A from X14, which has B, and X55, which has C; BC is a body, and A is its head. Since S is not in X15, we conclude that ababa is not in the language of the given grammar.

We also claim that there is an algorithm to tell whether a context-free language is finite or infinite. We won't give the algorithm in detail, but it is essentially the same as the algorithm we gave for regular languages. We use the pumping-lemma constant n, and as for regular languages, we can show that if a context-free language contains any string of length between n and 2n - 1, then that string can be pumped and the language is infinite. Otherwise, the language is finite.

We're now going to enter the area of closure properties. Many of the operations under which the class of regular languages is closed also preserve the context-free languages. These include the regular-expression operations themselves, union, concatenation, and closure, and also reversal, homomorphism, and inverse homomorphism. But unlike the class of regular languages, the class of context-free languages is not closed under intersection or difference.

Here's a proof that the union of two context-free languages is a context-free language. Let L and M be the context-free languages, and let them have grammars G and H, respectively. We need to rename the variables of one of these grammars so that no variable is used by both G and H. The names of the variables don't matter, so we can always do this. The reason it is so important is that we are going to throw the productions of G and H into one pile, and if they had variables in common, then a derivation could accidentally use a production of G on a variable of H, or vice versa. Note that we do not change the terminals of the grammars; it's okay if they have terminals in common, and in fact we expect that they will.

Suppose S1 and S2 are the start symbols of the two grammars after renaming the variables. We'll build the grammar for L union M by combining all the symbols of the two grammars G and H. That is, the new set of terminals is the union of the terminals of G and H, and the new set of variables is the union of the variables of G and H, plus a new symbol S, which is not a symbol of either grammar and will be the start symbol of the new grammar. The new set of productions is the union of the productions of G and H, plus two new productions: S -> S1 and S -> S2. All derivations in the new grammar start with S, and in the first step this S is replaced by either S1 or S2. If S is replaced by S1, then the entire derivation must be a derivation in G, because we cannot then get any variables of H into the derivation. Similarly, if the first step gives us S2, then the entire derivation is a derivation in H. Thus, the terminal strings derivable from S are exactly those in L, if we start with S -> S1, union those in M, if we start with S -> S2. That is, the new grammar's language is L union M.
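A minimal sketch of this union construction, continuing the same hypothetical representation as before, with a grammar as a tuple (variables, terminals, productions, start):

```python
# Sketch only: we assume, as in the lecture, that the variables of G and H
# have already been renamed so the two grammars share none.

def union_grammar(G, H, new_start='S'):
    vars_g, terms_g, prods_g, s1 = G
    vars_h, terms_h, prods_h, s2 = H
    assert not (vars_g & vars_h), "rename the variables apart first"
    assert new_start not in vars_g | vars_h   # S must be a brand-new symbol
    variables = vars_g | vars_h | {new_start}
    terminals = terms_g | terms_h             # shared terminals are expected and fine
    productions = {**prods_g, **prods_h,
                   new_start: [(s1,), (s2,)]}  # S -> S1 | S2
    return variables, terminals, productions, new_start
```

For concatenation, described next, the only change would be to give the new start symbol the single body (s1, s2) instead of the two bodies (s1,) and (s2,).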
The argument that the class of context-free languages is closed under concatenation starts the same way, with grammars G and H for the languages L and M. These grammars are assumed to have no variables in common and to have start symbols S1 and S2, respectively. Again, we combine the two grammars as we did for union; the only difference is in the production we use for the new start symbol S. Here there's only one production, S -> S1S2. That way, every string derived from S is a string of L followed by a string of M. That is, the new grammar generates L concatenated with M.

Now for the star operation. Let's start with a grammar G for the language L, and let this grammar have start symbol S1. Form a new grammar by adding to G a new start symbol S and the productions S -> S1S | epsilon. A derivation from S begins by generating zero or more S1's; that is, it uses the production S -> S1S as many times as it likes, followed by S -> epsilon. From each of these S1's we can generate exactly the strings of L, so the new grammar generates L star.

Reversal is another operation for which it is easy to show closure using grammars. If we have a grammar G for the language L, we form a grammar for the reversal of L by reversing the bodies of all the productions of G. For example, here is a grammar for the language of strings 0^n1^n: S -> 0S1 | 01. If we reverse the bodies, we get the grammar S -> 1S0 | 10. It is easy to see that this grammar generates all strings of one or more 1's followed by the same number of 0's; that language is the reverse of the language we started with. We're not going to give a proof that this construction works; it is a simple induction on the lengths of derivations, in each of the two directions.

To prove closure of the class of context-free languages under homomorphism, let's start with a grammar G for a language L, and let h be a homomorphism on the terminal symbols of G. Then h(L) has a grammar that we can construct by replacing each terminal a in the body of any production of G by the string h(a). So, for example, let G be our customary grammar for 0^n1^n, and take our usual example homomorphism, with h(0) = ab and h(1) = epsilon. Then h applied to the language of G has a grammar in which the two occurrences of 0 in the productions of G are replaced by ab, and the two occurrences of 1 are replaced by the empty string. Notice that the resulting language is all strings of one or more ab's. It is in fact a regular language, although in general we can only be sure that it will be a context-free language.
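As a sketch, this construction is a one-pass substitution over production bodies; the grammar tuple format is the same hypothetical one used above:

```python
# Sketch only: splice h(a) in place of each terminal a in every body.

def apply_homomorphism(G, h, new_terminals):
    variables, terminals, prods, start = G
    new_prods = {}
    for head, bodies in prods.items():
        new_bodies = []
        for body in bodies:
            new_body = []
            for sym in body:
                if sym in terminals:
                    new_body.extend(h[sym])   # h(a) may be several symbols, or none
                else:
                    new_body.append(sym)      # variables are untouched
            new_bodies.append(tuple(new_body))
        new_prods[head] = new_bodies
    return variables, new_terminals, new_prods, start

# The lecture's example: S -> 0S1 | 01, with h(0) = ab and h(1) = epsilon.
G = ({'S'}, {'0', '1'}, {'S': [('0', 'S', '1'), ('0', '1')]}, 'S')
print(apply_homomorphism(G, {'0': 'ab', '1': ''}, {'a', 'b'})[2])
# {'S': [('a', 'b', 'S'), ('a', 'b')]}, a grammar for one or more ab's
```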
Next, we take up the fact that context-free languages are closed under inverse homomorphism. We seem to have done pretty well using the grammar representation for context-free languages so far, but here we really need the PDA representation. Start with a PDA P that accepts the language L by final state. We'll construct another PDA P' that accepts h-inverse of L, where h is a homomorphism. The big idea is that P' will simulate P, but P' needs to apply h to each of its own input symbols and feed the result to P. Since h(a) may be a long string, P' cannot in general simulate all of it in one move of P, so P' will take it one step at a time. It has a state with two components: the first is the state of P, which is what matters in the simulation, and the second is a buffer that holds a suffix of what you get by applying h to one input symbol. This buffer allows P' to use the symbols of h(a), one symbol at a time, to cause moves of P.

Here's a rough sketch of what P' looks like. As mentioned, its state has two components, the state of P and the buffer; we show input 0011 as an example only. P' can read its first input symbol, 0, and apply h to it. The buffer, which was initially empty, now holds the string h(0). It may be a long string, but its length is finite, so there are only a finite number of states P' could be in. Now, to simulate P, P' takes the first symbol of h(0) and simulates P using that as the next input symbol. The simulation can take many moves, since there can be transitions on epsilon input as well as the one transition on the symbol itself. However, that symbol is removed from the front of the buffer, so the next time P needs a real input symbol, it gets the second symbol of h(0). The simulation proceeds in this manner until all the symbols of h(0) are consumed from the buffer. At that point, P' can apply h to its next input and refill the buffer.

To be more precise, the states of P' are pairs (q, w), where q is a state of P and w is a suffix of h(a) for some input symbol a. Note that, given P and h, there are only a finite number of possible values of w, and of course P has a finite number of states q, so P' also has a finite number of states, as is required for any PDA. The stack symbols of P' are those of P; moreover, as we shall see, the stack behavior of P' mimics that of P. The start state of P' is (q0, epsilon), that is, the start state of P paired with an empty buffer. The input symbols of P' are those symbols a for which h(a) is defined, and the final states of P' are the final states of P, paired with an empty buffer.

Now we'll show how P' simulates P by giving the transition function delta' of P'. The first type of transition allows P' to read an input symbol a, which must not be epsilon, apply h to it, and store the result in the buffer. The buffer of P' must be empty for this to happen, although, since P might be able to make moves on epsilon input, P' is not forced to refill the buffer just because it is empty; it can also make moves without consuming its own input. Formally, delta'((q, epsilon), a, X) has one choice: remove the a from the input (remember, a is not epsilon), place h(a) in the buffer, and leave the state of P and the stack unchanged. Note that h(a) might be empty, in which case the buffer remains empty, but it is also possible that the buffer now contains one or more of P's input symbols.

P' also has the option to ignore its own input and simulate P as if P's input were whatever is in the buffer. Formally, suppose that delta(q, b, X) contains (p, alpha). Here b may be epsilon, or it may be an input symbol of P. Then, for any buffer string of the form bw that is a suffix of h(a) for some a, delta'((q, bw), epsilon, X) contains ((p, w), alpha). That is, b is consumed from the front of the buffer, the state of P changes according to the given choice of P's move, and the stack of P' also changes in accordance with that move.
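Here is a sketch of the two kinds of moves of delta', in my own notation (none of these names come from the lecture); delta(q, b, X) stands for P's transition function, with None playing the role of epsilon:

```python
# Sketch only: a state of P' is (q, buf), where q is a state of P and buf is a
# suffix of h(a) for some input symbol a. Each move yields the new state of P,
# the new buffer, and the string alpha that replaces X on the stack.

def delta_prime(delta, h, q, buf, inp, X):
    moves = set()
    if inp is not None and buf == '':
        # Type 1: read the real input symbol inp and load h(inp) into the
        # buffer; the state of P and the stack are left unchanged.
        moves.add((q, h[inp], (X,)))
    if inp is None:
        # Type 2 with b = epsilon: simulate an epsilon move of P; the buffer
        # is untouched.
        for p, alpha in delta(q, None, X):
            moves.add((p, buf, alpha))
        # Type 2 with b a real symbol: feed the front of the buffer to P as
        # its next input symbol.
        if buf:
            for p, alpha in delta(q, buf[0], X):
                moves.add((p, buf[1:], alpha))
    return moves
```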
In order to prove that P' does what we want it to do, that is, accepts h-inverse of the language of the PDA P, we need two inductive proofs, one in each direction of the statement that characterizes the way P' simulates P. We're not going to give the proofs here. The precise statement of how P' simulates P is given in the middle of the slide. It says that P' goes from its initial ID with input w to an ID with state q of P, buffer contents x, all of w consumed from the input, and alpha on its stack, if and only if, first, P goes from its initial ID with input y to an ID with state q, all of y consumed, and alpha on its stack, and second, h(w) equals y, the part P consumed, followed by x, the part P' still has left in its buffer. Once we have that, we can restrict it to the case where x is empty and q is a final state. It then says that P' accepts w if and only if P accepts h(w). That is, the language of P' is h-inverse of the language of P.

We have not yet addressed intersection. Remember that the regular languages are closed under intersection, and we proved it by running DFAs for the two languages in parallel. But we can't run two PDAs in parallel and still have a PDA. The reason is that the parallel execution of two PDAs requires two separate, independent stacks, and a PDA is only allowed one stack. That's only an argument that the obvious first try at proving context-free languages closed under intersection won't work, but the situation is worse: we can exhibit particular context-free languages whose intersection is not a context-free language, so no construction could possibly work.

We said that the language L1, the set of strings of some number of 0's followed by the same number of 1's and then the same number of 2's, is not a context-free language. The pumping lemma gives us an easy proof of that fact. We're not going to do it here, but it's very much like the example pumping-lemma proof that we did give. But consider L2, the set of strings in 0*1*2*, that is, strings of 0's followed by some number of 1's, followed by some number of 2's, such that the numbers of 0's and 1's are the same, with any number of 2's. This language is context-free, and here is a little grammar for it. The job of variable A is to generate 0's followed by an equal number of 1's; we've seen this mechanism several times before. And B generates just any number of 2's, at least one of them. Now let L3 be the set of strings in 0*1*2* with equal numbers of 1's and 2's, and any number of 0's. This language is also context-free, and the grammar for L3 uses the same ideas as the grammar we just showed for L2. But L1 is the intersection of the context-free languages L2 and L3.

We can also show that the difference of two context-free languages is not necessarily context-free. In fact, we can prove something surprising: intersection can be expressed in terms of difference alone. Therefore, if any class of languages is closed under difference, it is also closed under intersection. The argument is that the intersection of any two languages L and M, regardless of whether they're regular, context-free, or not context-free, is the difference between L and L - M. That is, suppose x is in both L and M. Then x is surely not in L - M, because it's in both L and M. Therefore, x is in L and not in L - M, so it is in the expression L - (L - M) on the right side. That proves containment in one direction. For the other direction, suppose x is in L - (L - M). Then x is in L, and it's not in L - M. But if x is in L and not in L - M, it must be that x is also in M. Thus, x is in L intersect M. That proves containment in the other direction, and the equivalence of the two expressions.
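The identity is easy to sanity-check on finite sets, where Python's set operators play the roles of the language operations (the particular strings below are just an illustration):

```python
L = {"0", "01", "011", "0011"}
M = {"01", "0011", "000111"}

# L intersect M equals L - (L - M); both sides are {"01", "0011"}.
assert L & M == L - (L - M)
```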
Now suppose the class of context-free languages were closed under difference, and let L and M be context-free languages. Then L - M would be context-free, and so would L - (L - M). But we just proved that this expression is the same as L intersect M; thus, the context-free languages would be closed under intersection. We know they are not, so they cannot be closed under difference either.

We know that the intersection of two context-free languages may not be a context-free language. However, if we intersect a context-free language with a regular language, then we always get a context-free language. The idea is to run a DFA in parallel with a PDA. Since the DFA has no stack, we do not face the problem, which we faced trying to run two PDAs in parallel, of simulating two stacks with one. Here's the picture of a PDA and a DFA running in parallel. We combine the states of the two automata to make one state of a new PDA, which manipulates the stack of the original PDA and feeds its input to both the original PDA and the DFA. It accepts if both the PDA and the DFA accept.

To give the construction of the new PDA, let the DFA have transition function deltaA, and let the original PDA have transition function deltaP. The states of the new PDA are pairs (q, p): the first component q is a state of the DFA, and the second component p is a state of the PDA. Suppose the original PDA has a choice of move from state p with stack symbol X, where a is consumed from the input; a could be a real symbol or epsilon. The result of the move is that the PDA's state becomes r, and X on the stack is replaced by alpha. Then the new PDA, whose transition function we call simply delta, given a state whose second component is p, input a, and stack symbol X, has a choice of move in which the new state has second component r. The first component is what you get by having the DFA make a transition from state q on input a, that is, deltaA(q, a). Here a could be epsilon, in which case deltaA(q, a) is just q, or it could be a real symbol, in which case deltaA(q, a) may be some other state. Finally, this choice of move replaces X by alpha on the stack, just as the original PDA's move did. The final states of the new PDA are the pairs (q, p) such that q and p are final states of their respective automata, and the initial state of the new PDA is the pair consisting of the initial states of both automata.

To prove the construction correct, we need a pair of inductions on the numbers of moves made by the two automata. These inductions say that the new PDA, started in its initial ID with input w, consumes the input and enters an ID with state (q, p) and stack alpha, if and only if the original PDA goes from its initial ID with input w to an ID with state p and alpha on the stack, having consumed the input, and the DFA goes from its initial state to state q on input w. We skip the details, as the proofs are not too hard.
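As a final sketch, here is the combined transition function, with hypothetical names of my own: delta_A(q, a) is the DFA's move, and delta_P(p, a, X) returns a set of (state, alpha) choices, where a may be None for epsilon:

```python
def delta(delta_A, delta_P, q, p, a, X):
    """Moves of the new PDA from state (q, p) on input a with top-of-stack X."""
    moves = set()
    for r, alpha in delta_P(p, a, X):
        # On an epsilon move the DFA stays put; on a real symbol it advances too.
        new_q = q if a is None else delta_A(q, a)
        moves.add(((new_q, r), alpha))
    return moves
```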