To finish off our discussion of context-free languages, let's look at the properties of this class of languages. Remember, there are two kinds of properties we find useful. One is decision properties, where we tell something about a language or languages in the class, for example, whether the language represented by a grammar or a PDA is empty. The other is closure properties, where we prove that some operation such as union, applied to languages in the class, results in another language in the class.

First, let's remember that when we talk about a decision property for a language class, we're talking about algorithms that take a representation for a language in the class and produce an answer. For the regular languages, we used regular expressions and deterministic finite automata as the representations. Here, for context-free languages, we use the context-free grammar or the pushdown automaton as the representation, whichever makes life easier. And when we use the PDA, we can let acceptance be by final state or empty stack, again, whichever makes life easiest.

Here are some questions about context-free languages for which algorithms exist. Given a representation for a context-free language L and a string w, we can determine whether or not w is in L. We can tell whether a representation for a context-free language L generates the empty language. And we can tell whether the language L is finite or infinite.

Unfortunately, many of the things we can decide about regular languages we cannot decide for context-free languages. For example, we were able to tell whether two regular languages, say represented by DFAs, were the same. We can't tell whether two context-free grammars or two PDAs define the same language. Another thing we cannot tell is whether two context-free languages are disjoint. We say two sets are disjoint if their intersection is empty, that is, they have no members in common. We didn't address the question of disjointness for regular languages, but you can tell whether two regular languages are disjoint: we showed regular languages are closed under intersection using the product-automaton trick, and we also showed that you can tell whether the language of an automaton is empty. These two ideas together give you an algorithm to test whether two regular languages are disjoint. At this point we have no way to prove that no algorithm for a task exists. That is the job of the theory of Turing machines and decidability, and it will be the next big topic that we address.

Now, the emptiness test for context-free languages has already been given, in essence. We showed how to eliminate useless symbols, those that do not participate in any derivation of a terminal string. Take any context-free grammar and see whether or not the start symbol is useless. If so, the language of the grammar is empty, and if not, then not.

The membership test for DFAs was really simple: just simulate the DFA on the string. We can also test whether a terminal string w is generated by a grammar G, but the test is remarkably hard by comparison. We'll assume G is a grammar in Chomsky Normal Form. If not, we know how to convert the given grammar to CNF, so let's do that as the first step. There's a small matter that when we convert to CNF, we lose the ability to generate the empty string. But if w, the string we want to test, is epsilon, then there is another approach entirely: apply the algorithm we learned for finding the nullable symbols, those that derive epsilon, and see whether the start symbol is one of them. That lets us test membership of the empty string in the language of the context-free grammar.
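As a minimal sketch of these two tests, assume a grammar represented as a Python dictionary from each variable to a list of production bodies (this representation is mine, not anything from the lecture); both tests are then small fixed-point computations:

```python
# Sketch only: a body is a tuple of symbols (variables and/or terminals),
# and () represents an epsilon body.

def generating_symbols(grammar, terminals):
    """Symbols that derive some terminal string."""
    gen = set(terminals)                      # every terminal derives itself
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            # If some body consists entirely of generating symbols, head generates.
            if head not in gen and any(all(s in gen for s in b) for b in bodies):
                gen.add(head)
                changed = True
    return gen

def language_is_empty(grammar, terminals, start):
    """Emptiness test: L(G) is empty iff the start symbol generates nothing."""
    return start not in generating_symbols(grammar, terminals)

def nullable_symbols(grammar):
    """Variables that derive epsilon; an epsilon body () qualifies vacuously."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True
    return nullable
```

Testing whether epsilon is in L(G) is then just asking whether the start symbol is in nullable_symbols(grammar).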
We're going to give an algorithm called CYK. The initials stand for the three people who independently invented the idea: John Cocke, Daniel Younger, and Tadao Kasami. This algorithm is a great example of a dynamic-programming algorithm and worth seeing for that reason alone. The running time of the algorithm is on the order of n cubed, where n is the length of the input string w. Incidentally, there is another algorithm, due to Jay Earley, that also runs in time order n cubed, but on an unambiguous grammar Earley's algorithm is faster: it is order n squared. However, CYK is much simpler and, as I said, worth studying because it is a model for many useful dynamic-programming algorithms, so that is the one we're going to learn.

Here's how the CYK algorithm works. Start with an input string of length n, and let ai be the symbol in the i-th position. We're going to construct a triangular array with short sides each of length n. Each entry of the array is a set of variables of the grammar. The set Xij, which will be in position (i, j), where i is less than or equal to j, is intended to be the set of variables A that derive the substring of the input starting in position i and ending in position j.

We'll use an induction to fill the table; the induction is on the length of the string derived, which is j - i + 1. So we start by computing the entries Xii, each of which is the set of variables that derive the one-symbol string ai. From these we can find the Xi,i+1's, each of which is the set of variables that derive the string ai followed by ai+1. Then we move to the Xi,i+2's, which are the sets of variables that derive the length-three strings ai ai+1 ai+2, and so on. Finally, after we have computed the one set X1n, which represents the entire input, we ask whether S is in that set. If so, then S derives the entire input and the string w is in the language; otherwise not.

For the basis, we know that the only way to derive a string of length one in a CNF grammar is to use a production whose body is a single terminal. So for each i, we set Xii to the set of variables A such that A -> ai is a production.

For the induction, where j is strictly greater than i, we compute Xij from X's representing two shorter substrings of ai through aj. The first is the substring from ai to ak, for some k less than j, and the second is ak+1 through aj. Both of these strings are of length less than j - i + 1, so we have already computed the sets Xik and Xk+1,j. In order for a variable A to derive ai through aj, there must be some production, say A -> BC, where B derives ai through ak and C derives the rest. That is, for each k between i and j - 1, we look for some B in Xik and some C in Xk+1,j such that BC is the body of an A-production. If, for any k, B, and C, we find such a production, we add A to Xij.
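Here is a sketch of the whole algorithm in Python. The grammar representation is mine, not the lecture's: a map from each terminal to the variables that derive it in one step, and a map from each pair of variables to the variables having that pair as a production body. The usage lines at the bottom encode the example grammar we are about to work through.

```python
def cyk(w, unit_prods, pair_prods, start):
    """Return (is w in L(G)?, the triangular table X) for a CNF grammar."""
    n = len(w)
    # X[i][j] is the set of variables deriving w[i..j]; indices are 0-based here.
    X = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):                        # basis: substrings of length one
        X[i][i] = set(unit_prods.get(w[i], ()))
    for length in range(2, n + 1):            # induction: longer substrings
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):             # split into w[i..k] and w[k+1..j]
                for B in X[i][k]:
                    for C in X[k + 1][j]:
                        X[i][j] |= pair_prods.get((B, C), set())
    return start in X[0][n - 1], X

# The example grammar below: S -> AB, A -> BC | a, B -> AC | b, C -> a | b.
unit = {'a': {'A', 'C'}, 'b': {'B', 'C'}}
pairs = {('A', 'B'): {'S'}, ('B', 'C'): {'A'}, ('A', 'C'): {'B'}}
print(cyk('ababa', unit, pairs, 'S')[0])      # False: ababa is not in the language
```

The three nested loops over the length, i, and k are what give the order-n-cubed running time.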
We're going to do an example of the CYK algorithm. Here's the CNF grammar we'll use: S -> AB, A -> BC | a, B -> AC | b, C -> a | b. The input string w will be ababa. It has length five, so we are going to compute the sets Xij for i less than or equal to j, with i at least one and j at most five; that is a triangular array.

For the basis, let's see which variables have a production whose body is a. These variables are A and C. Thus, if w has the symbol a in position i, then Xii will be {A, C}. We see a in positions one, three, and five of w; that explains X11, X33, and X55. For the terminal b, we see body b in productions for B and C. Thus, if i is a position of w that holds b, Xii will be {B, C}; that explains positions two and four.

Now we need to compute the four entries in the row above; these are the sets of variables that derive substrings of length two. Here's one example: X12, the set of variables that derive the string in the first two positions of w, that is, ab. When j is i+1, k can only be i; that is, the only way to derive a string of length two in a CNF grammar is to use a production whose body is two variables, each of which derives one terminal. So the reason S is in X12 is that A is in X11, B is in X22, and S -> AB is a production. And the reason B is in X12 is that A is in X11, C is in X22, and B -> AC is a production. Notice that C can't be in X12, because C derives only strings of length one. However, A can derive longer strings and yet is not in X12. The reason is that the only production A has with a body of two variables is A -> BC. In order for A to be in X12, we would need to find B in X11 and C in X22. The latter holds, but B is not in X11. Here are the other three sets in this row, corresponding to the other substrings of length two; we'll leave it to you to verify that they are correct.

Now let's compute X13. A string of length three can be broken in two different ways: either a string of length one followed by a string of length two, or vice versa. In the first case, k = 1, and we must combine X11 with X23. X11 has A and C, while X23 has only A. The only bodies we can form from these are AA and CA, but no variable has a production with either of these as its body. Thus k = 1 gives us nothing. We must also consider k = 2, where we combine X12 with X33. Now there are four possible bodies: B or S followed by A or C. Of these, only BC is the body of a production, and its head is A. Thus X13 turns out to be just the set containing A.

Here are the other two X's for substrings of length three. There's a pattern to computing these, corresponding to the choices of k that we may make for each. We start at the bottom of the column; for X24, that would be X22. We start going down the diagonal to the right; for X24, that would be X34. So we pair X22 with X34 to see what production bodies we can form. We then move up the column, in this case to X23, and down the diagonal, in this case to X44, and we pair these two sets to again see what variables we can form from a member of the first followed by a member of the second.

Here's how we compute X14. Starting at the bottom of the column, that is X11, and the top of its diagonal, that is X24, we pair these to see if we can form any production bodies. In this case, we can combine A from X11 and B from X24 to put S, the head of the production with body AB, into X14. Now we move up the column to X12 and down the diagonal to X34. But pairing B or S with B or S doesn't give us any production bodies. So we proceed up the column to X13 and down the diagonal to X44, and we pair A with B or C. AC is a body, and it justifies putting B into X14.
Here are the last two entries in the triangular table. X25 turns out to be the set containing A; we'll let you check that one. And X15 is also {A}: you get A from X14, which has B, and X55, which has C; BC is a body, and A is its head. Since S is not in X15, we conclude that ababa is not in the language of the given grammar.

We also claim that there is an algorithm to tell whether a context-free language is finite or infinite. We won't give the algorithm in detail, but it is essentially the same as the algorithm we gave for regular languages. We use the pumping-lemma constant n, and as for regular languages, we can show that if a context-free language contains any string of length between n and 2n - 1, then that string can be pumped and the language is infinite. Otherwise, the language is finite.

We're now going to enter the area of closure properties. Many of the operations under which the class of regular languages is closed also preserve the context-free languages. These include the regular-expression operations themselves, union, concatenation, and closure, and also reversal, homomorphism, and inverse homomorphism. But unlike the class of regular languages, the class of context-free languages is not closed under intersection or difference.

Here's a proof that the union of two context-free languages is a context-free language. Let L and M be the context-free languages, and let them have grammars G and H, respectively. We need to rename the variables of one of these grammars so that no variable is used by both G and H. The names of the variables don't matter, so we can always do this. The reason it is so important is that we are going to throw the productions of G and H into one pile, and if they had variables in common, then a derivation could accidentally use a production of G on a variable of H, or vice versa. Note that we do not change the terminals of the grammars; it's okay if they have terminals in common, and in fact we expect that they will.

Suppose S1 and S2 are the start symbols of the two grammars after renaming the variables. We'll build the grammar for L union M by combining all the symbols of the two grammars G and H. That is, the new set of terminals is the union of the terminals of G and H, and the new set of variables is the union of the variables of G and H, plus a new symbol S, which is not a symbol of either grammar and will be the start symbol of the new grammar. The new set of productions is the union of the productions of G and H, plus two new productions: S -> S1 and S -> S2. All derivations in the new grammar start with S, and in the first step this S is replaced by either S1 or S2. If S is replaced by S1, then the entire derivation must be a derivation in G, because we cannot then get any variables of H into the derivation. Similarly, if the first step gives us S2, then the entire derivation is a derivation in H. Thus, the terminal strings derivable from S are exactly those in L, if we start with S -> S1, union those in M, if we start with S -> S2. That is, the new grammar's language is L union M.
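A minimal sketch of this union construction, continuing the same hypothetical representation as before, with a grammar as a tuple (variables, terminals, productions, start):

```python
# Sketch only: we assume, as in the lecture, that the variables of G and H
# have already been renamed so the two grammars share none.

def union_grammar(G, H, new_start='S'):
    vars_g, terms_g, prods_g, s1 = G
    vars_h, terms_h, prods_h, s2 = H
    assert not (vars_g & vars_h), "rename the variables apart first"
    assert new_start not in vars_g | vars_h   # S must be a brand-new symbol
    variables = vars_g | vars_h | {new_start}
    terminals = terms_g | terms_h             # shared terminals are expected and fine
    productions = {**prods_g, **prods_h,
                   new_start: [(s1,), (s2,)]}  # S -> S1 | S2
    return variables, terminals, productions, new_start
```

For concatenation, described next, the only change would be to give the new start symbol the single body (s1, s2) instead of the two bodies (s1,) and (s2,).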
The argument that the class of context-free languages is closed under concatenation starts the same way, with grammars G and H for the languages L and M. These grammars are assumed to have no variables in common and to have start symbols S1 and S2, respectively. Again, we combine the two grammars as we did for union; the only difference is in the production we use for the new start symbol S. Here there's only one production, S -> S1S2. That way, every string derived from S is a string of L followed by a string of M. That is, the new grammar generates L concatenated with M.

Now for the star operation. Let's start with a grammar G for the language L, and let this grammar have start symbol S1. Form a new grammar by adding to G a new start symbol S and the productions S -> S1S | epsilon. A derivation from S begins by generating zero or more S1's; that is, it uses the production S -> S1S as many times as it likes, followed by S -> epsilon. From each of these S1's we can generate exactly the strings of L, so the new grammar generates L star.

Reversal is another operation for which it is easy to show closure using grammars. If we have a grammar G for the language L, we form a grammar for the reversal of L by reversing the bodies of all the productions of G. For example, here is a grammar for the language of strings 0^n1^n: S -> 0S1 | 01. If we reverse the bodies, we get the grammar S -> 1S0 | 10. It is easy to see that this grammar generates all strings of one or more 1's followed by the same number of 0's; that language is the reverse of the language we started with. We're not going to give a proof that this construction works; it is a simple induction on the lengths of derivations, in each of the two directions.

To prove closure of the class of context-free languages under homomorphism, let's start with a grammar G for a language L, and let h be a homomorphism on the terminal symbols of G. Then h(L) has a grammar that we can construct by replacing each terminal a in the body of any production of G by the string h(a). So, for example, let G be our customary grammar for 0^n1^n, and take our usual example homomorphism, with h(0) = ab and h(1) = epsilon. Then h applied to the language of G has a grammar in which the two occurrences of 0 in the productions of G are replaced by ab, and the two occurrences of 1 are replaced by the empty string. Notice that the resulting language is all strings of one or more ab's. It is in fact a regular language, although in general we can only be sure that it will be a context-free language.
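As a sketch, this construction is a one-pass substitution over production bodies; the grammar tuple format is the same hypothetical one used above:

```python
# Sketch only: splice h(a) in place of each terminal a in every body.

def apply_homomorphism(G, h, new_terminals):
    variables, terminals, prods, start = G
    new_prods = {}
    for head, bodies in prods.items():
        new_bodies = []
        for body in bodies:
            new_body = []
            for sym in body:
                if sym in terminals:
                    new_body.extend(h[sym])   # h(a) may be several symbols, or none
                else:
                    new_body.append(sym)      # variables are untouched
            new_bodies.append(tuple(new_body))
        new_prods[head] = new_bodies
    return variables, new_terminals, new_prods, start

# The lecture's example: S -> 0S1 | 01, with h(0) = ab and h(1) = epsilon.
G = ({'S'}, {'0', '1'}, {'S': [('0', 'S', '1'), ('0', '1')]}, 'S')
print(apply_homomorphism(G, {'0': 'ab', '1': ''}, {'a', 'b'})[2])
# {'S': [('a', 'b', 'S'), ('a', 'b')]}, a grammar for one or more ab's
```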
Next, we take up the fact that context-free languages are closed under inverse homomorphism. We seem to have done pretty well using the grammar representation for context-free languages so far, but here we really need the PDA representation. Start with a PDA P that accepts the language L by final state. We'll construct another PDA P' that accepts h-inverse of L, where h is a homomorphism. The big idea is that P' will simulate P, but P' needs to apply h to each of its own input symbols and feed the result to P. Since h(a) may be a long string, P' cannot in general simulate all of it in one move of P, so P' will take it one step at a time. It has a state with two components: the first is the state of P, which is what matters in the simulation, and the second is a buffer that holds a suffix of what you get by applying h to one input symbol. This buffer allows P' to use the symbols of h(a), one symbol at a time, to cause moves of P.

Here's a rough sketch of what P' looks like. As mentioned, its state has two components, the state of P and the buffer; we show input 0011 as an example only. P' can read its first input symbol, 0, and apply h to it. The buffer, which was initially empty, now holds the string h(0). It may be a long string, but its length is finite, so there are only a finite number of states P' could be in. Now, to simulate P, P' takes the first symbol of h(0) and simulates P using that as the next input symbol. The simulation can take many moves, since there can be transitions on epsilon input as well as the one transition on the symbol itself. However, that symbol is removed from the front of the buffer, so the next time P needs a real input symbol, it gets the second symbol of h(0). The simulation proceeds in this manner until all the symbols of h(0) are consumed from the buffer. At that point, P' can apply h to its next input and refill the buffer.

To be more precise, the states of P' are pairs (q, w), where q is a state of P and w is a suffix of h(a) for some input symbol a. Note that, given P and h, there are only a finite number of possible values of w, and of course P has a finite number of states q, so P' also has a finite number of states, as is required for any PDA. The stack symbols of P' are those of P; moreover, as we shall see, the stack behavior of P' mimics that of P. The start state of P' is (q0, epsilon), that is, the start state of P paired with an empty buffer. The input symbols of P' are those symbols a for which h(a) is defined, and the final states of P' are the final states of P, paired with an empty buffer.

Now we'll show how P' simulates P by giving the transition function delta' of P'. The first type of transition allows P' to read an input symbol a, which must not be epsilon, apply h to it, and store the result in the buffer. The buffer of P' must be empty for this to happen, although, since P might be able to make moves on epsilon input, P' is not forced to refill the buffer just because it is empty; it can also make moves without consuming its own input. Formally, delta'((q, epsilon), a, X) has one choice: remove the a from the input (remember, a is not epsilon), place h(a) in the buffer, and leave the state of P and the stack unchanged. Note that h(a) might be empty, in which case the buffer remains empty, but it is also possible that the buffer now contains one or more of P's input symbols.

P' also has the option to ignore its own input and simulate P as if P's input were whatever is in the buffer. Formally, suppose that delta(q, b, X) contains (p, alpha). Here b may be epsilon, or it may be an input symbol of P. Then, for any buffer string of the form bw that is a suffix of h(a) for some a, delta'((q, bw), epsilon, X) contains ((p, w), alpha). That is, b is consumed from the front of the buffer, the state of P changes according to the given choice of P's move, and the stack of P' also changes in accordance with that move.
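Here is a sketch of the two kinds of moves of delta', in my own notation (none of these names come from the lecture); delta(q, b, X) stands for P's transition function, with None playing the role of epsilon:

```python
# Sketch only: a state of P' is (q, buf), where q is a state of P and buf is a
# suffix of h(a) for some input symbol a. Each move yields the new state of P,
# the new buffer, and the string alpha that replaces X on the stack.

def delta_prime(delta, h, q, buf, inp, X):
    moves = set()
    if inp is not None and buf == '':
        # Type 1: read the real input symbol inp and load h(inp) into the
        # buffer; the state of P and the stack are left unchanged.
        moves.add((q, h[inp], (X,)))
    if inp is None:
        # Type 2 with b = epsilon: simulate an epsilon move of P; the buffer
        # is untouched.
        for p, alpha in delta(q, None, X):
            moves.add((p, buf, alpha))
        # Type 2 with b a real symbol: feed the front of the buffer to P as
        # its next input symbol.
        if buf:
            for p, alpha in delta(q, buf[0], X):
                moves.add((p, buf[1:], alpha))
    return moves
```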
In order to prove that P' does what we want it to do, that is, accepts h-inverse of the language of the PDA P, we need two inductive proofs, one in each direction of the statement that characterizes the way P' simulates P. We're not going to give the proofs here. The precise statement of how P' simulates P is given in the middle of the slide. It says that P' goes from its initial ID with input w to an ID with state q of P, buffer contents x, all of w consumed from the input, and alpha on its stack, if and only if, first, P goes from its initial ID with input y to an ID with state q, all of y consumed, and alpha on its stack, and second, h(w) equals y, the part P consumed, followed by x, the part P' still has left in its buffer. Once we have that, we can restrict it to the case where x is empty and q is a final state. It then says that P' accepts w if and only if P accepts h(w). That is, the language of P' is h-inverse of the language of P.

We have not yet addressed intersection. Remember that the regular languages are closed under intersection, and we proved it by running DFAs for the two languages in parallel. But we can't run two PDAs in parallel and still have a PDA. The reason is that the parallel execution of two PDAs requires two separate, independent stacks, and a PDA is only allowed one stack. That's only an argument that the obvious first try at proving context-free languages closed under intersection won't work, but the situation is worse: we can exhibit particular context-free languages whose intersection is not a context-free language, so no construction could possibly work.

We said that the language L1, the set of strings of some number of 0's followed by the same number of 1's and then the same number of 2's, is not a context-free language. The pumping lemma gives us an easy proof of that fact. We're not going to do it here, but it's very much like the example pumping-lemma proof that we did give. But consider L2, the set of strings in 0*1*2*, that is, strings of 0's followed by some number of 1's, followed by some number of 2's, such that the numbers of 0's and 1's are the same, with any number of 2's. This language is context-free, and here is a little grammar for it. The job of variable A is to generate 0's followed by an equal number of 1's; we've seen this mechanism several times before. And B generates just any number of 2's, at least one of them. Now let L3 be the set of strings in 0*1*2* with equal numbers of 1's and 2's, and any number of 0's. This language is also context-free, and the grammar for L3 uses the same ideas as the grammar we just showed for L2. But L1 is the intersection of the context-free languages L2 and L3.

We can also show that the difference of two context-free languages is not necessarily context-free. In fact, we can prove something surprising: intersection can be expressed in terms of difference alone. Therefore, if any class of languages is closed under difference, it is also closed under intersection. The argument is that the intersection of any two languages L and M, regardless of whether they're regular, context-free, or not context-free, is the difference between L and L - M. That is, suppose x is in both L and M. Then x is surely not in L - M, because it's in both L and M. Therefore, x is in L and not in L - M, so it is in the expression L - (L - M) on the right side. That proves containment in one direction. For the other direction, suppose x is in L - (L - M). Then x is in L, and it's not in L - M. But if x is in L and not in L - M, it must be that x is also in M. Thus, x is in L intersect M. That proves containment in the other direction, and the equivalence of the two expressions.
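The identity is easy to sanity-check on finite sets, where Python's set operators play the roles of the language operations (the particular strings below are just an illustration):

```python
L = {"0", "01", "011", "0011"}
M = {"01", "0011", "000111"}

# L intersect M equals L - (L - M); both sides are {"01", "0011"}.
assert L & M == L - (L - M)
```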
Now suppose the class of context-free languages were closed under difference, and let L and M be context-free languages. Then L - M would be context-free, and so would L - (L - M). But we just proved that this expression is the same as L intersect M; thus, the context-free languages would be closed under intersection. We know they are not, so they cannot be closed under difference either.

We know that the intersection of two context-free languages may not be a context-free language. However, if we intersect a context-free language with a regular language, then we always get a context-free language. The idea is to run a DFA in parallel with a PDA. Since the DFA has no stack, we do not face the problem, which we faced trying to run two PDAs in parallel, of simulating two stacks with one. Here's the picture of a PDA and a DFA running in parallel. We combine the states of the two automata to make one state of a new PDA, which manipulates the stack of the original PDA and feeds its input to both the original PDA and the DFA. It accepts if both the PDA and the DFA accept.

To give the construction of the new PDA, let the DFA have transition function deltaA, and let the original PDA have transition function deltaP. The states of the new PDA are pairs (q, p): the first component q is a state of the DFA, and the second component p is a state of the PDA. Suppose the original PDA has a choice of move from state p with stack symbol X, where a is consumed from the input; a could be a real symbol or epsilon. The result of the move is that the PDA's state becomes r, and X on the stack is replaced by alpha. Then the new PDA, whose transition function we call simply delta, given a state whose second component is p, input a, and stack symbol X, has a choice of move in which the new state has second component r. The first component is what you get by having the DFA make a transition from state q on input a, that is, deltaA(q, a). Here a could be epsilon, in which case deltaA(q, a) is just q, or it could be a real symbol, in which case deltaA(q, a) may be some other state. Finally, this choice of move replaces X by alpha on the stack, just as the original PDA's move did. The final states of the new PDA are the pairs (q, p) such that q and p are final states of their respective automata, and the initial state of the new PDA is the pair consisting of the initial states of both automata.

To prove the construction correct, we need a pair of inductions on the numbers of moves made by the two automata. These inductions say that the new PDA, started in its initial ID with input w, consumes the input and enters an ID with state (q, p) and stack alpha, if and only if the original PDA goes from its initial ID with input w to an ID with state p and alpha on the stack, having consumed the input, and the DFA goes from its initial state to state q on input w. We skip the details, as the proofs are not too hard.
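As a final sketch, here is the combined transition function, with hypothetical names of my own: delta_A(q, a) is the DFA's move, and delta_P(p, a, X) returns a set of (state, alpha) choices, where a may be None for epsilon:

```python
def delta(delta_A, delta_P, q, p, a, X):
    """Moves of the new PDA from state (q, p) on input a with top-of-stack X."""
    moves = set()
    for r, alpha in delta_P(p, a, X):
        # On an epsilon move the DFA stays put; on a real symbol it advances too.
        new_q = q if a is None else delta_A(q, a)
        moves.add(((new_q, r), alpha))
    return moves
```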