Good afternoon. Spring has hit California. You can see that I am wearing one of my Hawaiian shirts unlike the button downs that I recorded the lectures in, so everything is feeling good and in this video, I'd like to discuss a few questions that have arisen, in the discussion forum, that I think may represent common misconceptions. First, I want to remind everyone to be aware of the types of things that we discussed. Strings versus characters and so on. Then I want to point out a subtle difference between an automaton accepting a string and accepting a language. And finally there are a few interesting edge effects that we need to understand. You may have heard of the Zen Koan, what is the sound of one hand clapping? I don't know, but here we face a harder problem under the sound of no hands clapping, I want to remind people what it means for example to compute the sum of zero integers. We're familiar with types or classes from work with most any modern programming language. The things we talked about in automata theory also have types, and confusing one type for another is just as dangerous here as it is in programming. Two distinctions I want to emphasize today are first, between characters and strings, and then between sets and the members of those sets. A string is a sequence of zero or more characters. In most programming languages, double quotes are placed around the string. And single quotes around the character. The distinction is most important when the string is of length one because then it looks just like a character if you don't do something like double quoted to indicate its type. Most importantly, epsilon is the way we represent the empty string. In most languages, the way to represent the empty string is by double quotes with nothing in between them like this. And I think most confusion occurred when people noticed that in an epsilon NFA, some arcs are labeled by the string epsilon while others are labeled by input characters or input symbols. Previously, DFAs and ordinary NFAs had all arcs labeled by characters but epsilon is not a character and yet we appeared to use it where you expected an input symbol, that is, a character to appear. But there is no real problem, since a character can be coerced to be a string of length one. It's in fact very natural in programming languages. For example in Java, an easy way to convert characters to strings is to write the empty string concatenated with that character. Since characters are coerced to strings, the character, say, zero, is converted to the string zero and concatenated with the empty string which leaves just the original character, but converted to a string of length one. So for an Epsilon NFA, just think of the character's labeled in the arcs as strings of length one. Then you can concatenate the epsilons and characters along any path and get a string naturally. As with any kind of automaton, the sequence of labels along any path is of type string. Here's a pretty picture of a path in an epsilon NFA. It doesn't matter whether the states are distinct or if some repeat along the path. You just concatenate the labels Epsilon, the string zero, the string one, the string epsilon again, string zero. The concatenation of these five strings then is the string 010. Now let's move on to sets and elements. Elements come in many different types. For example integers, strings, characters and so on. Sets can have elements as members, so for example a language in the world of a automata theory is set of elements of type string. That is, the type of a language is: set of strings. And remember that epsilon is a string. While the empty set is a set. There is a notion of membership that relates the set to the members of that set. A set can have members. The element types like string do not have members. Technically the members of a set can be sets themselves but we're not going to talk about such sets much. Occasionally, we mention the power set of a set S whose members are the each of the subsets of the set S and thus are sets themselves. But, in most cases you can assume that sets have members that are elements and not sets. The empty set is a set but it has no members. All other sets do have one or more members. And elements do not have members either for example epsilon has no members, but the reason epsilon has no members is because it is of a type for which having members makes no sense. We must not confuse the empty string with the empty set. They each have no members, but for very different reasons and their types are not the same. Let's see how the set element distinction applies to states of a non-deterministic finite automaton. First, states of any automaton are elements not sets. The subset construction seems to create a DFA who states, are sets of the states of the NFA. But really, what we're doing is computing sets of NFA states and giving each one a name. This name is the name of a DFA state and each DFA state corresponds to a set of NFA states, but the DFA state is an element with an associated value and the value is the set of NFA states. We can even use the set of NFA states as the name of the DFA state but we should then understand the notation for the set like the set containing p and q, is this, is a string that is the name of a DFA state. Here's another point that has caused some confusion, Automata accept strings. The labels of the paths that lead from the start state to an accepting state. But we also said that an automaton accepts a language. This language is the set of strings that the automaton accepts. If we say automaton A accepts language L then we mean that L consists of all and only the strings that A accepts. Thus, while the typical automaton accepts an infinite number of different strings, it accepts exactly one language, the set of strings that it accepts. Let's remember that the phrase "automaton A accepts language L" means that L is the one language of A. A accepts all the strings in L and A does not accept any other strings. The extreme case of the confusion would be an automaton like the one shown which accepts every string, but only accepts one language, the one we refer to as "zero one star". That's this. That is, it's the language of all strings of zeros and ones. If you don't see the difference between accepting strings and accepting a language, you would erroneously conclude that this automaton accepts every language whatsoever such as the set of zero to the n, one to the n such that n is equal to or greater than one. It doesn't. It accepts this language, the set of all strings of zeroes and ones. I now want to talk about a point of confusion that comes up in several places. What happens when you try to apply an operation to zero things? For example, we know what it means to sum two integers or ten integers but what if we were asked to sum zero integers or if I have a string of zeroes and ones, it makes sense to ask whether the string has an even or odd number of zeroes but if the string is empty? So for example, four + seven + three = fourteen, no problem. If I just want to sum four + seven, cross out the three, again no problem that's eleven. If I just want to sum the four, no seven there, fine: that sum is four. Now if I take the four away, what's left? What is its sum? I claim that in general, we should take the operation applied to nothing to be the identity for that operation. For sum, the identity is zero. That is, zero plus anything is that other thing. This is a well accepted convention. I can't prove to you... that the sum of zero things must be the identity zero, but neither have I seen a reasonable justification for any other convention and if you have one it would be a great topic for the discussion forum of the class. And it is natural and intuitive, as we can see If we write the obvious quote to sum the n elements of an array. Here's what the code looks like: you initialize the sum to zero, there, and then go through the array in a loop. And you go through n times. You'll get the correct sum for any n equal to or greater than one but what happens if n happens to be zero? If that's the case, you never execute the loop. You just jump right around it and the sum is left at zero. And initializing the sum to anything but zero makes no sense and would not give the correct result for n equal to or greater than one unless you did serious contortions to the code. Here are some other examples where the identity for the operator is the only thing that makes sense. The OR of zero propositional variables is false because false is the identity for OR, that is false OR x = x. Similarly, the AND of no propositions is true because true AND x = x. The product of no integers or reals is one because one x = x. And if we concatenate zero strings, we should get the empty string since epsilon concatenated with any string x is x. As an example of why the convention for strings make sense, look at an automaton where the start state is also an accepting state. This automaton has more to it, I just didn't draw it. There is a path from a to a that goes through no arcs. It is of length zero. In general, the string that labels a path is the concatenation of all the strings that come from the input symbols along that path. Again, remember that the characters labeling the arcs are converted to strings for concatenation. But in this case, the path has no arcs so its label must be the concatenation of zero strings, that is, the empty string. That's great because it means that the empty string is accepted by this automaton and since the automaton starts out in an accepting state before it reads any input, that makes sense. Next, let's look at the question of whether zero is odd or even. Curiously, this question causes a great deal of disagreement. For example, I remember one day in 1973 when we had gas rationing, you could only fill up on a day whose number was even, if the digits on your license plate formed an even number. And ditto for odds. But, what if your license plate had no digits? A vanity plate like LOVER, okay. Since legislatures are not Mathematicians, different states have different policies. Some treating a plate like this as odd and others as even. But we know that zero is even because it leaves remainder of zero when divided by two. That is zero is two times an integer namely zero in this case plus a remainder of zero. And anything that leaves, that is of the form two times any integer + zero is even. Now, the empty string has zero of every character and zero is even. So if we ask a question like, does the empty string contain an even number of some character like zero? The answer is yes and if we ask whether it has an odd number of zero, the answer is no, okay. Here's a bonus answer, okay. A few people actually use automaton and automata correctly. Let's learn to be pedants and use them right, okay. Automaton is a singular noun and its plural, an irregular plural because of its Greek origin is automata, one automaton, two automata. But it gets worse, okay? Everybody says a automata theory, never the singular form, automaton theory. But other theories use a singular form of the noun that describes what they are about. For example, physicists talk about String Theory or Quantum Theory, never Strings theory or Quantums theory well, that should be Quanta Theory anyways since Quantum is another irregular plural and don't ask me why. Anyway, see you in class.