Now we are ready to show some particular problems not that have algorithms. The central ideas of this lecture are first, that turn machines can be enumerated. So we can talk about the [inaudible] Turing machine that let's us diagonalize over Turing machines the way we diagonalized over languages, showing a particular language that cannot be the language of any Turing machine. Then we establish the principal that a problem is really a language And we show specific problems not to have [inaudible] machines. To enumerate [inaudible] machines we're going to develop the specific representation of [inaudible] machines as binary strings. It is sufficient to provide a code for touring machines who's input output is 01. Although we could encode others if we wanted to. The reason that it is sufficient to assume only zero and one are inputs is that touring machine codes will be binary strings. So it can focus on touring machines on input output 01 as inputs to other touring machines with the same alphabet. The first thing we need to do is provide integer codes to the components of the touring machine. We give integers one, two, three and so on to the states. We assume that Q1 is always the starred state, and we also assume that Q2 is the one final state of the Turing machine. Notice that once Turing machines enter the final state, nothing further matters. So we can always merge all final states into one. Thus we can restrict our attention to touring machines with a single final state and know that we can still define any recursively enumerable language. The other states will be numbered three, four and so on. The tape symbols that are also numbers starting at one. We'll assume X1 is always the input symbol zero and X2 is always the input symbol one. X3 will always be the blank and any other tape symbols are numbered four and above. They're two directions but we need to number them. D-, D1 is left and D2 is right. Consider a rule expressed using the integer numbering of the components, that is the state QI scanning symbol XJ well suppose the touring machine goes to state QK right symbol XL and moves in the direction Ds of M. Represent this rule with blocks of IJKLM0s separated by single ones. Notice that all the integers are positive so the representation for a rule never has consecutive ones. With them represented in entire tour machine by all its rules concatenated and separated by pairs of ones. Once we have Turing machines represented by binary strings we can convert these strings to unique integers using the trick we explained a while ago. Put a one in front of a binary string and treat the result as a binary integer. Thus, we can talk about the [inaudible] Turing machine. And of course we earlier learned we can talk about the [inaudible] binary string Small matters that some binary represent flawed turn machines for example, it might have a pair of double ones with other then four single ones between them. That would not represent five blocks of 0s and therefore would not represent any [sound]. However let's assume that any binary string that is flawed represents a turn machine excepts the empty language. Likewise, if I is the integer we get from such a string then the Ith turn machine accepts the empty language. So here is a table we're letting Turing machines to the strings they accept. The value in row I and column J is one of the [inaudible] Turing machine accepts the [inaudible] string. And zero if it does not. Notice that if the [inaudible] Turing machine is one of the flawed ones, then it's rows is all zeros. Any matrix of zero and ones with rows and columns corresponding to all the integers can be diagonalized over. That is, we can construct an infinite sequence of zeros and ones and call it D. Such that the [inaudible] of D is the compliment of the bit in position [inaudible] along the diagonal. We can argue that D is not a row of the matrix and therefore does not represent the language accepted by any Turing machine. De cap your row J because it disagrees with the Jth row in their Jth entries. Thus D can not be a row and therefore the language it describes, the language that contains the Ith string, if and only if, the Ith bit of D is one is not the language of any Turing machine. Notice that this language can be described as the set that contains the Ith string, if an only if, the Ith Turing machine does not accept the Ith string. Let's give a name to this language, L sub D with a diagonalization language. Again L sub D is the set of binary strings W. Such that W is the [inaudible] string for some I. And the [inaudible] Turing machine does not accept W. We just argued that L sub D is not a recursively re-numerable language, that is it has no Turing machine. We early approve that since there are more languages than integers and since there are no more turn machines than integers, we already know that there were languages with no turn machine but now we're much better off. We have a particular language L sub D and a description of this language such that L sub D is one of the languages without a turn machine. Note, however that L sub D is a delicate language. We know what it is but we can't, for example, tell whether a given binary string W is in L sub D. To do so, we can figure out the I's such that W is the [inaudible] string. That's the easy part. Now we can write down the code for the [inaudible] Turing machine. In fact, that code is W. If W is the code for a flawed Turing machine, we know it doesn't accept W. But some W's give good codes for Turing machines, and in fact every Turing machine with input alphabet, zero, and one has at least one code W. We can't always figure out whether a Turing machine is gonna accept or run forever without accepting. So for at least some W's we've never learned whether or not that W is an L sub D. Let us introduce the formal concept of a problem. In formally a problem is a yes/no question about an infinite number of possible instances of the problem. Here's an example of a problem that is actually quite famous. And we'll see why soon. The instances of the Hamilton Cycle problem are undirected graphs. There are an infinite number of graphs of course. The answer to the question implied by the Hamilton Cycle problem is yes. If there is a Hamilton Cycle in the graph That is a cycle that passes through each node exactly once. For example, For example, here's a graph that happens to have a Hamilton cycle. It is also got some other edges here and there. But the important thing is that it does have the Hamilton cycle in which every node appears once. So formally a problem is simply a language over some alphabet sigma. Each string and sigma star can be viewed as an instance of the problem once we decide on an encoding for instance as a strings. We'll see a number of examples of these encodings, but we already saw one when we encoded Turing machines as binary strings. It should not be hard for you to devise an encoding for graphs in a similar spirit. In the language associated with the problem, is the set of strings that code instances to which the answer is yes. Typically as we did for turn machines our coding allows certain strings that are flawed. They don't really represent an instance. Well always assume that flawed encodings represent instances for which the answer is no. For example as a problem, the language LD can be stated. Does this turn machine not accept its own code? When we talk about problems, we use the term decidable. It means that there is an algorithm to answer its question. That is a turn machine that accepts the encoded instances for the problem for which the answer is yes and also halts without accepting the other instances. So a decidable problem is really the same thing as a recursive language if we think of the language as encoding a problem. The opposite of decidable is undecidable. So here is what we know about languages so far. In the center we see the recursive languages or as problems, the decidable problems. Then there is a super set of the recursive language called the recursively enumerable languages. These recall other languages accepted by Turing machines with no guarantee that they will halt on inputs that they never accept. And then there is outer space. The uncountably many languages that are not recursively enumerable. They have no Turing machine at all. So far we have one example of a language, L sub D, that lives in this region. Remember that the undecidable problems are all those in either the second ring. Recursively enumerable but not recursive languages. Or an outer space that is everything that is not yellow in this diagram and a big question we need to answer, are there any languages in the second ring? Those that are recursively and enumerable but not recursive. And remember, the real goal of our plan is to show some real problems that are undecidable. The fact that L sub D is undecidable and in fact super-undecidable because it is not even recurs-ably enumerable is interesting but it doesn't by itself tell us anything about the real world. So here are some examples of real problems that are undecidable. Will a program ever reach a particular line of code? Is the given context re grammar ambiguous? Are two given grammar?s equivalent in the sense that they generate the same language? But still staying within the world of Turing machines rather than the real world, but a necessary way station on our trek to the real world is to show a particular language to be recursively enumerable but not recursive. This is the language we call L sub U, the Universal Turing machine language. In more detail, the Universal Turing machine takes its input a binary string consisting of a code of some Turing machine M and some input W for M. Universal Turing machine accepts the coded M and W, if and only if, M accepts W. The idea of the universal Turing machine should not seem strange if you ever contemplated a Java virtual machine. The JVM takes a coded Java program and input for the program and executes the program on the input. In fact, the JVM is more general on the capability than a Turing machine which can only make a single except output. The JVM can cause whatever output the program calls to be made. So let's see how to build an universal Turing machine. First of all, inputs to the Universal Turing machine are in the form: a code for the machine M, three ones, and then the binary string W. Since a valid code for M can never have three consecutive ones, it is never ambiguous what part of the input to the Universal Turing machine is M and what part is W. The Universal Turing machine accepts it's input if and only if that input has a valid code for some Turing machine M and that Turing machine accepts W. So for example, the Universal machine never accepts a string that doesn't have three consecutive ones. We'll design the universal Turing machine as a multi-tape machine. The first tape will hold the input and we never change that. The second tape is used to represent the current tape of M during the simulation of M with input W. We'll discuss this representation shortly. The third tape of the Universal machine simply represents the state of M. The first thing that the Universal machine needs to do is to examine its input and particularly that portion that represents M. As to check that between consecutive double ones there are always five blocks of zeros. There also has to check that a block of three ones appears somewhere on its input, and regards this as the end of rules for M. Finally since M is assumed deterministic, the universal machine needs to check that there are never two rules that agree on the first two components. All this checking will require running back and forth on the input quite a bit. It can be facilitated by copying blocks of 0s onto one of the other tapes and comparing these with the other representation for M. If any flaws are found with this code for M or the 111 is not found then M is regarded as a Turing machine that accepts nothing. So the universal Turing machine immediately halts and rejects its input. Assuming the code for M is valid, the universal Turing machine next examines the code for M to determine how many squares of its own tape its need to represent one simple events tape. That is we discover the longest block of 0s representing a tape symbol and add one to that for a marker between symbols of Ms tape. Thus if say X7 is the highest numbered symbol then we'll use eight squares to represent one symbol of M. Symbol XI will be represented I 0s and seven minus I blanks followed by a marker outside. For example, here's how we would represent X5. Five 0s Two blanks and then the pound sign. Now a [inaudible] is take two to represent the input W. Remember 0s are X1 and 1s are X2. Blanks on the second tape of the universal machine all represent X to X3, the blank of M. But we won't initialize those squares until we need to. Finally we initialize tape three to hold the start state. That state is always Q1 so it is represented by a single zero. Now we're ready to simulate M. We have the current state on tape three and the tape of M is represented on tape two. We scan the moves the M on tape one until we find a move that matches both the state and this tape symbol. If we can't find one then apparently M halts without accepting W so the universal machine does so as well but if we find a match, we'll find right after that on tape one, the new state which we install on tape three replacing the old state. We also find a new tape symbol for which to replace the old tape symbol under the head of tape two. And we also move the tape head one simulated square of M's tape left or right, whichever the move says. And most important, if MM enters an accepting state then the universal machine stops simulating and accepts zone input which, remember, is the pair machine M plus input string W. We claim that the language L sub U is recursively enumerable but not recursive. We just show that there is a turn machine for the language L sub U so surely LU is enumerable. But suppose L sub U were recursive. That means there's a turn machine that always halts and who's language is L sub U. If that were the case then L sub D the diagonalization language would also be recursive. We're going to explain why on the next slide. But we already know that L sub D isn't recursively enumerable, let alone recursive. So let's assume that L sub U is recursive. We construct an algorithm for L sub D as follows. We're given an input W. Let's suppose that W is the Ith string. The first thing to do is to check whether or not W is a valid code for a turn machine. For example that it doesn't have three consecutive 1s. If the code is not a valid string, Then the Ith Turing machine defines the empty language. That means W, the Ith string, is not in the language of the Ith Turing machine. Therefore, W is [inaudible] D. Now suppose W is a valid code for a turn machine. Then simulate the hypothetical algorithm for L sub U on the input W111W. That bit string represents the Ith turn machine processing the input that is the Ith string. Eventually this algorithm will halt and tell us whether or not the Ith machine accepts the Ith string. If the algorithm says yes, the Ith machine accepts the Ith string then we say no because that means W is not in L sub D. However if the hypothetical algorithm says no, then we accept W because W is in L sub D. With proof that there is no algorithm or any kind of turn machine for L sub D therefore we must blame our assumption. The only thing we assume without proof was that there was an algorithm for L sub U. That means that there really isn't an algorithm for L sub U. Put this facts together and we conclude that L sub U is really recursively enumerable but not recursive. So here is our improved version of the universe of languages. We still have the decidable problems of equivalently recursive languages in the center. Outside there are two kinds of undecidable problems. The second ring is the languages that are recursively enumerable but not recursive. We now have a concrete example of such a problem L sub U, the universal turn machine language. And beyond that is the not recursively enumerable languages of which we have one concrete example, L sub D.