Learn more Learn more

Upcoming Deadlines

Homeworks


Announcements

Challenge Problems 5

Problem 1
The Chain-Selection Problem is: given

1. A budget m,
2. A target t,
3. A directed graph G that consists of disjoint chains only (a chain is a sequence of nodes n_1, n_2,...,n_k, such that the only arcs among these nodes are from n_i to n_{i+1}, for i = 1, 2,..., k-1), and
4. A value v_i for each node n_i,

can we select m nodes, whose values sum to at least t, subject to the constraint that if we select a node, then we must select its predecessor, if any, in its chain?

Example: An example of a chain graph has nodes 1, 2,..., 10 and arcs 1→2, 2→3, 3→4, 6→7, 7→8, 8→9, and 9→10. The chains are 1-2-3-4, 5, and 6-7-8-9-10. In this graph, with a budget of 4, we could choose nodes {1, 2, 5, 6}, or {6, 7, 8, 9}, for example, but we could not choose {1, 2, 7, 8}, because it is not permitted to choose 7 without choosing its predecessor 6 in its chain.

Your question: Is the Chain-Selection Problem NP-complete or is it in P? Either give a reduction to show it is NP-complete or give a polytime algorithm to solve it.

Aside: This problem actually surfaced recently as a problem about materialized-view selection in databases. However, we can see it as one of choosing items to purchase in an environment where it doesn't make sense to purchase one item until you have purchased all the previous items on a chain. For instance, one chain might be "a TV" → "cable service" → "high-def decoder box." It doesn't make sense to buy cable service if you have no TV, and it doesn't make sense to get a decoder box unless you have cable service.

Problem 2
The Quadratic Allocation Problem is: given

1. Nonnegative integers a and b,
2. A list of integers i_1, i_2,..., i_k, and
3. An integer limit l (ell),

can we select a subset of the integers on the list i_1, i_2,..., i_k, such that the sum of ai+bi^2 over all selected integers i equals the limit l? Note: all integers are assumed represented in binary for this problem.

Is the Quadratic Allocation Problem NP-complete or is it in P? Either give a reduction to show it is NP-complete or give a polytime algorithm to solve it.

Aside: This problem arose during some consulting I was doing, where the integers represented the sizes of different software jobs, and the quadratic term is there because the cost of implementing software goes up faster than linearly with the size of the job.

We'll post hints for both these problems on the Challenge-Problem forum.
Sat 7 Dec 2013 9:01 AM CET

Materials for Week 6

We're in the home stretch, with one important topic to learn: "NP-completeness," or more properly the "theory of intractable problems." There are three videos for this week. In the first, number 20, we use the Turing machine to define what it means to solve a problem in some polynomial amount of time. A key fact is that, although the Turing machine can be much slower than a real computer, it can only be *polynomially* slower. That means the answer to the question of whether a given problem can be solved in polynomial time has the same answer whether our model of computation is a Turing machine or a real computer. Video 20 also introduces the important class of problems NP. These are the problems that can be solved by a nondeterministic TM in polynomial time. It appears that at least some problems in NP require exponential time on a real computer or deterministic TM, but we don't know that is the case -- and thereby hangs the mystery of the theory of intractable problems.

Also in Video 20 we meet the idea of a polynomial-time reduction, a way to show that if one problem (say a problem in NP) has a polynomial-time algorithm, then another does. That gives us the notion of an NP-complete problem -- one to which every problem in NP has a polynomial-time reduction. The essential property of NP-complete problems is that if any one of them has a polynomial-time solution, then everything in NP has a polynomial-time solution. Since it appears that no NP-complete problem has a polynomial-time solution, a proof that a problem is NP-complete is almost-but-not-quite a proof that it cannot be solved in polynomial time. Commonly, we interpret a proof of NP-completeness as concluding the problem requires more than polynomial time.

In Video 21, we begin the process of proving certain real problems to be NP-complete. We start with Cook's theorem -- that the question of whether a Boolean expression is satisfiable (has an assignment of truth values to its variables that makes the expression true) is NP complete. Steve Cook proved this by showing how the language of any nondeterministic polynomial-time Turing machine could be reduced in polynomial time to the satistfiability problem.

Video 22 does the proof that two common problems: Node Cover and Knapsack are NP-complete. The proofs are by reducing SAT to each of these problems using a polynomial-time reduction.

A fourth video is a "problem session" from the earlier class; it addresses two common questions on undecidability and intractability.

There is a final homework for this session If you haven't done so, please read the class post about the new times for the final, and hard and soft deadlines for each of the homeworks.

We have also posted a final set of challenge problems for you to consider. These let you try to apply what you learned about NP-completeness to decide if two problems that I encountered in my travels are NP-complete or have polynomial-time algorithms.
Sat 7 Dec 2013 9:01 AM CET

Homework Due Dates

In addition to expanding the dates for the final exam (now 00:01 on Mon. Dec. 16 through 23:59 Mon. Dec. 30), I have modified the due dates for some of the homeworks. I was unable to serve to goals simultaneously:

1. Giving people a reasonable length of time to get the HW done.
2. Letting people see the solutions before taking the final.

As it is, you will be able to see HW 1-4 solutions before the final, but HW 5 and 6 solutions will not appear until 23:59 on Sunday Dec. 22. If you really really need to see those solutions, you have the option of deferring the final and still taking it before Christmas. Here is a summary of the hard and soft due dates; all times are 23:59 on the days listed.

Soft Hard
HW1,2 past 12/15
HW3 12/7 12/15
HW4 12/14 12/15
HW5 12/15 12/22
HW6 12/22 12/22
Thu 5 Dec 2013 12:08 AM CET

Final Exam Schedule

I had planned to make the Week starting Monday Dec. 16 be "finals week" for the class. My motivation was to finish before Christmas, when I assume many people will be on holiday. However, on second thought, I think that many in the class are having trouble keeping up the pace, so the new plan will be to offer the option of taking the exam any time in the period Dec. 16 - Dec. 30. We'll open the exam a minute after midnight on Dec. 16, and keep it open until a minute before midnight on Dec. 30. All times are GMT -8:00 (Pacific Standard Time), of course.
Tue 3 Dec 2013 7:51 PM CET

Materials for Week 5

In week 5 we finish the discussion of Turing machines and undecidability. Our goal is to show that certain problems cannot be solved by Turing machines, and therefore cannot be solved by computers in general (i.e., they are "undecidable"). We start with manufactured problems about Turing machines and use these to prove that some "real" problems -- those not posed simply as examples of undecidable problems -- are also undecidable. A key idea is that of "reduction" from one problem to another. If a problem P is known to be undecidable, and there is a way to use a hypothetical solution to another problem Q to solve P, then we have proved that Q is also undecidable.

There are only three videos this week, but they are in aggregate about as long as the four videos we have been showing in each of the past weeks. Video 17 explores Turing machine theory, showing that some obvious restrictions and obvious generalizations all define exactly the recursively enumerable (RE) languages. This video also looks at the two classes of languages defined by Turing machines: the RE languages and the recursive languages; the latter are languages defined by Turing machines that always halt. We show that each of these language classes have closure properties similar to those of the regular languages or context free languages -- with some differences, of course.

In video 18, we introduce our first undecidable problems, starting with a diagonalization over all Turing machines. Then, in video 19 we explore some real undecidable problems. We prove Rice's theorem, which lets us conclude that essentially everything about what Turing machines (or computer programs) do is undecidable. We introduce Post's Correspondence problem (PCP), a question that does not appear to have anything to do with computation, and prove PCP is undecidable. Finally, we use PCP to prove some questions about CFG's are undecidable; an example is whether a CFG is ambiguous.

There is one new homework for this week, and we'll offer another challenge-problem set (optional).

Finally, there is a third problem-session video, discussing the "HALF" problem and some points that were raised in the discussion forum from the first offering of the class.
Sat 30 Nov 2013 9:01 AM CET

Challenge Problems 4

Problem 1
Let L be the set of Turing machine codes M such that L(M) contains at least two strings. By Rice's theorem, L is not recursive. Show that L is recursively enumerable by constructing a Turing machine that accepts L. The description can be informal. Just remember to give the key ideas so the reader is convinced you understand how the TM works. A Hint is available.

Problem 2
Devise an encoding with a finite alphabet for all context-free grammars with terminal alphabet {0,1}. Remember that CFG's may have any number of variables, so you will have to find a naming system for them. Incidentally, you can extend the encoding (but you don't have to in this problem) so the grammars may have any terminal alphabet. But since the encoding must use a finite alphabet, you would not be able to retain the identity of the terminal symbols. Thus, for example,{0^n 1^n | n >= 1} would look like the same language as {a^n b^n | n >= 1}. For most purposes that is not important. For example the test for emptiness of a CFL does not really depend on what the terminals are named.
Sat 30 Nov 2013 9:01 AM CET

New soft deadline for HW2

In view of the original unusual soft deadline time of homework for week 2 (both parts) and Thanksgiving holiday, we have decided to shift the soft deadline of homework for week 2 (both parts) to Sat., Nov. 30, 11:59pm, PST. The hard deadline is still Dec. 15, 11:59pm, PST. We apologize for the confusion.
Thu 28 Nov 2013 6:55 PM CET

Materials for Week 4

We continue with our study of context-free languages this week, and we'll also start the subject of Turing machines. Video 13 finishes up pushdown automata, with the proofs that they can be used to define all and only the context-free languages. Video 14 is devoted to the pumping lemma for CFL's, and 15 covers both closure and decision properties.

Then, video 16 introduces Turing machines. The Turing machine is the common model for what it means to be able to compute something, and they have played this role since before there were computers. We give the definition of this type of automaton, but most of the time is spent talking about some subtle points necessary to understand what Turing machines can and cannot do: the fact that everything can be encoded as integers or binary strings, and the difference between Turing machines that are guaranteed to halt on any input and those that are not.

There is a new homework, and a second optional programming project. The latter is based on the CYK algorithm for deciding whether a string is a member of a CFL, as discussed in video 15.
Sat 23 Nov 2013 9:01 AM CET

Challenge Problems 3

Problem 1:

Arithmetic expressions with operator + and parentheses can be generated by the grammar

E -> E+E | (E) | a

where a stands for any number. This grammar is ambiguous.

(a) Give an example of a string that has two or more leftmost derivations or parse trees.
(b) Design an unambiguous grammar for the same language.

Problem 2:

The operation HALF(L), discussed in a previous challenge problem, is { w | for some x with the same length as w, wx is in L}. While regular languages are closed under HALF, CFL's are not. Give an example of a CFL L such that HALF(L) is not a CFL. Prove that your language L is a CFL by describing a CFG or PDA for L. Prove that HALF(L) is not a CFL by whatever means you find appropriate, e.g., using the pumping lemma and/or applying some CFL-preserving operation to HALF(L) to produce a language known not to be a CFL. We'll provide a hint later on the forum for challenge problems.
Sat 23 Nov 2013 9:01 AM CET

Pass With Distinction

There has been some discussion of whether there should be the possibility of getting a "statement of accomplishment" with a notation of "with distinction." I've discussed this with ChenGuang Zhu, and we agree that it is appropriate to have such a level, but it should be pretty hard to obtain. So we will general statements "with distinction if you get at leave 85% of the marks. Remember that marks are determined half by your scores on the homeworks (max score for each homework) and half by your score on the final. Since you can, with some effort, get 100% on the homeworks, we're really expecting 70% on the final exam -- not easy but doable. We are still planning to add a notation for those who do both programming projects successfully, but the scores on the projects do not count toward distinction.
Sun 17 Nov 2013 3:13 AM CET

Challenge Problems 2

This week, we have two challenge problems. The first is of medium difficulty, and the second is hard.

Problem 1:

Develop an algorithm that given:

1. A finite automaton A,
2. Two particular states p and q of A, and
3. A regular expression E,

tells whether there is any string in L(E) that takes state p to state q in A.

Problem 2:

Define HALF operation on a language L:

HALF(L) = {x | there exists a string y such that xy is in L and |x|=|y| }

Prove that if L is regular, so is HALF(L).

Example: if L = {0010, 10, 010}, then HALF(L) = {00, 1}. Note that 010, being of odd length, yields nothing for HALF. 0010 yields its first half: 00, and 10 yields its first half: 1.

Note: We are going to post some hints on the forum for challenge problems. You are free to read them or not as you wish. As with all challenge problems, they will not be graded, but you are free to discuss them on the forum.
Sat 16 Nov 2013 9:01 AM CET

Materials for Week 3

This week, we introduce context-free grammars (CFG's). These grammars define the context-free languages (CFL's), a superset of the regular languages. They are an important tool in both compiler construction and in natural-language processing.

In Video 9, we introduce the CFG notation and the related Backus-Naur Form (BNF) notation used to describe most programming languages' syntax today. Video 10 teaches about parse trees and the problem of ambiguity. Video 11 covers simplifications of CFG's -- ways to eliminate certain problematic structures in grammars. Finally, in Video 12, we meet the pushdown automaton, a finite automaton with an additional memory capability: a stack data structure. These automata recognize exactly the CFL's, as we shall learn in week 4.

There is one new homework for week 3.

We also posted two new challenge problems. These are class announcements and also present in the class wiki. They are not graded, but you can discuss them on the Forum created for this purpose.

Incidentally, class registration is still open, but closes at the end of week 3. So if you have a friend who was planning to jump in late, tell them now is the time to jump in.
Sat 16 Nov 2013 9:01 AM CET

Problem Session 2

TA Chenguang Zhu has recorded a video for the second problem session. In the video, Chenguang answers common questions raised by students during the second week of study, as well as shows the solution for challenge problem 1. It will be available on this coming Saturday (Nov. 16, 12:01am, UTC -8:00) in the video lecture. We hope this can help you better understand Week 2's material.
Tue 12 Nov 2013 10:35 PM CET

Challenge Problem 1

Here is the first challenge problem. Feel free to discuss, in the Forum devoted to Challenge Problems, your ideas and solutions.

Let L be the language with alphabet {0, 1, 2} consisting of strings that do not have any three consecutive 0's, any three consecutive 1's, or any three consecutive 2's. Prove that L is a regular language (hint: design automata or regular expressions for some simpler languages and then use closure properties of regular languages to get L). Harder is to design a DFA A for which the language is L itself, but we encourage you try to design one as a second part of this exercise.

Note: this problem is optional and does not count toward your class score.
Sun 10 Nov 2013 9:01 AM CET

Problem Session 1

In the first edition of the Automata course, we recorded some comments that arose from misconceptions that certain students had. We no longer have the ability to develop videos quickly in response to some of the interesting discussions we have had in the fora for the current course. However, what we said last year may be of use to some of you, so we are going to post these videos at the appropriate time. The first one should now be available.
Sun 10 Nov 2013 9:01 AM CET

Materials for Week 2

We have posted the materials for week 2 of Automata. These consist of four videos, two homework sets, an optional programming project, and an optional challenge problem (which is in the form of a class announcement). The videos will finish the work on regular languages. In the first of these (video 5), we introduce regular expressions, and in video 6 we learn that the languages definable by regular expressions are all and only the regular languages. That is, regular expressions and finite automata of all kinds define the same languages.

Video 7 covers what are called "decision properties" of regular languages. We can answer many questions about the languages represented by regular expressions or finite automata that we cannot answer about programs in general. The existence of algorithms to do things like tell whether the language of a finite automaton is empty, or to find a minimum-state equivalent to a given finite automaton is one of the things that makes these automata so useful. Finally in the 8th video we learn about closure properties of regular languages: the fact that we can combine regular languages using many common operations,such as union or intersection, and get another regular language.

I apologize for giving a double homework this week, Because we allow essentially infinite tries, it is important that we bundle questions into groups of about 5. If we grouped them into smaller groups, you could just keep guessing A on each question until you were right. and if we grouped a larger number together, it would be too painful to repeat a whole set if you got one wrong. Thus, we cannot give the problems out in an exactly equal number each week. Since we are fairly liberal in setting deadlines, we hope you will not feel overwhelmed this week.

Also for your consideration is an optional programming project, involving code to translate from regular expressions to epsilon-NFA's, and a challenge problem about a certain regular languages. You are welcome to discuss these, and there are discussion forums for each.

Note: because of a number of requests, the videos and homeworks were released early, We'll probably continue to release these at the beginning of the weekends, rather than the ends, but I don't want to just throw everything at you all at once. Due dates have not changed.
Sat 9 Nov 2013 9:10 PM CET

Course Survey/No Certificates

1. Stanford has asked that all members of the Automata MOOC take two surveys, one at the beginning and one at the end. To take the first survey, please go to Course Wiki on the left menu (the entry above "Final Exam," not the link below it). You will see a table of contents, and if you click on the last one "Initial Survey," the survey will start. Apparently the Wiki link is broken, and I don't know why. But I should have just given you the survey link: https://stanforduniversity.qualtrics.com/SE/?SID=SV_cCOKz9D3vTiHOG9

2. Stanford has also decreed that I am not to refer to the document you will receive on completion of this course as a "Certificate." Rather, it is now called a "Statement of Accomplishment," It will look just like last year's documents, however.

Regards.
---jdu
Mon 4 Nov 2013 8:48 PM CET

Welcome to "Automata."

I want to welcome everyone to the free on-line automata course I have developed. First, I want to acknowledge the help of Zhu ChenGuang, without whose help I never would have been able to put these materials together. ChenGuang will be joining me as a "teaching assistant" for this course and will help in monitoring the class discussions. Both he and I will contribute comments when we think it is appropriate, but most of the time, students will resolve their own discussions.


The course will have a number of homeworks that are designed using the Gradiance technology. The objective of these homeworks is to enable everyone to get 100% and learn the underlying material. While questions look like multiple choice, you should think of them as more conventional "solve this problem and submit the solution" questions. That is, you are given a problem to solve, which you should work completely. Then, you are given a random choice of responses that are designed to figure out whether you got the right solution or not. If you do have the right solution, you should be able to answer the question easily, regardless of the choices presented. If you get it wrong, you will be given a hint and allowed to try again. Your score on a homework is the maximum of any try. We group about 5 questions together, so you can't repeatedly guess each question independently, without actually doing the work.


Each week you will have about 2 hours of video to watch, and either one or two groups of homework questions. The time taken by the videos is about half the time I took to deliver the same material in class. The reason the video goes faster than in-class presentation is that a lot of the "ums" and fumbling are edited out. The speed can be good, if the material is not dense or hard to follow. However, there will undoubtedly be points where the going is rough -- details of a proof or an algorithm. In those cases, I encourage you to take advantage of the fact that the lectures are on video, to repeat material, or pause it to stare at a slide for example.


In the first week's videos, you will begin with an introduction to the whole course. I want to try to convince you of the value of learning the four big concepts that you can take away from this course: finite automata, context-free grammars, undecidable problems, and intractable problems (NP-completeness). The second video is an informal introduction to finite automata. Both these two first videos are "light," and I expect you to have little trouble. The third video introduces deterministic finite automata, and at this point we start to get more formal. In the fourth video, the important concept of nondeterminism is introduced. We learn the remarkable fact that despite the almost "magic" capability of nondeterminism, it does not add power to the finite automaton (although it does make description of many applications of automata a lot easier).


One reminder to bear in mind. In order to get a certificate, you need to get 50% of the marks. Half the marks come from the homeworks, and the other half from the final. You can even pass the course by doing all the homework correctly,and since you are allowed to make multiple tries, that is not impossible. However, the first time I gave this course, a number of people figured they could ignore the homework and just do the final. No one got 100% on the final, so please take the homework seriously.


There will also be two optional programming assignments. These do not affect your grade, but if you do both of them successfully, your certificate will make note of that fact.

Mon 4 Nov 2013 9:01 AM CET