Announcements

Final exam info

1. There are 20 questions, two points each, for a total of 40 points. All questions are about material covered in the required (i.e., non-advanced) videos. Over half of the questions are directly from the lecture material. Several questions are repeats or minor variations of previous problem set questions. There is a small number of questions that you haven't seen before (at the level of new problem set questions).

2. You should be able to complete the exam in 90 minutes. To address any potential technical issues --- dropped Internet connections, etc. --- I'm doubling the amount of time and allowing you 3 full hours to complete the exam. Due to the number of students in the course, we won't be able to grant additional special accommodations for anybody.

3. You can take the exam anytime between now and the end of March 24th (US California time, as usual).

4. Unlike the problem sets, you have only ONE attempt, and you MUST submit your answers within 3 hours of beginning the exam. Thus DO NOT START THE EXAM until you are confident that you can concentrate on it for the following three hours.

5. When you are taking the exam, be sure to save your answers frequently. To be safe, I suggest that you submit your final answers several minutes before the time expires.

6. Certificates of accomplishment will be processed at some point after March 24th. Details will be sent in a separate email.
Sun 10 Mar 2013 5:11 PM CET

Week 6 Overview

SUMMARY: This week wraps up our discussion of data structures. We'll finish with a bang by covering hash tables --- certainly one of the most important data structures --- in detail. First (Part XIV) we discuss hash table basics --- the supported operations, the types of problems they're useful for, and an overview of implementation approaches. Part XV has one required video about "pathological data sets" for hash functions, and three optional videos that cover great stuff --- how randomization dodges pathological data sets, a construction of simple hash functions that are guaranteed to (in expectation over the choice of a random hash function) spread every data set out evenly, and performance analyses of hashing with both chaining and open addressing. Part XVI contains two required videos on the implementation and performance of Bloom filters, which are like hash tables except that they are insanely space-efficient and occasionally suffer from false positives. Finally, the optional Part XVII contains a sampling of the videos from the sequel course (Algorithms: Design and Analysis 2, to be offered later in 2013) --- I hope these get you excited to continue your algorithmic studies!
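
For the curious, here's a minimal Python sketch of a Bloom filter in the spirit of the Part XVI videos. The particular choice of k salted SHA-256 hash functions is just an illustrative assumption, not the lectures' construction:

```python
import hashlib

class BloomFilter:
    """A minimal Bloom filter: k hash functions over an m-bit array.
    Supports insertion and lookup; lookups may return false positives
    but never false negatives, and no items are ever stored."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _indices(self, item):
        # k "different" hash functions, simulated by salting one hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for idx in self._indices(item):
            self.bits[idx] = True

    def __contains__(self, item):
        # A false positive occurs when all k bits happen to be set.
        return all(self.bits[idx] for idx in self._indices(item))
```

Note the space efficiency: membership for any number of items costs only m bits total, at the price of a tunable false-positive rate.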

THE HOMEWORK: Problem Set 6 should solidify your understanding of hash tables and search trees. In the programming project you'll implement two slick data structure applications that we've seen over the past couple of weeks (the 2-SUM problem via hash tables, and median maintenance via heaps).
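
The median maintenance idea from the project --- keep the lower half of the numbers in a max-heap and the upper half in a min-heap --- can be sketched as follows (a Python sketch using the standard heapq module, with the max-heap simulated by negating values):

```python
import heapq

def running_medians(stream):
    """After the i-th number arrives, report the ceil(i/2)-th smallest
    seen so far. Each update costs O(log i) heap work."""
    low, high = [], []   # low: max-heap (negated); high: min-heap
    medians = []
    for x in stream:
        if low and x > -low[0]:
            heapq.heappush(high, x)
        else:
            heapq.heappush(low, -x)
        # Rebalance so len(low) is len(high) or len(high) + 1.
        if len(low) > len(high) + 1:
            heapq.heappush(high, -heapq.heappop(low))
        elif len(high) > len(low):
            heapq.heappush(low, -heapq.heappop(high))
        medians.append(-low[0])   # max of the lower half
    return medians
```
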

THE FINAL: The final exam will be given during the window March 11-24. You should take it during that window at a time convenient for you. Details to follow.
Fri 8 Mar 2013 1:29 PM CET

Week 5 Overview

SUMMARY: This week covers two topics. The first topic (Part XI) is the climax of all our work on graph search --- Dijkstra's shortest-path algorithm, surely one of the greatest hits of algorithms. It works in any directed graph with non-negative edge lengths, and it computes the shortest paths from a source vertex to all other vertices. Particularly nice is the blazingly fast implementation that uses a heap data structure. Speaking of data structures, they constitute the second topic of the week (Part XII). This is a huge (and hugely useful) topic, and the sixth and final week will also be devoted to them. This week covers heaps and balanced binary search trees. My primary goals here are to teach you the operations that these data structures support (along with their running times), and to develop your intuition about which data structures are useful for which sorts of problems.

THE VIDEOS: There are four videos on Dijkstra's shortest-path algorithm (Part XI). Three are required (the algorithm, examples, and the running time analysis); the proof of correctness is recommended but optional. Part XII begins with a short overview video in which I explain my approach to teaching data structures in this course. The next two videos discuss heaps and some of their applications (required), and some details about how they are typically implemented (optional, recommended for hardcore programmer/computer science types). Part XIII discusses balanced binary search trees --- the supported operations and canonical uses (required) and a glimpse at what's "under the hood" (optional).

THE HOMEWORK: Problem Set 5 should solidify your understanding of Dijkstra's shortest-path algorithm and heaps (balanced binary search trees will appear on Problem Set 6). In the programming project you'll implement Dijkstra's algorithm. You can just implement the basic version, but those of you who want a bigger challenge are encouraged to devise a heap-based implementation.
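
For those attempting the heap-based version, here is one common sketch in Python --- the "lazy deletion" variant, which pushes duplicate heap entries and skips stale ones when they're popped, rather than the delete-and-reinsert bookkeeping from the lectures:

```python
import heapq

def dijkstra(graph, source):
    """Heap-based Dijkstra. `graph` maps vertex -> list of
    (neighbor, length) pairs; edge lengths must be non-negative.
    Returns a dict of shortest-path distances from `source`."""
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue          # stale entry: v was already finalized
        done.add(v)
        for w, length in graph.get(v, []):
            if d + length < dist.get(w, float("inf")):
                dist[w] = d + length
                heapq.heappush(heap, (dist[w], w))
    return dist
```
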
Sat 2 Mar 2013 1:18 PM CET

Week 4 Overview

SUMMARY: Week 4 is all about graph search and its applications (Part X). We'll cover a selection of fundamental primitives for reasoning about graphs. One very cool aspect of this material is that all of the algorithms that we'll cover are insanely fast (linear time with small constants), yet it can be quite subtle to understand why they work! The culmination of these lectures --- computing the strongly connected components of a directed graph with just two passes of depth-first search --- vividly illustrates the point that fast algorithms often require deep insight into the structure of the problem that you're solving. (If you're feeling REALLY motivated, you might read up on Tarjan's earlier algorithm for computing SCCs that needs only a single DFS pass!)

THE VIDEOS: We begin with an overview video, which gives some reasons why you should care about graph search, a general strategy for searching a graph without doing any redundant work, and a high-level introduction to the two most important search strategies, breadth-first search (BFS) and depth-first search (DFS). The second video discusses BFS in more detail, including applications to computing shortest paths and the connected components of an undirected graph. The third video drills down on DFS, and shows how to use it to compute a topological ordering of a directed acyclic graph (i.e., to sequence tasks in a way that respects precedence constraints). The fourth and fifth videos cover the description and analysis, respectively, of the aforementioned algorithm for computing SCCs. A final optional video --- hopefully one of the more fun ones in the course --- discusses the structure of the Web, and lightly touches on topics like Google's PageRank algorithm and the "six degrees of separation" phenomenon in social networks.
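
The DFS-based topological ordering from the third video can be sketched in a few lines of Python (recursive here for clarity; for very large graphs you'd want an iterative version):

```python
def topological_order(graph):
    """Topological ordering of a DAG via DFS: a vertex is appended only
    after all of its outgoing neighbors have finished, so reversing the
    finishing sequence respects every precedence constraint.
    `graph` maps vertex -> list of successors."""
    order, visited = [], set()

    def dfs(v):
        visited.add(v)
        for w in graph.get(v, []):
            if w not in visited:
                dfs(w)
        order.append(v)   # v finishes after everything it points to

    for v in graph:
        if v not in visited:
            dfs(v)
    return order[::-1]    # decreasing finishing times
```
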

THE HOMEWORK: The problem set (due in roughly 2 weeks) should help you solidify your understanding of graph representations and graph search. The programming assignment asks you to implement the SCC algorithm from the lectures, and report your findings about the SCCs of a large graph. This programming assignment is the most difficult one of the course; as always, I encourage you to use the discussion forums to exchange ideas, tips, and test cases.
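
To help you get oriented on the assignment, here is a compact Python sketch of the two-pass SCC algorithm (the variant that runs the first DFS pass on the reversed graph; DFS is written iteratively, since the recursion limit is a real issue on graphs of the assignment's size):

```python
from collections import defaultdict

def scc(edges):
    """Two-pass SCC: (1) DFS the reversed graph to compute a finishing
    order; (2) DFS the original graph in decreasing finishing order.
    Each second-pass DFS tree is one strongly connected component."""
    graph, rev = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)
        nodes.update((u, v))

    def dfs(adj, start, seen, out):
        # Iterative DFS; appends vertices in order of finishing time.
        stack = [(start, iter(adj[start]))]
        seen.add(start)
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:               # all neighbors explored: v finishes
                stack.pop()
                out.append(v)

    # Pass 1: finishing order on the reversed graph.
    seen, order = set(), []
    for v in nodes:
        if v not in seen:
            dfs(rev, v, seen, order)

    # Pass 2: peel off SCCs in decreasing finishing order.
    seen, comps = set(), []
    for v in reversed(order):
        if v not in seen:
            comp = []
            dfs(graph, v, seen, comp)
            comps.append(sorted(comp))
    return comps
```
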

OPTIONAL THEORY PROBLEMS: A third batch has been posted.
Mon 18 Feb 2013 6:41 PM CET

Week 3 Overview

VII --- We've posted a second part of the probability review. It's pretty short, just two topics, although quite tricky ones! (Namely, conditional probability and independence.) You need to review this material (via this video or some other source, as you wish) before studying the analysis of the randomized contraction algorithm (see below).

VIII --- Linear-Time Selection (Required). These lectures study the problem of computing the ith smallest element of an input array (e.g., the median). It's easy to solve this problem in O(n log n) time using sorting, but we can do better! The required material goes over a super-practical randomized algorithm, very much in the spirit of QuickSort, that has *linear* expected running time. Don't forget that it takes linear time just to read the input! The analysis is somewhat different than what we studied for QuickSort, but is equally slick. Basically, there's a cool way to think about the progress the algorithm makes in terms of a simple coin-flipping experiment. Linearity of expectation (yes, it's back...) seals the deal.
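
The randomized selection algorithm can be sketched as follows (this sketch uses extra space for the three-way split, for clarity; the lecture version partitions in place, QuickSort-style):

```python
import random

def rselect(arr, i):
    """Return the i-th smallest element of arr (i starts at 1).
    Random pivoting gives linear expected running time: each
    "good" pivot shrinks the remaining array by a constant factor."""
    arr = list(arr)
    while True:
        if len(arr) == 1:
            return arr[0]
        pivot = random.choice(arr)
        less = [x for x in arr if x < pivot]
        equal = [x for x in arr if x == pivot]
        if i <= len(less):
            arr = less                       # answer is left of the pivot
        elif i <= len(less) + len(equal):
            return pivot                     # the pivot is the answer
        else:
            i -= len(less) + len(equal)      # recurse right, shift rank
            arr = [x for x in arr if x > pivot]
```
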

VIII --- Linear-Time Selection (Optional). I couldn't resist covering some advanced related material. The first is an algorithm that has more Turing Award-winning authors than any other algorithm that I know of. It is a deterministic (i.e., no randomization allowed) linear-time algorithm for the Selection problem, based on an ingenious "median-of-medians" idea for guaranteeing good pivot choices. The second optional topic answers the question "can we do better?" for sorting, unfortunately in the negative. That is, a counting argument shows that there is no "comparison-based" sorting algorithm (like MergeSort, QuickSort, or HeapSort) with worst-case running time better than n log n.

IX --- Graphs and the Contraction Algorithm. The second set of lectures for this week is a segue from randomized algorithms to graph algorithms. We first review graphs and the most standard ways of representing them (most commonly, by adjacency lists). We then consider the random contraction algorithm, discovered by Karger "only" 20 years ago (while a PhD student here at Stanford). This algorithm solves the minimum cut problem --- given an undirected graph, separate the vertices into two non-empty groups to minimize the number of "crossing edges". Such problems come up when reasoning about, for example, physical networks, social networks, and images. This algorithm was perhaps the first strong evidence that graph problems could be added to the long list of "killer applications" of random sampling. Don't tune out before the final plot twist --- a simple but useful trick for transforming an algorithm that almost always fails into one that almost always succeeds.
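
Here is one way the contraction algorithm might be sketched in Python. Rather than mutating an adjacency-list multigraph as in the lectures, this version processes a random permutation of the edges with union-find, which is equivalent to repeatedly contracting a uniformly random remaining edge:

```python
import random

def karger_min_cut(edges, trials=100):
    """Random contraction: merge endpoints of random edges until two
    super-vertices remain; the edges still crossing between them form
    a cut. Repeated trials turn 'almost always fails' into 'almost
    always succeeds' --- the final plot twist from the lectures."""
    best = float("inf")
    for _ in range(trials):
        parent = {v: v for e in edges for v in e}

        def find(v):   # union-find with path halving
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        remaining = len(parent)
        pool = list(edges)
        random.shuffle(pool)
        for u, v in pool:
            if remaining == 2:
                break
            ru, rv = find(u), find(v)
            if ru != rv:          # not a self-loop: contract it
                parent[ru] = rv
                remaining -= 1
        # Count edges crossing between the two surviving super-vertices.
        crossing = sum(1 for u, v in pool if find(u) != find(v))
        best = min(best, crossing)
    return best
```
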

Problem Set #3 has five questions about the randomized selection algorithm, cuts in graphs, and the contraction algorithm. As usual, it's due in a little less than two weeks, and if you miss the deadline you can turn it in by the end of the course for half credit. Programming Project #3 asks you to implement the contraction algorithm and use it to compute the min cut of the graph that we provide.
Thu 14 Feb 2013 6:25 PM CET

Week 2 Overview

THE MASTER METHOD (Part IV). These lectures cover a "black-box" method for solving recurrences. You can then immediately determine the running time of most of the divide-and-conquer algorithms that you'll ever see! (Including Karatsuba's integer multiplication algorithm and Strassen's matrix multiplication algorithm from Week 1.) The proof is a nice generalization of the recursion tree method that we used to analyze MergeSort. Ever wonder about the mysterious three cases of the Master Method? Watch these videos and hopefully all will become clear.
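
The three cases boil down to comparing a (the number of subproblems) against b^d (the work shrinkage per level). A tiny Python helper makes the "black-box" nature of the method concrete (assuming standard-form recurrences T(n) = a*T(n/b) + O(n^d)):

```python
import math

def master_method(a, b, d):
    """Classify T(n) = a*T(n/b) + O(n^d) by the three cases:
       a == b^d  ->  O(n^d log n)   (equal work per level)
       a  < b^d  ->  O(n^d)         (root dominates)
       a  > b^d  ->  O(n^log_b(a))  (leaves dominate)"""
    if a == b ** d:
        return f"O(n^{d} log n)"
    if a < b ** d:
        return f"O(n^{d})"
    return f"O(n^{math.log(a, b):.2f})"
```

For example, MergeSort (a=2, b=2, d=1) lands in the first case, while Karatsuba (a=3, b=2, d=1) and Strassen (a=7, b=2, d=2) land in the third.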

QUICKSORT - THE ALGORITHM (Part V). One of the greatest algorithms ever, and our first example of a randomized algorithm. These lectures go over the pseudocode --- the high-level approach, how to partition an array around a pivot element in linear time with minimal extra storage, and the ramifications of different pivot choices --- and explain how the algorithm works.
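
The in-place partitioning subroutine --- linear time, O(1) extra storage --- can be sketched like this in Python (using the leftmost element as the pivot, as the lecture's pseudocode does before discussing pivot choice):

```python
def partition(a, l, r):
    """Partition a[l..r] around the pivot a[l], in place.
    Invariant: a[l+1..i-1] < pivot <= a[i..j-1] as j scans right.
    Returns the pivot's final (correct) position."""
    pivot = a[l]
    i = l + 1
    for j in range(l + 1, r + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]   # grow the "< pivot" region
            i += 1
    a[l], a[i - 1] = a[i - 1], a[l]   # put the pivot in its place
    return i - 1
```
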

QUICKSORT - THE ANALYSIS (Part VI). These lectures are optional, but I strongly encourage you to watch them if you have time. They prove that randomized QuickSort (i.e., with random pivot choices) runs in O(n log n) time on average. The analysis is as elegant as the algorithm itself, and is based on a "decomposition principle" that is often useful in the analysis of randomized algorithms. Note that there are some accompanying lecture notes for this part; the icon for downloading them is right next to the PDF slides for these lectures.

PROBABILITY REVIEW (Part VII). This optional lecture reviews the concepts from discrete probability that are necessary for the QuickSort analysis --- sample spaces, events, random variables, expectation, and linearity of expectation.

Problem Set #2 has five questions that should give you practice with the Master Method and help you understand QuickSort more deeply. It is due in two weeks (February 17th). If you miss the deadline you can still get 50% credit if you turn it in by the end of the course (March 24th). We'll remember your best score from up to two attempts.

Programming Assignment #2 asks you to implement QuickSort and compute the number of comparisons that it makes for three different pivot rules. The due dates are the same as for the problem set; here, 5 attempts are permitted.
Mon 4 Feb 2013 1:52 PM CET

Week 1 Overview

INTRODUCTION: The first set of lectures for this week is meant to give you the flavor of the course, and hopefully get you excited about it. We begin by discussing algorithms in general and why they're so important, and then use the problem of multiplying two integers to illustrate how algorithmic ingenuity can often improve over more straightforward or naive solutions. We discuss the Merge Sort algorithm in detail, for several reasons: it's a practical and famous algorithm that you should all know; it's a good warm-up to get you ready for more intricate algorithms; and it's the canonical introduction to the "divide and conquer" algorithm design paradigm. These lectures conclude by describing several guiding principles for how we'll analyze algorithms in this course.
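
As a taste of what's ahead, here's how the Merge Sort discussed in these lectures might look in Python (a sketch; the lectures present it in language-neutral pseudocode):

```python
def merge_sort(a):
    """Divide and conquer: split the array in half, recursively sort
    each half, then merge the two sorted halves in linear time.
    O(n log n) comparisons overall."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # the merge step
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]         # append the leftovers
```
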

ASYMPTOTIC ANALYSIS: The second set of lectures for this week is an introduction to big-oh notation and its relatives. The goal is to identify a "sweet spot" of granularity for reasoning about algorithms --- we want to suppress second-order details like constant factors and lower-order terms, and focus on how the running time of an algorithm scales as the input size grows large. These lectures are on the dry side, but they don't take very long and this vocabulary is essential for talking about algorithms. Note that one of these lectures is optional (watch it only if you'd like to see a couple more examples).

DIVIDE AND CONQUER ALGORITHMS: The final set of lectures for this week discusses three non-trivial examples of the divide and conquer algorithm design paradigm. The first is for counting the number of inversions in an array. This problem is related to measuring similarity between two ranked lists, which in turn is relevant for making good recommendations to someone based on your knowledge of their and others' preferences ("collaborative filtering"). The second algorithm is Strassen's mind-blowing recursive algorithm for matrix multiplication, which improves over the obvious iterative method. The third algorithm, which is more advanced and is optional material, is for computing the closest pair of points in the plane. Such "proximity problems" are fundamental geometric primitives, with tons of applications.
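
The counting-inversions idea (which also powers this week's programming assignment) can be sketched by piggybacking on merge sort --- "split" inversions are counted for free during the merge:

```python
def sort_and_count(a):
    """Count inversions (pairs i < j with a[i] > a[j]) in O(n log n).
    Returns (sorted copy, inversion count)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, x = sort_and_count(a[:mid])
    right, y = sort_and_count(a[mid:])
    out, i, j, splits = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
            # right[j] is smaller than everything left in `left`,
            # so it forms a split inversion with each of those elements.
            splits += len(left) - i
    return out + left[i:] + right[j:], x + y + splits
```
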

SUPPORTING MATERIAL FOR VIDEOS: Note that videos can be downloaded and watched offline (recommended for commutes, etc.). Also note that, in addition to the videos themselves, there are multiple versions of the lecture slides (in PPT and PDF formats, with my original handwritten annotations and also typed versions), as well as subtitle files (in English and in many cases other languages as well).

HOMEWORK (DUE FEBRUARY 10): The first problem set is out now, and due in roughly two weeks (February 10th). If you miss the deadline you can still get 50% credit if you turn it in by the end of the course (March 24th). You'll normally receive 2 attempts for each problem set (we'll remember your best score), but to warm up with this first assignment you'll get an extra attempt. The problem set consists of 5 problems, mostly about Merge Sort and asymptotic notation.

The first programming assignment is also out, with the same due date as the problem set. Here, we ask you to implement the counting inversions algorithm (see the third set of lectures), run it on a quite large input, and enter the answer. You can attempt each programming assignment up to 5 times.

DISCUSSION FORUMS: The forums play a crucial role in massive online courses like this one. If you have trouble understanding a lecture or completing an assignment, you should turn to the forums for help. After you've mastered the lectures and assignments for a given week, I hope you'll contribute to the forums and help out your fellow students. While I won't have time to carefully monitor the discussion forums, I'll check in and answer questions whenever I find the time.
Mon 28 Jan 2013 4:09 AM CET

Welcome!

Hello everyone!

Welcome to Algorithms: Design & Analysis (Part I). You can access the course material through the tabs on the left panel:

- Syllabus gives an overview of the course schedule. Click on a given week to read about the highlights of that week.
- Video Lectures leads to the lectures for the current and previous weeks.
- Problem Sets leads to the multiple choice problem sets (one each week).
- Programming Questions leads to the programming assignments (one each week).
- Theory Problems has optional theory problem sets (not graded, just for the intellectual challenge).
- Discussion Forums offer you a place to discuss problems and interact with other students.
- Course Logistics has the specific grading policies (late policies, requirements for a certificate, etc.).

We strongly encourage you to use the discussion forums for any course problems that you encounter. Please also consider helping your fellow students via these forums --- the stronger the student community, the more successful the course!

On the right panel you can find the upcoming deadlines for new programming assignments/problem sets as well as the newly uploaded lectures and any upcoming exams.

We hope you enjoy this online course!
Mon 28 Jan 2013 4:07 AM CET