In this video I'll talk about various
aspects of the course, the topics that
we'll cover, the kinds of skills you can
expect to acquire, the kind of background
that I expect, the supporting materials
and the available tools for self
assessment. Let's start with the specific
topics that this course is going to cover.
The course material corresponds to the
first half of the ten week Stanford
course. It's taken by all computer science
undergraduates, as well as many of our
graduate students. There will be five high
level topics, and at times these will
overlap. The five topics are first of all,
the vocabulary for reasoning about
algorithm performance, the design and
conquer algorithm design paradigm,
randomization and algorithm design,
primitives for reasoning about graphs, and
the use and implementation of basic data
structures. The goal is to provide an
introduction to and basic literacy in each
of these topics. Much, much more could be
said about each of them, than we'll have
time for here. The first topic is the
shortest, and probably also the driest.
But it's a prerequisite for thinking
seriously about the design and analysis of
algorithms. The key concept here is big-O
notation, which, conceptually, is a
modeling choice about the granularity with
which we measure a performance metric like
the running time of an algorithm. It turns
out that the sweet spot for clear high
level thinking about algorithm design, is
to ignore constant factors and lower-order
terms. And to concentrate on how well
algorithm performance scales with large
input sizes. Big O notation is the way to
mathematize this sweet spot. Now, there's
no one silver bullet in algorithm design.
No single problem solving method that's
guaranteed to unlock all of the
computational problems that you're likely
to face. That said, there are a few
general algorithm design techniques. High
level approaches to algorithm design that
find successful application across a range
of different domains. These relatively
widely applicable techniques are the
backbone of a general algorithms course
like this one. In this course, we'll only
have time to deeply explore one such
algorithm design paradigm, namely that of
the divide and conquer algorithms. In the
sequel course as we'll discuss, there's
two other major algorithms on paradigms to
get covered. But for now, divide and
conquer algorithm, the idea is to first
break the problem into smaller problems
which then gets solved recursively, and
then to somehow quickly combine the
solutions to the sub problems into one for
the original problem that you actually
care about. So for example, in the last
video. We saw two algorithms of this sort,
two divide and conquer algorithms from
multiplying two large integers. In later
videos we will see a number of different
applications. We'll see how to design fast
divide and conquer algorithms for problems
ranging from sorting to matrix
multiplication to nearest neighbor-type
problems and computation of geometry. In
addition, we'll cover some powerful
methods for reasoning about the running
time of recursive algorithms like these.
As for the third topic. A randomized
algorithm is one that, in some sense,
flips coins while it executes. That is, a
randomized algorithm will actually have
different executions if you run it over
and over again on a fixed input. It turns
out, and this is definitely not intuitive,
that allowing randomization internal to an
algorithm, often leads to simple, elegant,
and practical solution to various
computational problems. The canonical
example is randomized quick sort, and that
algorithm and analysis we will cover in
detail in a few lectures. Randomized
primality testing is another killer
application that we'll touch on. And we'll
also discuss a randomized approach to
graph partitioning. And finally
we'll discuss how randomization is used to
reason about hash functions and hash maps.
One of the themes of this course, and one
of the concrete skills that I hope you
take away from the course, is, literacy
with a number of computational primitives
for operating on data, that are so fast,
that they're, in some sense, essentially
free. That is, the amount of time it take
to invoke one of these computational
primitives is barely more than the amount
of time you're already spending just
examining or reading the input. When you
have a primitive which is so fast, that
the running time is barely more than what
it takes to read the input, you should be
ready to apply it. For example, in a
preprocessing step, whenever it seems like
it might be helpful. It should just be
there on the shelf waiting to be applied
at will. Sorting is one canonical
example of a very fast, almost for-free
primitive of this form. But there are ones
that operate on more complex data as well.
So recall that a graph is a data structure
that has, on the one hand, vertices, and
on the other hand, edges. Which connects
pair of vertices. Graphs model, among any
other things, different types of networks.
So even though graphs are much more
complicated than mere arrays, there's
still a number of blazingly fast primitives
for reasoning about their structure. In
this class we'll focus on primitives for
competing connectivity information and also shortest paths. We'll also touch on how some primitives have been
used to investigate the structure of
information in social networks. Finally,
data structures are often a crucial
ingredient in the design of fast
algorithms. A data structure's responsible
for organizing data in a way that supports
fast queries. Different data structures
support different types of queries. I'll
assume that you're familiar with the
structures that you typically encounter in
a basic programming class including arrays
and vectors. Lists, stacks, and queues.
Hopefully, you've seen at some point both
trees and heaps, or you're willing to read
a bit about them outside of the course,
but we'll also include a brief review of
each of those data structures as we go
along. There's two extremely useful data
structures that we'll discuss in detail.
The first is balanced binary search trees.
These data structures dynamically maintain
an ordering on a set of elements, while
supporting a large number of queries that
run in time logarithmic in the size of
the set. The second data structure we'll
talk a fair bit about is hash tables or
hash maps, which keep track of a dynamic
set, while supporting extremely fast
insert and lookup queries. We'll talk
about some canonical uses of such data
structures, as well as what's going on
under the hood in a typical
implementation of such a data 
structure. >> There's a number of
important concepts in the design and
analysis of algorithms that we won't have
time to cover in this five week course.
Some of these will be covered in the
sequel course, Design and Analysis of
Algorithms II, which corresponds to the
second half of Stanford's ten week course
on this topic. The first part of this
sequel course focuses on two more
algorithm design paradigms. First of
all, the design analysis of greedy
algorithms with applications to minimum
spanning trees, scheduling, and
information theoretic coding. And
secondly, the design analysis of dynamic
programming algorithms with example
applications being in genome sequence
alignment and the shortest path protocols
in communication networks. The second part
of the sequel course concerns NP complete
problems, and what to do about them. Now,
NP complete problems are problems that,
assuming a famous mathematical conjecture
you might have heard of, which is called the
"P not equal to NP" conjecture, are problems
that cannot be solved under this
conjecture by any computationally
efficient algorithm. We'll discuss the
theory of NP completeness, and, with a
focus on what it means for you as an
algorithm designer. We'll also talk about
several ways to approach NP complete
problems, including: fast algorithms that
correctly solve special cases; fast
heuristics with provable performance
guarantees; and exponential time
algorithms that are qualitatively faster
than brute force search. Of course there
are plenty of important topics that can't
be fit into either of these two five-week
courses. Depending on the demand, there
might well be further courses on more
advanced topics. Following this course is
going to involve a fair amount of time and
effort on your part. So it's only
reasonable to ask: What can you hope to get
out of it? What skills will you learn?
Well. Primarily, you know, even though
this isn't a programming class per se, it
should make you a better programmer.
You'll get lots of practice describing and
reasoning about algorithms, you'll learn
algorithm design paradigms, so really high
level problem-solving strategies that are
relevant for many different problems
across different domains, and tools for
predicting the performance of such
algorithms. You'll learn several extremely
fast subroutines for processing data and
several useful data structures for
organizing data that can be deployed
directly in your own programs. Second,
while this is not a math class per se,
we'll wind up doing a fair amount of
mathematical analysis. And this in turn
will sharpen your mathematical analytical
skills. You might ask, why is mathematics
relevant for a class in the design and
analysis of algorithms, seemingly more of
a programming class. Well let me be clear.
I am totally uninterested in merely
telling you facts or regurgitating code
that you can already find on the web or in
any number of good programming books. My
goal here in this class, and the way I
think I can best supplement the resources
that you probably already have access
to is to explain why things are
the way they are. Why we analyze the
algorithms in the way that we do, why
various super fast algorithms are in fact
super fast, and so on. And it turns out
that good algorithmic ideas usually
require nontrivial mathematical analysis
to understand properly. You'll acquire
fundamental insights into the specific
algorithms and data structures that we
discuss in the course. And hopefully, many
of these insights will prove useful, more
generally, in your other work. Third, and
perhaps the most relevant for those of you
who work in some other discipline: this
course should help you learn how to think
algorithmically. Indeed after studying
algorithms it's hard enough not to see
them pretty much everywhere, whether you
are riding an elevator, watching a flock
of birds, buying and selling stocks out of
your portfolio, even watching an infant
learn. As I said in the previous video
algorithm thinking is becoming
increasingly useful and prevalent if you
are outside of computer science and
technology like in biology, statistics and
economics. Fourth, if you're interested in
feeling like a card carrying computer
scientist, in some sense, then you'll
definitely want basic literacy in all of
the topics that we'll be covering. Indeed,
one of the things that makes studying
algorithms so fun, is, it really feels
like you're studying a lot of the greatest
hits from the last 50 years of computer
science. So, after this class, no longer
will you feel excluded at that computer
science cocktail party when someone cracks
a joke about Dijkstra's Algorithm. Now
you'll know exactly what they mean.
Finally, there's no question that studying
this material is helpful for technical
interview questions. To be clear, my sole
goal here is to teach you algorithms, not
to prepare you for interviews, per se. But
over the years, countless students of mine
have regaled me with stories about how
mastering the concepts in this class
enabled them to ace every technical
question they were ever asked. I told you,
this is fundamental stuff. So, what do I
expect from you? Well, honestly, the
answer is nothing. After all isn't the
whole point of a free online class like
this one that anyone can take it and
devote as much effort to it as they like.
So that said, as a teacher it's still
useful to have one or more canonical
students in mind. And I thought I'd go
ahead and be transparent with you about
how I'm thinking about these lectures. Who
I have in mind that I'm teaching to. So
again, please don't feel discouraged if
you don't conform to this canonical
student template. I'm happy to have the
opportunity to teach you about algorithms
no matter who you are. So first, I have in
mind someone who knows at least some
programming. For example, consider the
previous lecture. We talked about a
recursive approach to multiplying two
numbers and I mentioned how in certain
mathematical expression, back then we
labeled it star and circled it in green.
How that expression naturally translated
into a recursive algorithm. In particular,
I was certainly assuming that you had some
familiarity with recursive programs.
If you feel comfortable with my statement
in that lecture, if you feel like you
could code up a recursive integer
multiplication algorithm based on the high
level outline that I gave you, then you
should be in good shape for this course.
You should be good to go. If you weren't
comfortable with that statement, well, you
might not be comfortable with the
relatively high conceptual level at which
we discuss program in this course. But I
encourage to watch the next several videos
anyway, to see if you get enough out of
them to make it worth your while. [sound].
Now, while I'm aiming these lectures at
people who know some programming, I'm not
making any assumptions whatsoever about
exactly which programming languages you know. Any
standard imperative language you know,
something like C, Java or Python, is
totally fine for this course. Now, to make
these lectures accessible to as many
programmers as possible, and to be honest,
you know, also to promote thinking about
programming at a relatively abstract
conceptual level, I won't be describing
algorithms in any particular programming
language. Rather, when I discuss the
algorithms, I'll use only high-level
pseudo-code, or often simply English. My
inductive hypothesis is that you are
capable of translating such a high level
description into a working program in your
favorite programming language. In fact, I
strongly encourage everyone watching these
lectures to do such a translation of all
of the algorithms that we discussed. This
will ensure your comprehension, and
appreciation of them. Indeed, many
professional computer scientists and
programmers don't feel that they really
understand an algorithm until they've
coded it up. Many of the course's
assignments will have a problem in which
we ask you to do precisely this. Put
another way, if you're looking for a sort
of coding cookbook, code that you can copy
and paste directly into your own programs.
Without necessarily understanding how it
works, then this is definitely not the
course for you. There are several books
out there that cater to programmers
looking for such coding cook books.
Second, for these lectures I have in mind
someone who has at least a modest amount
of mathematical experience though perhaps
with a fair bit of accumulated rust.
Concretely I expect you to be able to
recognize a logical argument that is a
proof. In addition, two methods of proof
that I hope you've seen before are proofs
by induction and proofs by contradiction.
I also need you to be familiar with basic
mathematical notation, like the standard
quantifier and summation symbols. A few
of the lectures on randomized algorithms
and hashing will go down much easier for
you if you've seen discrete probability
at some point in your life. But beyond
these basics, the lectures will be self
contained. You don't even need to know any
calculus, save for a single simple
integral that magically pops up in the 
analys of the randomized quick sort
algorithm. I imagine that many of you have
studied math in the past, but you could
use a refresher, you're a bit rusty. And
there's plenty of free resources out there
on the web, and I encourage you to explore
and find some that you like. But one that
I want to particularly recommend is a
great set of free lecture notes. It's
called Mathematics for Computer Science.
It's authored by Eric Lehman and Tom
Layden, and it's quite easy to find on the
web if you just do a web search. And those
notes cover all of the prerequisites that
we'll need, in addition to tons of other
stuff. In the spirit of keeping this
course as widely accessible as possible,
we're keeping the required supporting
materials to an absolute minimum. Lectures
are meant to be self-contained and we'll
always provide you with the lecture notes
in PowerPoint and PDF format. Once in a
while, we'll also provide some additional
lecture notes. No textbook is required for
this class. But that said, most of the
material that we'll study is well covered
in a number of excellent algorithms books
that are out there. So I'll single out
four such books here. The first three I
mention because they all had a significant
influence on the way that I both think
about and teach algorithms. So it's
natural to acknowledge that debt here. One
very cool thing about the second book, the
one by Dasgupta, Papadimitriou and Vazirani, is that the authors
have made a version of it available online
for free. And again, if you search on the
authors' names and the textbook title, you
should have no trouble coming up with it
with a web search. Similarly, that's the
reason I've listed the fourth book because
those authors have likewise made
essentially a complete version of that
book available online and it's a good
match for the material that we're going to
cover here. If you're looking for more
details about something covered in this
class, or simply a different explanation
than the one that I give you, all of these
books are gonna be good resources for you.
There are also a number of excellent
algorithm textbooks that I haven't
put on this list. I encourage to explore
and find you own favorite. >> In our
assignments, we'll sometimes ask you to
code up an algorithm and use it to solve a
concrete problem that is too large to
solve by hand. Now, we don't care what
program and language and development
environment you use to do this as we're
only going to be asking you for the final
answer. Thus, we're not requiring anything
specific, just that you are able to write
and execute programs. If you need help or
advice about how to get set up with a
suitable coding environment, we suggest
that you ask other students for help via
the course discussion forum. Finally,
let's talk a bit more about assessment.
Now this course doesn't have official
grades per se, but we will be assigning
weekly homeworks. Now we're going to
assign homeworks for three different
reasons. The first is just for
self-assessment. It's to give you the
opportunity to test your understanding of
the material so that you can figure out
which topics you've mastered and which
ones that you haven't. The second reason
we do it is to impose some structure on
the course, including deadlines, to
provide you with some additional
motivation to work through all the topics.
Deadlines also have a very important side
effect that synchronizes a lot of the
students in the class. And this of course
makes the course discussion forum a far
more effective tool for students to seek
and provide help in understanding the
course material. The final reason that we
give homeworks is to satisfy those of you
who, on top of learning the course
material, are looking to challenge
yourself intellectually. [sound]. Now,
this class has tens of thousands of
students. So it's obviously essential that
the assignments can be graded
automatically. Now, we're currently only
in the 1.0 generation of free online
courses such as this one. So the available
tools for auto graded assessment are
currently rather primitive. So, we'll do
the best we can, but I have to be honest
with you. It's difficult, or maybe even
impossible to test deep understanding of
the design and analysis of algorithms,
using the current set of tools. Thus,
while the lecture content in this online
course is in no way watered down from the
original Stanford version. The required
assignments and exams we'll give you, are
not as demanding as those that are given
in the on campus version of the course. To
make up for this fact, we'll occasionally
propose optional algorithm design
problems, either in a video or via
supplementary assignment. We don't have
the ability to grade these, but we hope
that you'll find them interesting and
challenging, and that you'll discuss
possible solutions with other students via
the course discussion forum. So I hope
this discussion answered most of the
questions you have about the course. Lets
move on to the real reason that we're all
here, to learn more about algorithms.