This video is a segue between bad news and good news. The bad news, which we have now discussed, is NP-completeness: the fact that there are computationally intractable problems out there in the world. In fact, they are fairly ubiquitous, and you're likely to encounter them in your own projects. The good news is that NP-completeness is hardly a death sentence. Indeed, our algorithmic toolbox is now rich enough to provide many different strategies for coping with NP-complete problems. So suppose you have identified a computational problem on which the success of your new startup company rests. Maybe you've spent the last several weeks throwing the kitchen sink at it: all the algorithm design paradigms you know, all the data structures, all the primitives, and nothing works. Finally, you decide to try to prove the problem is NP-complete, and you succeed. Now you have an explanation for why your weeks of effort have come to naught, but that doesn't change the fact that this is the problem that governs the success of your project. What should you do? Well, the good news is that NP-completeness is certainly not a death sentence. There are people solving, or at least approximately solving, NP-complete problems all the time. However, knowing that your problem is NP-complete does tell you where to set your expectations. You should not expect some general-purpose, super-fast algorithm, like we have for other computational problems such as sorting or single-source shortest paths. Unless you are dealing with unusually small or well-structured inputs, you're going to have to work pretty hard to solve this problem, and possibly make some compromises as well. The rest of this course is devoted to strategies for solving or approximately solving NP-complete problems. In the rest of this video, I'll give you an orientation for what those strategies are and what you can expect to come.
As usual, I'm going to focus here on general-purpose strategies that cut across multiple application domains, and as usual, these general principles should just be a starting point. You should take them and run with them, augmenting them with whatever domain expertise you have in the specific problem that you need to solve. The first strategy is to focus on computationally tractable special cases of an NP-complete problem. Relatedly, you want to think about what's special about your domain, or about the data sets that you're working with, and try to understand whether there's special structure that can be exploited in your algorithm. Let me point out that we've already done this in a couple of cases in this course. The first example we saw concerns the weighted independent set problem. We studied this problem on path graphs, but the computational problem makes perfect sense in general graphs. In the general problem, I give you as input an undirected graph, every vertex has a weight, and I want the maximum-weight subset of vertices that is an independent set. Remember, in an independent set you are forbidden to take any two vertices that are neighbors; none of the pairs of vertices that you've picked are joined by an edge. In general graphs, the weighted independent set problem is NP-complete, so we certainly don't expect it to have a polynomial-time algorithm. But in the special case where the graph is a path, as we saw, there's a linear-time dynamic programming algorithm that exactly solves the weighted independent set problem. So path graphs form a special case of the weighted independent set problem that's computationally tractable, solvable in polynomial, even linear, time. In fact, the frontier of tractability can be pushed well beyond path graphs.
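As a refresher, the linear-time dynamic program for path graphs can be sketched as follows. This is a minimal Python sketch; the function name and the convention that weights[i] is the weight of the i-th vertex along the path are my own choices, not necessarily the lecture's notation.

```python
def max_weight_independent_set(weights):
    """Max-weight independent set on a path graph, in linear time.

    weights[i] is the weight of the i-th vertex along the path.
    a[i] holds the optimum over the first i vertices: either skip
    vertex i, or take it and skip vertex i-1.
    """
    n = len(weights)
    a = [0] * (n + 1)
    if n >= 1:
        a[1] = weights[0]
    for i in range(2, n + 1):
        # exclude vertex i, or include it and exclude its neighbor i-1
        a[i] = max(a[i - 1], a[i - 2] + weights[i - 1])
    return a[n]
```

For example, on the path with vertex weights 1, 4, 5, 4, the optimum takes the second and fourth vertices for a total weight of 8.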
On the homework, I asked you to think through the case of graphs that are trees, and to notice that you can still do dynamic programming efficiently to compute maximum-weight independent sets in trees. You can even get computationally efficient algorithms for a broader class of graphs known as bounded-treewidth graphs; the definition is a little outside the scope of this course, but you can go well beyond trees. The second example concerns our dynamic programming algorithm for the knapsack problem. We discussed its running time and explained why it's exponential: the running time is O(nW), where n is the number of items and capital W is the knapsack capacity, and because it only takes log W bits to specify the capacity W, we don't call that a polynomial-time algorithm. But imagine you only have to solve knapsack instances where the capacity is not too big, say W = O(n), and you definitely will see knapsack instances in practice that possess that property. Well, then our dynamic programming algorithm runs in time O(n^2), and that's a bona fide polynomial-time algorithm for this special case of a small knapsack capacity. Next, let me mention a couple of examples we're going to see in forthcoming videos. The first concerns the 2-SAT problem. 2-SAT is a type of constraint satisfaction problem. Recall that in a constraint satisfaction problem you have a bunch of variables, and each one gets assigned a value. The simplest case is the Boolean case, where each variable can be 0 or 1, false or true, and then you have a bunch of clauses, which specify the permitted joint values of a collection of variables. The 2 in 2-SAT refers to the fact that each constraint involves the joint values of only a pair of variables.
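As an aside, the knapsack dynamic program just mentioned is short enough to sketch here; when W = O(n), its O(nW) running time is genuinely polynomial. This Python sketch uses my own conventions, including a space-saving one-dimensional array rather than the two-dimensional table from lecture.

```python
def knapsack(values, sizes, W):
    """O(n*W) dynamic programming for the knapsack problem.

    A[x] = best achievable value with capacity x, over the items
    processed so far. Iterating x downward ensures each item is
    used at most once.
    """
    A = [0] * (W + 1)
    for v, s in zip(values, sizes):
        for x in range(W, s - 1, -1):
            A[x] = max(A[x], A[x - s] + v)
    return A[W]
```

With values (3, 2, 4, 4), sizes (4, 3, 2, 3), and capacity 6, the optimum packs the last two items for a total value of 8.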
A canonical constraint in a 2-SAT instance specifies, for two variables, three joint assignments that are allowed and one that's forbidden. So, for example, maybe it will say, for variables x3 and x7: it's okay to set them both to true, it's okay to set them both to false, it's okay to set x3 to true and x7 to false, but it's forbidden to set x3 to false and x7 to true. 3-SAT is the same thing, except each constraint involves the joint values of a triple of variables, and it forbids one of the eight possibilities. Now, 3-SAT is the canonical NP-complete problem; it was singled out by Cook and Levin as being sufficiently expressive to encode all problems in NP. But if each constraint has size only two, then, as we'll see, the problem becomes polynomial-time solvable. There are a couple of ways of proving that; we're going to talk about a local search algorithm that checks whether or not there is indeed an assignment to the variables that simultaneously satisfies all of the given constraints. The final example we'll discuss in more detail later, but just very briefly: the vertex cover problem. This is a graph problem, and a vertex cover is just the complement of an independent set. While an independent set cannot take two vertices from the same edge, in the vertex cover problem you have to take at least one vertex from every single edge, and what you want is the vertex cover that minimizes the sum of the vertex weights. Yet again, this is an NP-complete problem in general, but we're going to focus on the special case where the optimal solution is small.
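To preview the local search idea for 2-SAT, here is a sketch in the spirit of Papadimitriou's randomized algorithm: random restarts, and within each restart, repeatedly flipping a random variable of a random unsatisfied clause. The encoding of constraints as two-literal clauses (a negative integer meaning a negated variable) and the generous restart count are my own choices; the textbook analysis needs only about log2(n) restarts of 2n^2 flips each.

```python
import random

def two_sat_local_search(n, clauses, seed=0):
    """Randomized local search for 2-SAT, a la Papadimitriou.

    n: number of variables, named 1..n.
    clauses: list of (lit1, lit2); literal k means x_k is true,
             -k means x_k is false. A clause is satisfied if at
             least one of its two literals is.
    Returns a satisfying assignment as a dict, or None.
    """
    rng = random.Random(seed)

    def sat(lit, a):
        return a[abs(lit)] == (lit > 0)

    for _ in range(max(20, n.bit_length())):  # generous restart count
        a = {v: rng.random() < 0.5 for v in range(1, n + 1)}
        for _ in range(2 * n * n):            # at most 2n^2 local moves
            unsat = [c for c in clauses if not (sat(c[0], a) or sat(c[1], a))]
            if not unsat:
                return a                      # every clause satisfied
            lit = rng.choice(rng.choice(unsat))  # random literal of a random bad clause
            a[abs(lit)] = not a[abs(lit)]        # flip that variable
    return None  # no assignment found; likely unsatisfiable
```

If the instance is satisfiable, each restart finds a satisfying assignment with probability at least one half, so the failure probability shrinks geometrically with the number of restarts.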
That is, for the vertex cover problem we're going to focus on graphs where there's a small set of vertices such that every single edge has at least one endpoint in that small set, and we'll see that for that special case, using a smart kind of exhaustive search, we'll actually be able to solve the problem in polynomial time. Let me reiterate that these tractable special cases are meant primarily to be building blocks upon which you can draw in a possibly more sophisticated approach to your NP-complete problem. Just to make this a little more concrete, let me dream up one scenario to show you what I'm thinking about. Imagine, for example, that you have a project where, unfortunately, it's not really 2-SAT that you're confronting; it's actually a 3-SAT instance. So you're feeling kind of bummed: 3-SAT is NP-complete, and maybe you have 1,000 variables, and certainly you can't do brute-force search over the 2 to the 1,000 possible ways of assigning values to your 1,000 variables. But maybe the good news is that, because you have domain expertise, because you understand this problem instance, you know that, yeah, there are 1,000 variables, but there are really 20 that are crucial. You have a feeling that all of the action, basically, boils down to how these 20 core variables get assigned. Well, now maybe you can mix together some brute-force search with some of these tractable special cases. For example, you can enumerate over all 2 to the 20 ways of assigning values to this core set of 20 variables; 2 to the 20 is roughly a million, and that's not so bad. Now, what you're going to do is, for each of these million scenarios, check whether there's a way to extend that tentative assignment of values to the 20 variables to the other 980 variables so that all of the constraints get satisfied. The original problem is solvable if and only if there exists a way of assigning values to these 20 variables that successfully extends to the other 980.
Now, because these are the crucial variables, where all the action is, maybe as soon as you assign all of them 0s and 1s, the residual SAT instance is tractable. For example, maybe it just becomes a simple 2-SAT instance, and then you can solve it in polynomial time. This gives you a hybrid approach: brute-force search at the top level, tractable special cases for each guess of an assignment to the 20 variables, and you're off to the races. I hope it's clear that I mean this as just one possible way you might combine the various building blocks we're developing into a more elaborate approach to tackling an NP-complete problem. And that's generally what these problems take, a fairly elaborate approach, because, after all, they are NP-complete; you've got to respect that. With that digression complete, let me mention the other two strategies we're going to be exploring in the lectures to come. The second strategy, which is certainly very common in practice, is to resort to heuristics, that is, to algorithms which are not guaranteed to be correct. We haven't really seen examples of heuristics in the course thus far. For those of you who are alumni of Part 1, perhaps we could classify Karger's randomized minimum cut algorithm as a heuristic, because it did have a small probability of failing to find the minimum cut. Rather, I'm going to focus on some examples in the upcoming lectures. I'm going to use the knapsack problem as a case study, and what we'll see is that our toolbox, which contains various algorithm design paradigms, is useful not just for designing correct algorithms but also for designing heuristics. In particular, we'll get a pretty good heuristic for the knapsack problem using the greedy algorithm design paradigm, and we'll get an excellent heuristic for knapsack using the dynamic programming algorithm design paradigm.
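To give a taste of the greedy knapsack heuristic: one natural variant packs items in order of decreasing value per unit of size, then returns the better of that packing and the single most valuable item that fits, which guarantees at least half the optimal value. This is a Python sketch under my own assumptions; the upcoming lectures may present the details differently.

```python
def greedy_knapsack(values, sizes, W):
    """Greedy 1/2-approximation heuristic for knapsack.

    Pack items by decreasing value/size ratio; taking the better of
    the greedy packing and the best single item that fits yields at
    least 50% of the optimal value.
    """
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / sizes[i], reverse=True)
    total_value = total_size = 0
    for i in order:
        if total_size + sizes[i] <= W:
            total_size += sizes[i]
            total_value += values[i]
    best_single = max((values[i] for i in range(len(values)) if sizes[i] <= W),
                      default=0)
    return max(total_value, best_single)
```

With values (6, 10), sizes (1, 2), and capacity 2, pure greedy takes only the first item (value 6), but the best-single-item fallback recovers the optimum of 10; without that fallback, greedy alone has no constant-factor guarantee.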
The final strategy is for situations where you are unwilling to relax correctness, unwilling to consider heuristics. Now, of course, for an NP-complete problem, if you're always going to be correct, you don't expect to run in polynomial time, but there are still opportunities to have algorithms that, while exponential time in the worst case, are smarter than naive brute-force search. We have in fact already seen one example that can be interpreted as an implementation of this strategy: the knapsack problem. In the knapsack problem, naive brute-force search would just run over all possible subsets of the items, check whether each subset fits in the knapsack, remember its value if it does, and return the feasible solution with maximum value. That takes time proportional to 2 to the n, where n is the number of items. Our dynamic programming algorithm has running time n times W. Now, of course, this is no better than 2 to the n if the knapsack capacity is huge, if W is itself on the order of 2 to the n. But, as we argued, if W is smaller, this algorithm is going to be faster; and as you learned on the third programming assignment, sometimes even when W is big, dynamic programming is going to beat the crap out of brute-force search. I'll show you a couple of other examples. We'll talk about the traveling salesman problem, where naive brute-force search would take roughly n factorial time, with n the number of vertices. We'll give an alternative dynamic-programming-based solution which runs in time only about 2 to the n, which is much better than n factorial. The third example, which we'll cover in a forthcoming video, we already alluded to briefly on the last slide: the vertex cover problem. This is where you're given a graph, every vertex has a weight, and you want the minimum-weight subset of vertices that includes at least one endpoint of every edge.
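The dynamic-programming solution for the traveling salesman problem is the Held-Karp recurrence: for every vertex subset S containing the starting vertex and every destination j in S, compute the cheapest path from the start to j that visits exactly the vertices of S. Here is a Python sketch (names are mine); the running time is on the order of n^2 times 2 to the n, versus n factorial for enumerating tours.

```python
from itertools import combinations

def held_karp_tsp(dist):
    """Minimum-cost tour starting and ending at vertex 0 (Held-Karp).

    dist: n x n matrix of edge costs, n >= 2.
    best[(S, j)] = cheapest path from 0 to j visiting exactly the
    vertex set S (with 0 and j in S).
    """
    n = len(dist)
    best = {(frozenset([0, j]), j): dist[0][j] for j in range(1, n)}
    for size in range(3, n + 1):
        for subset in combinations(range(1, n), size - 1):
            S = frozenset(subset) | {0}
            for j in subset:
                # the path must arrive at j from some other k in S
                best[(S, j)] = min(best[(S - {j}, k)] + dist[k][j]
                                   for k in subset if k != j)
    full = frozenset(range(n))
    # close the tour by returning from the last vertex to 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))
```

The subset iteration order guarantees that every smaller set S - {j} has already been filled in when it is needed.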
We're going to consider the version of the vertex cover problem where you want to check whether or not it's possible to have a vertex cover that uses only k vertices, that is, whether or not there exist k vertices that include at least one endpoint from each edge. Naive brute-force search would run in time n to the k, which gets absurd even when k is quite small, but we're going to show that there's a smarter algorithm which, while still exponential in k, runs in time only 2 to the k times the size of the graph.
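The smarter search branches on an arbitrary edge (u, v): any vertex cover must contain u or v, so try both choices, each time deleting the newly covered edges and decrementing the budget k. A minimal Python sketch of this 2^k-style algorithm for the unweighted decision version (names are mine; the lecture's version may differ in details):

```python
def has_vertex_cover(edges, k):
    """Is there a set of at most k vertices touching every edge?

    Branch on the two endpoints of one uncovered edge; the recursion
    depth is at most k, giving roughly 2^k * m work for m edges,
    versus the ~n^k cost of checking all size-k vertex subsets.
    """
    if not edges:
        return True   # every edge is covered
    if k == 0:
        return False  # edges remain, but no budget left
    u, v = edges[0]
    # Option 1: put u in the cover; remove all edges incident to u.
    if has_vertex_cover([e for e in edges if u not in e], k - 1):
        return True
    # Option 2: put v in the cover instead.
    return has_vertex_cover([e for e in edges if v not in e], k - 1)
```

For example, a star with three leaves has a cover of size 1 (the center), while a triangle needs two vertices.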