Let's move on to the second sense in which the generic local search algorithm is under-determined. The main while loop reads: if there is a superior neighboring solution y, then reset the current solution to be y. But there may of course be many legitimate choices for y. For example, in the maximum cut problem, given a cut, there may be many vertices whose switch to the opposite group yields a cut with strictly more crossing edges. So when you have multiple superior neighboring solutions, which one should you choose? Well, this is a tricky question, and the existing theory does not give us a clear-cut answer. Indeed, it seems like the right answer is highly domain dependent. Answering the question is probably going to require some serious experimentation on your part, with the data that you're interested in.

One general point I can make is this: recall that when you're using local search to generate from scratch an approximate solution to an optimization problem, you want to inject randomness into the algorithm to coax it to explore the solution space and return as many different locally optimal solutions as possible over a bunch of independent runs; you remember the best of those locally optimal solutions and return it at the end of the day. So this is a second opportunity, in addition to the starting point, where you can inject randomness into local search: if you have many possible improving values of y, choose one at random. Alternatively, you could try to be clever about which y you should choose. For example, if there are multiple superior neighboring y's, you could go to the best one. In the maximum cut problem, if there are lots of different vertices whose switch to the other group would yield a superior cut, pick the vertex which increases the number of crossing edges by the largest amount. In the traveling salesman problem, amongst all neighboring tours with smaller cost, pick the one with the minimum overall cost. This is a perfectly sensible rule about how to choose y. It is myopic, though, and you could imagine doing much more sophisticated things to decide which y to go to next. And indeed, if this is a problem you really care about, you might want to work hard to figure out which heuristics for choosing y work well in your application.

A third question that you have to answer, in order to have a precisely defined local search algorithm, is: what are the neighborhoods? For many problems, there is significant flexibility in how to define the neighborhoods, and theory again does not give clear-cut answers about how they should be designed. Once again, this seems like a highly domain-dependent issue. If you want to use local search on a problem of interest, it's probably going to be up to you to empirically explore which neighborhood choices seem to lead to the best performance of the local search algorithm. One issue you're likely to have to grapple with is figuring out how big to make your neighborhoods. To illustrate this point, let's return to the maximum cut problem. There we defined the neighbors of a cut to be the other cuts you can reach by taking a single vertex and moving it into the other group. This means each cut has a linear number, O(n), of neighboring cuts.
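To make this concrete, here is a minimal Python sketch of single-flip local search for maximum cut, with a switch between the two neighbor-selection rules just discussed: a random improving vertex versus the best one. The graph representation (an edge list plus a 0/1 group assignment) and the function names are illustrative choices of mine, not something fixed by the lecture.

```python
import random

def gain(edges, side, v):
    # Change in the number of crossing edges if vertex v switches groups:
    # each edge incident to v flips between crossing and non-crossing.
    delta = 0
    for a, b in edges:
        if v == a or v == b:
            delta += -1 if side[a] != side[b] else 1
    return delta

def local_search_max_cut(n, edges, side, rule="random"):
    # side[v] is 0 or 1, describing the current cut.  While some single
    # flip strictly increases the number of crossing edges, perform one.
    while True:
        improving = [v for v in range(n) if gain(edges, side, v) > 0]
        if not improving:
            return side                                   # locally optimal
        if rule == "random":
            v = random.choice(improving)                  # inject randomness
        else:
            v = max(improving, key=lambda u: gain(edges, side, u))  # greedy
        side[v] = 1 - side[v]

# Example: a 4-cycle, starting with every vertex in group 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(local_search_max_cut(4, edges, [0, 0, 0, 0], rule="best"))   # [1, 0, 1, 0]
```

Either rule only ever moves to a strictly better cut; which one explores the solution space more effectively is exactly the kind of thing you would settle experimentally.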
But it's easy to make the neighborhood bigger if we want. For example, we could define the neighbors of a cut to be those cuts you can reach by taking two vertices or fewer and switching them to the opposite groups. Now each cut is going to have a quadratic number of neighbors. More generally, of course, we could allow a single local move to switch k vertices between the two sides, and then the neighborhood size would be O(n^k).

So, what are the pros and cons of enlarging your neighborhood size? Generally speaking, the bigger you make your neighborhoods, the more time you're going to have to invest searching for an improving solution in each step. For example, in the maximum cut problem, if we implement things in a straightforward way and only allow one vertex to be switched in an iteration, then we only have to search through a linear number of options to figure out whether there's an improving solution or not. On the other hand, if we allow two vertices to be switched in a given iteration, we have to search through a quadratic number of possibilities to determine whether or not we're currently locally optimal. So the bigger the neighborhood, generally speaking, the longer it's going to take to check whether you're currently locally optimal or whether there's some better solution you're supposed to move to next.

The good news about enlarging your neighborhood size is that you can only have fewer local optima, and in general, some of the local optima that you're pruning are going to be bad solutions that you're happy to be rid of. If you look back at the example from the previous video that showed the simple local search for maximum cut can be off by 50%, you'll see that if we just enlarge the neighborhoods to allow two vertices to be swapped in the same iteration, then all of a sudden, on that four-vertex example, local search would be guaranteed to produce the globally maximum cut; the bad locally optimal cuts have been pruned away. Now, even if you allow two vertices to be swapped in a given iteration, there are going to be more elaborate examples showing that local search might be off by 50%, but on many instances, allowing this larger neighborhood will give you better performance from local search.

Summarizing, one high-level design decision you should be clear on in your head before you apply the local search heuristic design paradigm to a computational problem is how much you care about solution quality versus how much you care about the computational resources required. If you care about solution quality a lot, and you're willing to either wait or throw a lot of hardware at the problem, that suggests using bigger neighborhoods: they're slower to search, but you're likely to get a better solution at the end of the day. If you really want something quick and dirty, you want it to be fast, and you don't care that much about solution quality, that suggests using simpler, smaller neighborhoods, neighborhoods that are fast to search, knowing that some of the local optima you might get might not be very good. Let me reiterate: these are just guidelines, not gospel. In all computational problems, but especially with local search, the way you proceed has to be guided by the particulars of your application. So make sure you code up a bunch of different approaches, see what works, and go with that.
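To make the searchability side of that tradeoff concrete, here is a rough, self-contained sketch of a brute-force local-optimality check for maximum cut, parameterized by the number k of vertices a single move is allowed to flip. The representation follows the earlier sketch and is my own illustrative choice; the point is just that the check examines roughly O(n^k) candidate moves.

```python
from itertools import combinations

def cut_value(edges, side):
    # Number of edges whose endpoints lie in different groups.
    return sum(1 for u, v in edges if side[u] != side[v])

def locally_optimal(n, edges, side, k=1):
    # Is there a set of at most k vertices whose flip strictly increases
    # the cut?  There are O(n^k) candidate moves, so k = 1 is a linear
    # search and k = 2 a quadratic one, each evaluated in O(|E|) time here.
    current = cut_value(edges, side)
    for size in range(1, k + 1):
        for subset in combinations(range(n), size):
            trial = list(side)
            for v in subset:
                trial[v] = 1 - trial[v]           # flip this subset
            if cut_value(edges, trial) > current:
                return False                      # improving move exists
    return True
```

On the four-vertex example from the previous video, a bad cut that passes this check with k = 1 fails it with k = 2, which is exactly the sense in which the larger neighborhood prunes away bad local optima.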
For our final two questions, let's suppose you've resolved the initial three and you have a fully specified local search algorithm. You've made a decision about exactly what your neighborhoods are, and you've figured out the sweet spot for you between efficient searchability and the solution quality you're likely to get at the end of the day. You've made a decision about exactly how you're going to generate the initial solution. And you've made a decision about, when you have multiple superior neighboring solutions, which one you're going to go to next. Now, let's talk about what kind of performance guarantees you can expect from a local search algorithm.

Let's first just talk about running time, and let's begin with the modest question: is it at least the case that this local search algorithm is guaranteed to converge eventually? In many of the scenarios you are likely to come across, the answer is yes. Here's what I mean. Suppose you're dealing with a computational problem where the set of possible solutions is finite, and moreover, your local search is governed by an objective function, and the way you define a superior neighboring solution is that it's one with a better objective function value. This is exactly what we were doing in the maximum cut problem: there the space was finite, it was just the set of exponentially many graph cuts, and our objective function was the number of crossing edges. Similarly for the traveling salesman problem, the space is finite, it's just the roughly n-factorial possible tours, and again, how do you decide which tour to go to next? You look for one that decreases the objective function value, the total cost of the tour. Whenever you have those two properties, finiteness and strict improvement in the objective function, local search is guaranteed to terminate. You can't cycle, because every time you iterate you get something with a strictly better objective function value, and you can't go on forever, because eventually you'll run out of the finitely many possible things to try.

There is, of course, no reason to be impressed by the finite convergence of local search. After all, brute-force search equally well terminates in finite time, and this class is all about having efficient algorithms that run quickly. So the real question you want to ask is: is local search guaranteed to converge in, say, polynomial time? Here, the answer is generally negative. When we studied the unweighted maximum cut problem, that was the exception that proves the rule: there, we only needed a quadratic number of iterations before finding a locally optimal solution. But, as we mentioned in passing, even if you just pass to the weighted version of the maximum cut problem, local search might already need, in the worst case, an exponential number of iterations before halting with a locally optimal solution. In practice, however, the situation is rather different, with local search heuristics often finding locally optimal solutions quite quickly. We do not at present have a very satisfactory theory that explains, or predicts, the running time of local search algorithms; if you want to read more about it, you might search for the keyword smoothed analysis.

Another nice feature of local search algorithms is that even if you're in the unlucky case where your algorithm is going to take a long time before it finds a locally optimal solution, you can always just stop it early. So when you start the search, you can just say: look, if after 24 hours you haven't found me a locally optimal solution, just tell me the best solution you've found thus far.
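A minimal sketch of that early-stopping idea, assuming a generic interface of my own invention: improving_neighbor stands in for whatever problem-specific routine returns a strictly better neighbor, or None at a local optimum.

```python
import time

def budgeted_local_search(start, improving_neighbor, budget_seconds):
    # Strict improvement over a finite solution space rules out cycling,
    # so this loop always terminates on its own; the time budget just caps
    # how long we're willing to wait, and either way we hand back the best
    # (i.e. most recent) solution found so far.
    deadline = time.monotonic() + budget_seconds
    current = start
    while True:
        nxt = improving_neighbor(current)
        if nxt is None:
            return current, True           # halted at a local optimum
        current = nxt                      # strictly better solution
        if time.monotonic() >= deadline:
            return current, False          # out of time; best found so far
```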
So in addition to running time, we want to measure the performance of a local search algorithm in terms of its solution quality. Is that going to be any good? Here the answer is, in general, definitely no, and again, the maximum cut problem was the exception that proves the rule: it's a very unusual problem in that you can prove at least some kind of performance guarantee about its locally optimal solutions. For most of the optimization problems to which you might be tempted to apply the local search design paradigm, there will be locally optimal solutions quite far from globally optimal ones. Moreover, this is not just a theoretical pathology; local search algorithms in practice will sometimes generate extremely lousy locally optimal solutions.

So this ties back into a point we made earlier: if you're using local search not just as a postprocessing step to improve a given solution, but actually to generate from scratch a hopefully near-optimal solution to an optimization problem, you don't want to run it just once, because you don't know what you're going to get. Rather, you want to run it many times, making random decisions along the way, either starting from a random starting point or choosing random improving solutions to move to, so that you explore the space of all solutions as best you can. Over many executions of local search, you want to get your hands on as many different locally optimal solutions as you can, and then you can remember the best one. Hopefully the best one, at least, will be pretty close to a globally optimal solution.
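That run-it-many-times advice can be packaged as a thin wrapper like the sketch below; random_solution, local_search, and objective are hypothetical placeholders for your problem-specific routines, and the objective is assumed to be something you're maximizing.

```python
def repeated_local_search(runs, random_solution, local_search, objective):
    # Run local search from many random starting points (with any random
    # tie-breaking happening inside local_search itself) and remember the
    # best locally optimal solution seen across the independent runs.
    best, best_value = None, float("-inf")
    for _ in range(runs):
        candidate = local_search(random_solution())
        value = objective(candidate)
        if value > best_value:
            best, best_value = candidate, value
    return best
```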