Having laid the groundwork in the previous video, we're now ready to follow our usual dynamic programming recipe to develop a better-than-brute-force-search dynamic programming algorithm for the traveling salesman problem. Working through all those quizzes in the last video, we developed some healthy respect for the traveling salesman problem. We understood why the more natural approaches, which were sufficient for shortest-path problems (for example, the Bellman-Ford algorithm), are not good enough to solve TSP. No surprise: TSP is an NP-complete problem, so you were sort of ready for this, but we now understand it in a pretty concrete way. In the single-source shortest path problem, in a sub-problem, all you need to know is where the path ends and how long the path can be. That's not enough for the traveling salesman problem, because there's an extra tricky constraint: at the end of the day, we want to visit every vertex exactly once. So, in particular, to make sure we don't have repeat vertices show up in our sub-problem solutions, we're going to need to remember not just where smaller sub-problems end, but also the identities of all of the intermediate vertices on the path. Here, then, is the final definition of the sub-problems we're going to use. We still want to keep track of where the path ends, so we still have our destination j; but rather than just specifying a number of vertices visited on this path, we specify exactly which vertices get visited. That's going to be a subset S of {1, ..., n}, and of course it had better contain the initial vertex 1 and the final vertex j. For a given choice of j and a given choice of the set S, we define L(S, j) as the length of a shortest path that starts at vertex 1, ends at vertex j, and visits precisely the vertices in S, exactly once each. Now, I can understand if your eyes widen a little bit when you see this definition.
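To make the sub-problem index (S, j) concrete in code, one common trick (my own illustration, not from the lecture) is to encode the subset S of {1, ..., n} as an n-bit integer, with bit i - 1 set exactly when vertex i is in S. The helper names here are hypothetical:

```python
def subset_mask(vertices):
    """Encode a set of vertices drawn from {1, ..., n} as a bitmask."""
    mask = 0
    for v in vertices:
        mask |= 1 << (v - 1)  # bit (v - 1) represents vertex v
    return mask

def contains(mask, v):
    """Does the encoded subset contain vertex v?"""
    return (mask >> (v - 1)) & 1 == 1

# Example: S = {1, 3, 4} -> binary 1101 -> the integer 13
```

With this encoding, a sub-problem (S, j) is just a pair of small integers, which makes the 2D array of the algorithm easy to lay out.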
After all, there's an awful lot of choices for this set capital S, an exponential number. So you'd be right to wonder whether this approach could possibly give any savings whatsoever over naive brute-force search. One thing worth pointing out, and in some sense the source of the computational savings, is that in a given sub-problem, while it's true we're keeping track of which vertices get visited en route from 1 to j, we're not keeping track of the order in which those vertices get visited. Indeed, if we had one sub-problem for every order in which you might visit the vertices from a given source to a given destination, then we'd be stuck with a factorial number of sub-problems. As it is, since we forget the order and focus only on subsets, that gets us down into the 2^n range, and that's ultimately where the savings over brute-force search come from. So, as usual, once we've gotten the right set of sub-problems, the rest of the dynamic programming recipe pretty much writes itself. But to fill in the details, let me tell you exactly what the optimal substructure lemma is, what the corresponding recurrence is, and the final pseudocode that gives us our better-than-brute-force-search algorithm for the traveling salesman problem. For the optimal substructure lemma, we focus on one particular sub-problem, which corresponds to a choice of the destination j and a choice of the set S. That specifies 1, the starting point; j, the ending point; plus the rest of the intermediate vertices that we've got to visit along the way, exactly once each. Now we proceed exactly as we did before in Bellman-Ford, namely by focusing on the last hop of this path P. If the last hop is, say, from vertex k to vertex j, then we can look at the previous part of the path, call it P', and ask: what can we say about it? Well, clearly it begins at the vertex 1, and clearly it ends at the vertex k. And because the path P visits every vertex in S exactly once, we can say something about which vertices P' visits, too.
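To make the size of that savings concrete, here is a quick comparison (my own illustration): the number of subsets of an n-element set, 2^n, grows far more slowly than the number of orderings, n!, even for modest n.

```python
from math import factorial

n = 20
print(2 ** n)        # 1048576 subsets
print(factorial(n))  # 2432902008176640000 orderings
```

So for n = 20, the subset-indexed table has about a million entries per destination, while enumerating orders outright would involve quintillions of possibilities.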
The path P' visits every vertex in the set S - {j} exactly once. And of course, the optimal substructure lemma says that not only is P' a path from 1 to k visiting exactly the vertices of S - {j} once each, it is the shortest such path. The proof is an utterly standard cut-and-paste argument, like we've been doing for all our other optimal substructure lemmas; I'll leave it as an exercise for you to fill in the details. As usual, the optimal substructure lemma naturally induces a recurrence. The recurrence just says: let's look at all of the candidates for the optimal solution identified by the lemma, and do brute-force search among those candidates. That is, for a given sub-problem indexed by a destination j and a set of vertices S, the recurrence takes the best-case scenario, that is, the best choice of the second-to-last vertex k. Of course, k should lie somewhere in the set S, and it should not be the vertex j. For a given choice of k, we know the corresponding candidate solution has to be a shortest path from 1 to k visiting exactly the vertices of S - {j}, combined with the cost of the corresponding final hop from k to j. Effectively, we're using the same measure of sub-problem size that we were in the Bellman-Ford solution: the number of allowable intermediate vertices. Of course, here the sub-problem also specifies exactly which those vertices are, but as a measure of size, we just use the cardinality of that set. As always, the important property of the size measure is that to solve a sub-problem of a given size, you only need to know the answers to sub-problems of strictly smaller size, and staring at the recurrence, you see that that is indeed the case here. You only care about sub-problems that visit one fewer vertex, because, in particular, j is excluded. The dynamic programming algorithm writes itself. We have a 2-dimensional array.
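The recurrence just described can be sketched for a single sub-problem as follows. This is my own rendering of it in code, assuming a table L keyed by (frozen subset, endpoint) pairs and an edge-cost map c keyed by (tail, head) pairs; both names are hypothetical:

```python
def solve_subproblem(L, c, S, j):
    """One application of the recurrence:
    L[S, j] = min over k in S, k != j, of L[S - {j}, k] + c(k, j)."""
    S_minus_j = S - {j}
    return min(L[(S_minus_j, k)] + c[(k, j)] for k in S if k != j)

# Tiny usage example: with the base case L[({1}, 1)] = 0 and an edge
# of cost 5 from vertex 1 to vertex 2, the sub-problem ({1, 2}, 2)
# evaluates to 0 + 5 = 5.
L = {(frozenset({1}), 1): 0}
c = {(1, 2): 5}
print(solve_subproblem(L, c, frozenset({1, 2}), 2))  # 5
```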
Two dimensions, because we have two indices for each sub-problem: one specifying the final vertex j, the other specifying the set capital S. It turns out we can get away with a pretty small set of base cases. We only need to pre-compute the entries of the 2D array in which the final destination j = 1, the same as the starting vertex. If S happens to contain just the vertex 1, then the empty path goes from 1 to 1 with no intermediate stops, and that has length 0. Otherwise, if S contains extra vertices, there's no way to go from 1 back to 1 with intermediate stops while visiting the vertex 1 only once, so we define the other sub-problems with j = 1 to have value +infinity. Now, as usual, we solve the sub-problems systematically using the recurrence. We want to make sure that all smaller sub-problems are solved before the bigger ones, so in our outer for loop we iterate over the measure of sub-problem size, namely the cardinality of the set S. Amongst sub-problems of the same size, we don't care in what order we solve them, so we just iterate through the sets S of cardinality m in arbitrary order. These outer two for loops accomplish the following: they iterate through the relevant choices of S, that is, sets S that have cardinality at least 2 and contain the vertex 1. That's the first index of our 2D array. What about the second index, j? This is where the shortest path in a given sub-problem is supposed to stop. Well, sub-problems only make sense, are only defined, if that final vertex j is one of the vertices of S that you're supposed to visit. So when we get to a given choice of capital S, at that point we iterate through all the relevant choices of j. And if you recall our argument in the base case, and the fact that this set S has size at least 2, there's no point in trying j = 1. So now we have a sub-problem: we have our choice of S, we have our choice of j. We just compute the recurrence for this particular sub-problem.
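The base cases just described can be sketched in code like this (my own illustration, assuming vertices are numbered 1 through n and the table is a dictionary keyed by (frozen subset, endpoint) pairs):

```python
from itertools import combinations

def init_base_cases(n):
    """Pre-compute all entries with final destination j = 1."""
    L = {}
    for m in range(1, n + 1):
        # every subset S containing vertex 1, of cardinality m
        for rest in combinations(range(2, n + 1), m - 1):
            S = frozenset((1,) + rest)
            # the empty path from 1 to 1 has length 0; any larger S
            # would force a second visit to vertex 1, hence +infinity
            L[(S, 1)] = 0 if len(S) == 1 else float('inf')
    return L
```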
So what is that? Well, we take a best-case choice of the penultimate vertex: some vertex k in capital S that we've got to visit, other than the one where we stop, so it can't be equal to j. Given a choice of k, what is the corresponding candidate path from 1 to j? Well, it's whatever the shortest path is from 1 to k, which of course visits everything in S except for j, plus whatever the cost of the final hop is from k to j. So the recurrence just figures out the optimal choice of k by brute-force search, and then, by our optimal substructure lemma, we know that's the correct answer to this sub-problem. Now, in almost all of our dynamic programming algorithms, after we solved all of the sub-problems, all we did was return the value of the biggest one. Here we actually have to do a tiny bit of extra work. It is not the case that the solution we care about, the minimum-cost traveling salesman tour, is literally one of the sub-problems. However, it is true that we can easily compute the minimum cost of a traveling salesman tour given solutions to all of the sub-problems. So what is the biggest sub-problem? Well, there are actually n of them. Remember, our measure of sub-problem size is the cardinality of the set S, the number of vertices that you've got to visit in that sub-problem. So in the biggest sub-problems, S is equal to everything: you have to compute a shortest path that visits each of the n vertices exactly once. What are the semantics of one of those sub-problems? For a given choice of the second index j, and there are n different choices for that final vertex j, the sub-problem is responsible for computing the length of the shortest path that starts at the vertex 1, concludes at the vertex j, and visits every single vertex exactly once. Now, how is this different from a traveling salesman tour? Well, all that's missing is that final hop, the hop from j back to 1.
So what this means is that, after we've solved all of the sub-problems in our triple for loop, we can complete the algorithm with one final brute-force search. This final brute-force search is over the choice of j, the last vertex that you visit before you return home to vertex 1. We can skip the choice j = 1; that one doesn't make sense. But for each of the other n - 1 possibilities for that final vertex j, you take the previously computed value of the shortest path that starts at 1, goes to j, and visits every vertex exactly once; you tack onto that the cost of the final hop, going home to 1 from vertex j; and of those n - 1 different solutions, you return the best of them. That is, in fact, the cost of the minimum-cost traveling salesman tour. The correctness argument is the same as it always is: the correctness of the optimal substructure lemma implies the correctness of the recurrence, and then the correctness of the dynamic programming algorithm follows from the correctness of the recurrence plus induction. The running time is equally straightforward. Because in the sub-problems we're keeping track of a lot of information, specifically the identities of all the intermediate vertices, there are a lot of sub-problems. In particular, there are roughly 2^n choices for the set capital S, and roughly n choices for the final destination j. Combining those, we get n * 2^n sub-problems. How much work do you do per sub-problem? Well, you have to execute this brute-force search over the choices of the vertex k, the second-to-last vertex on the path. k can be almost anything in the set capital S, and capital S can have linear size, so you do O(n) work per sub-problem. Thus, the overall amount of work, as promised, is O(n^2 * 2^n). This is still exponential time, and there are still severe limits on how big a value of n you'll be able to execute this algorithm for. That said, the problem is NP-complete, and you have to respect its computational intractability.
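Putting the base cases, the triple for loop, and the final brute-force search over the last hop together, the whole algorithm can be sketched as follows. This is a minimal illustration of the technique described above, assuming vertices are numbered 1 through n and c[(i, j)] gives the cost of the hop from i to j; the interface is my own choice:

```python
from itertools import combinations

def held_karp(c, n):
    """Return the cost of a minimum-cost traveling salesman tour.
    c[(i, j)] is the cost of the hop from i to j; vertices are 1..n."""
    L = {}
    L[(frozenset([1]), 1)] = 0  # base case: empty path from 1 to 1
    # solve sub-problems in order of increasing size |S|
    for m in range(2, n + 1):
        for rest in combinations(range(2, n + 1), m - 1):
            S = frozenset((1,) + rest)
            L[(S, 1)] = float('inf')  # can't revisit vertex 1
            for j in rest:
                # recurrence: brute-force over the penultimate vertex k
                L[(S, j)] = min(
                    L[(S - {j}, k)] + c[(k, j)]
                    for k in S if k != j
                )
    # final brute-force search over the last hop j -> 1
    everything = frozenset(range(1, n + 1))
    return min(L[(everything, j)] + c[(j, 1)] for j in range(2, n + 1))
```

For example, on three vertices with hop costs c(1,2) = c(2,1) = 1, c(1,3) = c(3,1) = 2, and c(2,3) = c(3,2) = 3, both possible tours cost 1 + 3 + 2 = 6, and the sketch returns 6.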
And again, at least this shows there are often interesting algorithmic opportunities to beat brute-force search, even for NP-complete problems like the traveling salesman problem.