1 00:00:00,025 --> 00:00:02,666 Okay discrete optimization or greedy algorithms. 2 00:00:02,666 --> 00:00:05,566 So what we want to do in the next two lectures is basically cover greedy 3 00:00:05,566 --> 00:00:09,240 algorithm which is essentially the first step. 4 00:00:09,240 --> 00:00:12,630 You can always try it when you are trying to solve an optimization problem. 5 00:00:12,630 --> 00:00:14,295 Okay? Very simple, okay? 6 00:00:14,295 --> 00:00:19,050 Heuristic, mostly, no guarantees. But it's the starting point, okay so what 7 00:00:19,050 --> 00:00:22,075 we want to define is what is a greedy algorithm and then you know very 8 00:00:22,075 --> 00:00:25,854 informally, right. And then essentially how to go beyond 9 00:00:25,854 --> 00:00:30,213 greedy and that's the things that, that's essentially what we going to cover, okay? 10 00:00:30,213 --> 00:00:33,300 So, so the, the biggest assumption that we are making in this lecture and the 11 00:00:33,300 --> 00:00:36,142 next one is that its very easy to actually find a feasible solution the 12 00:00:36,142 --> 00:00:41,078 real problem here is optimizing, okay? So, when it's tough to find feasible 13 00:00:41,078 --> 00:00:44,228 solution you know, we, we can use other techniques and, and some of them I'll 14 00:00:44,228 --> 00:00:47,278 discuss in, you know constraint programming or local search to actually 15 00:00:47,278 --> 00:00:53,110 transform that in a, in a problem where the finding the solutions are easy, okay? 16 00:00:53,110 --> 00:00:56,310 But here we are basically assuming that finding a feasible solution is easy, 17 00:00:56,310 --> 00:00:58,685 okay? We focus on the objective function. 18 00:00:58,685 --> 00:01:02,345 Okay, and the key idea behind a greedy algorithm is that your going to look at 19 00:01:02,345 --> 00:01:06,005 the variable, the decision variable one step at a time okay and your going to 20 00:01:06,005 --> 00:01:09,605 assign a value for every one of them and in a sense you choose that value to 21 00:01:09,605 --> 00:01:15,610 minimize the value of the objection function, locally. 22 00:01:15,610 --> 00:01:18,922 Okay at this point your choosing a value such that the objection function increase 23 00:01:18,922 --> 00:01:21,360 the least. If we are minimizing. 24 00:01:21,360 --> 00:01:23,408 Okay? So in a sense, we are looking at the 25 00:01:23,408 --> 00:01:27,160 particular variable, we are trying to assign a value, and minimizing the impact 26 00:01:27,160 --> 00:01:31,696 on the objective function. That's what the greedy algorithm here, 27 00:01:31,696 --> 00:01:34,064 okay? Now called greedy algorithm or heuristic, 28 00:01:34,064 --> 00:01:37,298 you know, essentially they are basically making decision one at a time based on 29 00:01:37,298 --> 00:01:39,480 local information. Okay? 30 00:01:39,480 --> 00:01:42,258 So we're going to illustrate that on the TSP problem. 31 00:01:42,258 --> 00:01:45,407 The TSP problem is one, is probably the most famous combinatorial optimization 32 00:01:45,407 --> 00:01:47,120 problem. Okay? 33 00:01:47,120 --> 00:01:50,110 And the key, you know, the key aspect of the traveling salesman problem is that 34 00:01:50,110 --> 00:01:52,870 you have a bunch of cities that you have to visit and you have to visit them 35 00:01:52,870 --> 00:01:56,661 exactly once, okay? So you have to move from one city to the 36 00:01:56,661 --> 00:01:59,970 next and so on. You visit them exactly once and what you 37 00:01:59,970 --> 00:02:03,750 want to do is minimize the travel distance that you have to travel for you 38 00:02:03,750 --> 00:02:07,650 know, visiting all these, for visiting all these sites or these customers you 39 00:02:07,650 --> 00:02:11,440 can think of. Okay? 40 00:02:11,440 --> 00:02:14,790 So this is a solution to the travelling salesman problem okay, so you start here, 41 00:02:14,790 --> 00:02:18,040 you know you, you visit this, this, this particular city you go the next one and 42 00:02:18,040 --> 00:02:21,490 you follow this very nice path and that's essentially a solution to the travelling 43 00:02:21,490 --> 00:02:26,328 salesman problem. Every city is visiting, is visited 44 00:02:26,328 --> 00:02:29,235 exactly once and then you measure you know the distances that you are 45 00:02:29,235 --> 00:02:32,754 travelling and that's what you are trying to optimize, this is another solution it 46 00:02:32,754 --> 00:02:36,069 looks better or probably shorter, okay And so this is once again, you visit all 47 00:02:36,069 --> 00:02:41,862 of the cities once, okay? And in this particular case it seems that 48 00:02:41,862 --> 00:02:45,007 the distance is smaller. I haven't computed it, but it looks 49 00:02:45,007 --> 00:02:47,940 better. So now, what is the greedy algorithm 50 00:02:47,940 --> 00:02:50,734 going to do? So once again, what we want to do is 51 00:02:50,734 --> 00:02:54,970 build a solution step-by-step. So we start from the vertex and we know 52 00:02:54,970 --> 00:02:58,270 that, that vertex, that you have to go to some place from that part of the vertex, 53 00:02:58,270 --> 00:03:01,115 okay? So that part of the city, okay? 54 00:03:01,115 --> 00:03:02,851 So we take the, you know we take its closest neighbor okay, and that's where 55 00:03:02,851 --> 00:03:06,460 we going to go. So you say okay, we start here and we go 56 00:03:06,460 --> 00:03:11,440 there. That's the closest neighbor, Okay? 57 00:03:11,440 --> 00:03:14,464 Then in the next step we are basically saying, okay so you know what is the we, 58 00:03:14,464 --> 00:03:17,584 we, we, we have moved to this particular position and we ask us that exactly the 59 00:03:17,584 --> 00:03:20,368 same thing where is the next place where I can do, where I can go which is 60 00:03:20,368 --> 00:03:23,344 going to cost me the least, okay, what is the nearest neighbour, okay and so in 61 00:03:23,344 --> 00:03:26,320 this particular case you got to that particular you know you, you go to that 62 00:03:26,320 --> 00:03:29,344 particular location once again, you repeat the process: you say what is the 63 00:03:29,344 --> 00:03:37,828 nearest neighbor where I have to go, ok? In this particular case you can say, "Oh, 64 00:03:37,828 --> 00:03:41,156 but it's this guy." So you can visit this city only exactly once, so you will be 65 00:03:41,156 --> 00:03:45,680 able only to go back to this [UNKNOWN] the first location. 66 00:03:45,680 --> 00:03:48,656 When you have visited everything else okay so what is the nearest neighbor 67 00:03:48,656 --> 00:03:52,120 which is not something that I have already been too okay. 68 00:03:52,120 --> 00:03:55,340 And that brings you to that particular location and then you continue okay. 69 00:03:55,340 --> 00:03:58,817 And you built this beautiful tour here okay once step at a time and when you 70 00:03:58,817 --> 00:04:03,141 arrive to the last location. You just return to your starting point, 71 00:04:03,141 --> 00:04:05,961 okay so that's how you do a greedy algorithm with everyone of the step you 72 00:04:05,961 --> 00:04:08,687 are looking oh, what is the next city that I can visit which costs me the 73 00:04:08,687 --> 00:04:11,554 least, okay and does the idea of the nearest neighbour, now you look at the 74 00:04:11,554 --> 00:04:14,562 students here but this is pretty good, right but you know of course that these 75 00:04:14,562 --> 00:04:17,476 problems are [UNKNOWN] hot so there will be cases were these things are not 76 00:04:17,476 --> 00:04:20,437 going to be good, this is one example, okay same instances but we just changed a 77 00:04:20,437 --> 00:04:23,398 little bit the position of two vertices Okay, and the key idea is that we have 78 00:04:23,398 --> 00:04:26,359 done this maliciously, right, just to make sure the greedy algorithm is not 79 00:04:26,359 --> 00:04:35,050 going to work. What's happened, you know, you start from 80 00:04:35,050 --> 00:04:36,770 the same place, you visit the first thing. 81 00:04:36,770 --> 00:04:39,890 You get to this nice location there and then you take the, you know, the closest 82 00:04:39,890 --> 00:04:43,860 location to nearest neighbor to that particular location. 83 00:04:43,860 --> 00:04:46,040 And it's not this point because it has moved, no? 84 00:04:46,040 --> 00:04:49,253 It's this point, so you go down and you do the same thing at this point, this is 85 00:04:49,253 --> 00:04:54,530 the closest neighbor so you go there, and this guy now is left alone at this point. 86 00:04:54,530 --> 00:04:55,941 OK? So we don't really know when we are going 87 00:04:55,941 --> 00:04:58,570 to visit it. We continue, we think we are doing the 88 00:04:58,570 --> 00:05:01,900 right thing at every point, and then we get to this point. 89 00:05:01,900 --> 00:05:05,480 Usually at this point we would like to go back to the depot but we can't, right? 90 00:05:05,480 --> 00:05:07,210 Because we haven't visited all of the cities. 91 00:05:07,210 --> 00:05:12,335 So we are forced to go back to this long location that we avoided going to before. 92 00:05:12,335 --> 00:05:16,526 And then we have to go back to the depot. Now you see that this is a terrible tour 93 00:05:16,526 --> 00:05:19,335 of your state, because we have many crossings, and you have this 94 00:05:19,335 --> 00:05:23,250 long-distance trip here which I completely avoided. 95 00:05:23,250 --> 00:05:26,490 So this is where these greedy algorithms are breaking. 96 00:05:26,490 --> 00:05:30,460 They're going to work sometimes, and they're not going to work other times. 97 00:05:30,460 --> 00:05:32,680 That the essence of these, these algorithms. 98 00:05:32,680 --> 00:05:35,518 There are many greedy algorithms, this is another one, which actually has some 99 00:05:35,518 --> 00:05:38,313 performance guarantees, okay, not very good one in metric space but they have 100 00:05:38,313 --> 00:05:41,064 some. Okay so the key idea here is that, 101 00:05:41,064 --> 00:05:44,392 instead of moving from one vertex to another one, from one city to the closest 102 00:05:44,392 --> 00:05:47,363 one. Here we are going to build the two with 103 00:05:47,363 --> 00:05:50,639 two cities and then we going to extend it to three cities, and then four cities, 104 00:05:50,639 --> 00:05:54,503 and then five cities and so on and so forth, okay? 105 00:05:54,503 --> 00:05:58,247 So we basically look at this particular, this particular two with two cities here, 106 00:05:58,247 --> 00:06:01,627 okay and now we select another city lets say this one, okay and we are basically 107 00:06:01,627 --> 00:06:05,111 asking you asking ourselves okay what is the easiest way to actually insert that 108 00:06:05,111 --> 00:06:10,350 particular city, okay? And the basic idea here is that we have 109 00:06:10,350 --> 00:06:13,678 to remove this edge and insert to other edges, okay and now we have a two wishes 110 00:06:13,678 --> 00:06:19,138 three cities and what we have to do next is try to insert another city, okay? 111 00:06:19,138 --> 00:06:22,258 So we are going to consider this city and we are going to say oh, you know what is 112 00:06:22,258 --> 00:06:25,770 the best way to insert that particular city. 113 00:06:25,770 --> 00:06:30,060 And by best way, what I mean is that the cheapest way to insert that city, okay? 114 00:06:30,060 --> 00:06:33,430 Now you may say, oh, but that seems to be very complicated, but it's not right? 115 00:06:33,430 --> 00:06:36,673 So there are only so many place where you can actually insert this, this particular 116 00:06:36,673 --> 00:06:39,326 city, right? So you can say oh, I'm going to go from 117 00:06:39,326 --> 00:06:42,020 here and then back there and then go back here. 118 00:06:42,020 --> 00:06:43,800 Or you can insert in between of these two. 119 00:06:43,800 --> 00:06:46,660 You can do this, and then go there or back there, okay? 120 00:06:46,660 --> 00:06:50,056 Or you can go actually from here. To this, and then go there, and then you 121 00:06:50,056 --> 00:06:52,894 have a tour of four cities, and this is actually a better tour in this particular 122 00:06:52,894 --> 00:06:55,946 place, okay. So what you are trying to look at every 123 00:06:55,946 --> 00:06:58,890 step is, what is the best insertion point for a city which is not inside my tour 124 00:06:58,890 --> 00:07:02,288 again, okay. If you see this one, for instance, it's 125 00:07:02,288 --> 00:07:05,430 going to be doing this, right. So when you see this guy, he's probably 126 00:07:05,430 --> 00:07:08,350 going to be doing this, this, and that. This guy's going to be doing this and 127 00:07:08,350 --> 00:07:10,870 this. And so you basically build your tour 128 00:07:10,870 --> 00:07:14,134 starting from, you know a tour of two cities, three cities, four cities and at 129 00:07:14,134 --> 00:07:17,398 every point you have a greedy algorithm which insert the next city as best as, 130 00:07:17,398 --> 00:07:22,201 you know, you know in the cheapest possible way, okay? 131 00:07:22,201 --> 00:07:24,465 So that's to know the greedy algorithm, okay? 132 00:07:24,465 --> 00:07:27,645 You going to tell me oh, yeah, yeah but which one I have to use, okay? 133 00:07:27,645 --> 00:07:30,570 Well, you know it's not clear which one i have to use on particular input but one 134 00:07:30,570 --> 00:07:33,360 of the things that you have noticed here is that these algorithm are easy to 135 00:07:33,360 --> 00:07:37,800 actually you know they are, they, they are very fast to run. 136 00:07:37,800 --> 00:07:41,270 So you could actually use several of these and take the best solutions. 137 00:07:41,270 --> 00:07:44,126 And that's, you know that gives you a particular, you know the first 138 00:07:44,126 --> 00:07:47,430 approximation to your particular problem, okay? 139 00:07:47,430 --> 00:07:50,406 So, in a sense, summarizing, there are many greedy algorithms that you can 140 00:07:50,406 --> 00:07:53,670 conceive, even for something like the traveling salesman problem, and I've only 141 00:07:53,670 --> 00:07:57,962 shown you, you know two of them, but there are many more, okay? 142 00:07:57,962 --> 00:08:01,082 Some will do better than others on all inputs, or some, you know, will do better 143 00:08:01,082 --> 00:08:03,830 on some inputs and not on other ones. OK? 144 00:08:03,830 --> 00:08:07,136 It really depends on the partial input, as we have just shown you on one 145 00:08:07,136 --> 00:08:09,590 particular instance. Okay? 146 00:08:09,590 --> 00:08:12,230 The advantage of greedy algorithm, typically quick to design, quick to 147 00:08:12,230 --> 00:08:16,058 implement. You know, they are fast, they typically 148 00:08:16,058 --> 00:08:20,758 are fast when you run them. Okay, now the problem is that typically 149 00:08:20,758 --> 00:08:23,971 you have no guarantees, you know, the insertion method that I have shown you 150 00:08:23,971 --> 00:08:28,396 has guarantee in the case of, you know Euclidean space, okay? 151 00:08:28,396 --> 00:08:31,372 But in general you probably don't have any guarantee, okay? 152 00:08:31,372 --> 00:08:34,316 So, there are some cases where these greedy algorithms have some guarantees 153 00:08:34,316 --> 00:08:37,536 but not always okay, then the quality may vary tremendously depending on different 154 00:08:37,536 --> 00:08:41,184 types of input, okay? Once again, you know you don't really 155 00:08:41,184 --> 00:08:44,971 know, we can't prove things really. It's a starting point, okay? 156 00:08:44,971 --> 00:08:48,990 And of course, we have assumed that it is easy to find a feasible solution. 157 00:08:48,990 --> 00:08:52,040 If it is not easy to find a feasable solution, one of the things that you can 158 00:08:52,040 --> 00:08:55,190 do is to put some of the constraints inside the objective, and then it is easy 159 00:08:55,190 --> 00:08:59,342 to find solution. Then you optimize this function, and try 160 00:08:59,342 --> 00:09:02,450 to get as close as possible to a feasible solution, okay? 161 00:09:02,450 --> 00:09:05,420 So the the greedy algorithm there would tell you, okay, so this is a solution 162 00:09:05,420 --> 00:09:08,390 which is as close as possible to the, you know, to a feasible solution using this 163 00:09:08,390 --> 00:09:12,362 greedy algorithm, okay? So, the essence of the class is that you 164 00:09:12,362 --> 00:09:14,980 can always start with a greedy algorithm, okay? 165 00:09:14,980 --> 00:09:17,440 It gives you a starting point that you can improve afterwards. 166 00:09:17,440 --> 00:09:20,278 And then you can look at the techniques that are presented in this class you know 167 00:09:20,278 --> 00:09:22,858 constraint programming dynamic programming you know local search and 168 00:09:22,858 --> 00:09:25,438 mixed integer programming to try and improve the value its giving you by 169 00:09:25,438 --> 00:09:29,280 degree. So really the class is about how to do 170 00:09:29,280 --> 00:09:32,038 better than a simple greedy algorithm, okay? 171 00:09:32,038 --> 00:09:35,251 But a greedy algorithm is going to give you a certain starting points and you'd 172 00:09:35,251 --> 00:09:38,566 starting on this tending what the problem is really about and sometimes these 173 00:09:38,566 --> 00:09:41,830 greedy algorithm can be transformed easily okay in constraint programming 174 00:09:41,830 --> 00:09:46,329 solutions, okay? There is one step that you can apply and 175 00:09:46,329 --> 00:09:51,180 then it moves to a you know particular constraint programming solution, okay? 176 00:09:51,180 --> 00:09:53,925 So, you know inside in the class what you learn is finding you know, ways of 177 00:09:53,925 --> 00:09:57,530 finding high quality solution. Feasible solution and finding high 178 00:09:57,530 --> 00:10:00,280 quality solution which are robust across many inputs. 179 00:10:00,280 --> 00:10:02,610 That's what optimization is going to bring to you. 180 00:10:02,610 --> 00:10:05,860 And obviously we have techniques where we can actually give you guarantees. 181 00:10:05,860 --> 00:10:09,136 You can come back and tell you know and tell you know oh, I have a solution and I 182 00:10:09,136 --> 00:10:12,412 know that it's not worse that let's say you know 2% of the optimal value that I 183 00:10:12,412 --> 00:10:16,793 could ever find. Okay so in a sense what the class is 184 00:10:16,793 --> 00:10:20,159 covering you is how to find you know, you know feasible solution in complex when 185 00:10:20,159 --> 00:10:24,628 there are complex constraints. Find high quality solution which are 186 00:10:24,628 --> 00:10:27,790 robust across a large spectrum of instances and then give you performance 187 00:10:27,790 --> 00:10:32,287 guarantee, quality guarantees on the algorithm that you have, okay? 188 00:10:32,287 --> 00:10:35,301 So this is greedy one you know come back to see greedy two. 189 00:10:35,301 --> 00:10:38,412 We'll, we'll, we'll go into a little bit more details on how you move from a 190 00:10:38,412 --> 00:10:41,890 greedy algorithm to other kinds of techniques, okay? 191 00:10:41,890 --> 00:10:42,650 Thank you.