1 00:00:00,360 --> 00:00:04,848 This is supplemental material on the knapsack assignment for discrete 2 00:00:04,848 --> 00:00:08,109 optimization. And so, you remember the knapsack, right? 3 00:00:08,109 --> 00:00:11,229 So you had all these beautiful, very, very valuable artifacts, and you wanted 4 00:00:11,229 --> 00:00:14,253 to pick out the subset of them that you could put in your knapsack and run away 5 00:00:14,253 --> 00:00:18,845 as quickly as possible. obviously, you know, you want the highest 6 00:00:18,845 --> 00:00:21,910 value, and, but you also have the knapsack capacity. 7 00:00:21,910 --> 00:00:24,980 That's the first assignment that you will have to solve in this class, okay? 8 00:00:24,980 --> 00:00:29,250 And this is a formalization, very much the same thing as we saw in the lecture. 9 00:00:29,250 --> 00:00:32,448 So, you have basically, an item. For every one of these items, you have a 10 00:00:32,448 --> 00:00:35,350 value that's how much the, you know, the item is worth. 11 00:00:35,350 --> 00:00:37,640 You have a weight, which is, you know, the weight of the item. 12 00:00:37,640 --> 00:00:40,492 And then obviously the decision variables are going to be for every one of the 13 00:00:40,492 --> 00:00:44,245 items, whether you take it or not. Whether you put it in a knapsack or not. 14 00:00:44,245 --> 00:00:46,470 Okay? So it's a maximization problem, you 15 00:00:46,470 --> 00:00:49,740 maximize the value of the items that you will select. 16 00:00:49,740 --> 00:00:53,140 The sum of these, you know, of the weight of these items cannot exceed the capacity 17 00:00:53,140 --> 00:00:56,316 of the knapsack. And for every one of the items, you have 18 00:00:56,316 --> 00:00:58,370 to decide whether you take it or not. Okay? 19 00:00:58,370 --> 00:01:00,040 Very simple. Right? 20 00:01:00,040 --> 00:01:02,980 Now obviously when you look at these things you know that any particular 21 00:01:02,980 --> 00:01:06,620 instance of that problem will have to provide a set of input. 22 00:01:06,620 --> 00:01:08,222 Okay? Obviously we have to know how many items, 23 00:01:08,222 --> 00:01:10,840 okay. We have to know what is the capacity of 24 00:01:10,840 --> 00:01:13,450 the knapsack. These are two constants. 25 00:01:13,450 --> 00:01:17,680 So you see these two constants already. Okay, here in the input data, okay. 26 00:01:17,680 --> 00:01:20,580 So we have the number of items and the capacity of the knapsack. 27 00:01:20,580 --> 00:01:25,305 Okay, n and K are the first two things that you'll find in the input data file. 28 00:01:25,305 --> 00:01:27,636 Okay. And then afterwards now you know how many 29 00:01:27,636 --> 00:01:30,718 items there are, there are. So you would see a long list for, you 30 00:01:30,718 --> 00:01:34,765 know, for every one of these items. And for every one of them there will be, 31 00:01:34,765 --> 00:01:37,530 you know, two information that you need to give, okay. 32 00:01:37,530 --> 00:01:40,641 The value of the item, which is going to be the first value there, and then the 33 00:01:40,641 --> 00:01:43,530 second one is going to be the weight, okay. 34 00:01:43,530 --> 00:01:46,519 So your input file okay, once again, is going to tell you how many items there 35 00:01:46,519 --> 00:01:49,132 are. What is the capacity of the knapsack, and 36 00:01:49,132 --> 00:01:52,108 then a long list and for every one of the items okay, there will be an entry in 37 00:01:52,108 --> 00:01:55,980 that list. And it gives you two values, the, the 38 00:01:55,980 --> 00:01:59,634 weight of the, the value of the, of the item and then the weight of the, the, of 39 00:01:59,634 --> 00:02:04,200 the item, okay. so that's the input. 40 00:02:04,200 --> 00:02:07,190 Obviously we would expect you to submit something. 41 00:02:07,190 --> 00:02:09,870 Right? At least, you know, if you want a grade. 42 00:02:09,870 --> 00:02:11,586 Okay? And essentially, this is the format that 43 00:02:11,586 --> 00:02:13,890 we expect you for this particular assignment. 44 00:02:13,890 --> 00:02:15,618 Okay? You're going to give the objective 45 00:02:15,618 --> 00:02:17,867 function. You're going to give a flag which is opt, 46 00:02:17,867 --> 00:02:21,575 and I'll come back to that in a moment, but this is a zero one value. 47 00:02:21,575 --> 00:02:24,278 Okay? And then you will basically output the 48 00:02:24,278 --> 00:02:26,544 solutions. And in this particular case, it's 49 00:02:26,544 --> 00:02:29,580 going to be a long list of zero and ones. Okay? 50 00:02:29,580 --> 00:02:33,048 Telling you whether, you know, this item has been selected in the solution or not, 51 00:02:33,048 --> 00:02:35,680 okay? If we have n items, okay? 52 00:02:35,680 --> 00:02:39,590 There will be n entries inside that list. Okay? 53 00:02:39,590 --> 00:02:41,980 So this is a very particular example, okay? 54 00:02:41,980 --> 00:02:46,194 So what you see here. This is essentially a knapsack problem 55 00:02:46,194 --> 00:02:49,800 with four item. Really challenging, right? 56 00:02:49,800 --> 00:02:53,999 And the capacity of the knapsack is 11. And then what you see in this list is 57 00:02:53,999 --> 00:02:57,400 the, the values and the weights of every one of the item. 58 00:02:57,400 --> 00:03:00,991 Okay, so the first one has a value of eight and a weight of four, and the 59 00:03:00,991 --> 00:03:05,370 second one has a value of ten and a weight of five. 60 00:03:05,370 --> 00:03:07,609 Okay. And so this is, this is an example of 61 00:03:07,609 --> 00:03:09,890 input data file. Okay. 62 00:03:09,890 --> 00:03:14,270 So you can expect to have some that will be a little bit longer than this one. 63 00:03:14,270 --> 00:03:17,406 And this is your output here, you are basically saying, oh, this is a solution 64 00:03:17,406 --> 00:03:20,880 which has a value 19. This is a, the, the sum of the value of 65 00:03:20,880 --> 00:03:24,042 all the item. I have absolutely no clue whether this is 66 00:03:24,042 --> 00:03:26,174 optimal. That's why you put a zero there, this is 67 00:03:26,174 --> 00:03:28,300 this flag. This flag is telling you I don't know if 68 00:03:28,300 --> 00:03:30,970 this is optimal, I haven't proven it. Okay? 69 00:03:30,970 --> 00:03:33,532 if it's a one that means that oh, I have proved that this is the ultimate 70 00:03:33,532 --> 00:03:35,914 solution. And then you see the particular solution 71 00:03:35,914 --> 00:03:36,870 here. Okay? 72 00:03:36,870 --> 00:03:39,925 The solution here is zero zero one one which basically means that I select item 73 00:03:39,925 --> 00:03:43,189 three and four. If you look at item three and four, you 74 00:03:43,189 --> 00:03:46,660 see that the value is indeed 19. And then you see that the capacity is 75 00:03:46,660 --> 00:03:49,590 indeed 11, so this is a feasible solution. 76 00:03:49,590 --> 00:03:52,938 Though, as I told you many times, this is one of those beautiful features of 77 00:03:52,938 --> 00:03:55,250 NP-complete problems. Okay? 78 00:03:55,250 --> 00:03:58,538 I can check very quickly. If what you submit is actually a 79 00:03:58,538 --> 00:04:01,198 solution, okay. And obviously I can very quickly compute 80 00:04:01,198 --> 00:04:03,360 the objective function. Okay. 81 00:04:03,360 --> 00:04:06,220 So this is what I can always do, you know, when you are submitting an 82 00:04:06,220 --> 00:04:09,180 assignment. Okay. 83 00:04:09,180 --> 00:04:11,440 Now, how do you actually implement the assignment? 84 00:04:11,440 --> 00:04:14,777 Remember, you know, we talked about these two, you know scripts, Python scripts, 85 00:04:14,777 --> 00:04:18,188 that we have. The first one is the solver script, which 86 00:04:18,188 --> 00:04:22,080 basically is the core of your assignment, okay. 87 00:04:22,080 --> 00:04:25,776 So what you will have is essentially, you know, the input and the output and then 88 00:04:25,776 --> 00:04:29,740 in the middle your solver. Okay, so when we provide this script, 89 00:04:29,740 --> 00:04:32,940 we'll always have some code that outputs are completely simple solution, in 90 00:04:32,940 --> 00:04:34,780 general. Okay? 91 00:04:34,780 --> 00:04:39,492 And so, so we'll take the, so the input is going to basically be this file here, 92 00:04:39,492 --> 00:04:42,546 okay. Giving you all the data that you need to 93 00:04:42,546 --> 00:04:45,200 compute it. And basically the first lines here are 94 00:04:45,200 --> 00:04:48,555 basically parsing that input data to output a very simple solution in the 95 00:04:48,555 --> 00:04:53,084 script we are giving you, okay. Obviously you want to, you will want to 96 00:04:53,084 --> 00:04:56,256 replace this simple solution with something which is you know, much more 97 00:04:56,256 --> 00:04:59,530 sophisticated. And then the last part here, you know 98 00:04:59,530 --> 00:05:02,911 what you're computing here, what you're outputting there is basically the output 99 00:05:02,911 --> 00:05:06,440 and this output will have to have this format. 100 00:05:06,440 --> 00:05:08,709 Okay? So this is the critical part here. 101 00:05:08,709 --> 00:05:11,000 Okay, the output, the out, sorry, the output here. 102 00:05:11,000 --> 00:05:14,200 The output here. Okay. 103 00:05:14,200 --> 00:05:17,208 So, essentially, so what you see, once again to summarize, you have an input, 104 00:05:17,208 --> 00:05:20,404 you parse the input, you have the output, okay, that you generate, and they have to 105 00:05:20,404 --> 00:05:24,150 correspond to these two formats that you have. 106 00:05:24,150 --> 00:05:27,318 In the middle we provide you some code which is actually outputting a very basic 107 00:05:27,318 --> 00:05:30,610 solution, and that's where you put your solver, okay. 108 00:05:30,610 --> 00:05:33,760 Now, your solver doesn't have to be implemented in Python, okay. 109 00:05:33,760 --> 00:05:36,078 Some people may want to do that, other peoples may, you know, not want to do 110 00:05:36,078 --> 00:05:39,431 that, okay. So you can actually call different 111 00:05:39,431 --> 00:05:42,714 processes, you know, inside, you know, inside this python script, such that you 112 00:05:42,714 --> 00:05:45,801 can call your Java program or your C program or, you know, whatever language 113 00:05:45,801 --> 00:05:49,550 you like. Okay, and typically this is, this, this 114 00:05:49,550 --> 00:05:52,920 will be done by, you know, coding a different process here. 115 00:05:52,920 --> 00:05:57,012 You will open a process okay, telling you which language you want to use and, and 116 00:05:57,012 --> 00:06:01,955 various processes for that, various files for that languages. 117 00:06:01,955 --> 00:06:04,649 Okay? And then you can, and obviously you know 118 00:06:04,649 --> 00:06:07,400 what you see here is that we selected Java. 119 00:06:07,400 --> 00:06:11,080 And this is going to be the Java code for actually running your solver. 120 00:06:11,080 --> 00:06:13,444 Okay? So, in a sense, what is happening here is 121 00:06:13,444 --> 00:06:16,180 that the input is, you know, the input of the Python script is going to be 122 00:06:16,180 --> 00:06:20,690 transmitted to your Java program. And obviously your Java program here is 123 00:06:20,690 --> 00:06:24,580 going to output something which will be transmitted to the Python script. 124 00:06:24,580 --> 00:06:26,420 Okay? So, in this particular case, we used a 125 00:06:26,420 --> 00:06:29,050 very simple ways, simple way of doing this. 126 00:06:29,050 --> 00:06:32,082 We used a standard output. So that, you know, Java is outputting on 127 00:06:32,082 --> 00:06:34,678 the standard output and Python is retrieving that, and generating the 128 00:06:34,678 --> 00:06:37,700 output solution. But you can do that in many different 129 00:06:37,700 --> 00:06:40,016 ways. You can use a, you know, a database, you 130 00:06:40,016 --> 00:06:44,220 can use potent you know, a file, you know, whatever you want. 131 00:06:44,220 --> 00:06:47,340 But essentially at some point, the Java program has to tell the Python program 132 00:06:47,340 --> 00:06:51,270 where the data is. You know, Python should actually get that 133 00:06:51,270 --> 00:06:55,022 data and output it in exactly, you know, the format that we expect for grading the 134 00:06:55,022 --> 00:06:57,240 assignment. Okay? 135 00:06:57,240 --> 00:07:01,208 So, basic messages here, you know, you basically change this, this, this you 136 00:07:01,208 --> 00:07:05,300 know, solver script such that you call a server, or you implement your server in 137 00:07:05,300 --> 00:07:08,100 Python. Okay? 138 00:07:08,100 --> 00:07:10,820 And then, you output the data in the right format. 139 00:07:10,820 --> 00:07:12,526 Okay? Once again, you know, we give you a lot 140 00:07:12,526 --> 00:07:16,010 of this infrastructure. We show you how to parse the data. 141 00:07:16,010 --> 00:07:18,540 We show you how to, you know, format the output. 142 00:07:18,540 --> 00:07:22,434 And we also show you a basic, you know, a basic solution for actually computing a 143 00:07:22,434 --> 00:07:25,650 solution. But most of the interesting part is 144 00:07:25,650 --> 00:07:29,367 going to be for you to actually replace that basic solution and, and, and write 145 00:07:29,367 --> 00:07:33,656 your own solvers, okay. So, testing the solver once again, we saw 146 00:07:33,656 --> 00:07:36,940 that in the first lecture, in this particular case. 147 00:07:36,940 --> 00:07:40,900 You basically will take your solver and then one of your data files, okay? 148 00:07:40,900 --> 00:07:45,330 So this is the data files, this is the data that you saw here, right, okay? 149 00:07:45,330 --> 00:07:49,325 So that's what you are basically passing to the python script, okay? 150 00:07:49,325 --> 00:07:53,717 And, you know, you run your solver and your solver will obviously output a 151 00:07:53,717 --> 00:07:57,662 particular solution. In this particular case, this output of 152 00:07:57,662 --> 00:08:00,434 solution 18, you know, you have no clue whether it's optimal or not, and it's 153 00:08:00,434 --> 00:08:02,280 not. Okay? 154 00:08:02,280 --> 00:08:05,030 And then you see the solution which is one one zero zero. 155 00:08:05,030 --> 00:08:07,160 Okay? So that's how you can test your solver. 156 00:08:07,160 --> 00:08:10,040 At this point, you know, you know that the script is working and you could 157 00:08:10,040 --> 00:08:14,740 actually submit this solution to you know, to the, the course, to the course. 158 00:08:16,320 --> 00:08:19,048 And what you will do is essentially what we have seen in the first, you know, 159 00:08:19,048 --> 00:08:21,890 assignment. You will have to enter your credential. 160 00:08:21,890 --> 00:08:25,341 You know, the login and the password. And then in this particular case, you 161 00:08:25,341 --> 00:08:27,800 will get a menu. And that menu will tell you, okay, so 162 00:08:27,800 --> 00:08:32,186 these are the various problems for which you can actually produce a solution. 163 00:08:32,186 --> 00:08:35,518 And one of the options that you have here is basically choosing the sub-classes of 164 00:08:35,518 --> 00:08:38,780 problems for which you want to generate the solution. 165 00:08:38,780 --> 00:08:41,020 You can choose the mode if you want, okay? 166 00:08:41,020 --> 00:08:44,640 You can choose two of them or one of them or anything you want. 167 00:08:44,640 --> 00:08:48,151 Okay, essentially, okay? So, for instance, you can say, oh, you 168 00:08:48,151 --> 00:08:52,570 know, the solver that I have is really good on, you know, problem three, okay? 169 00:08:52,570 --> 00:08:55,853 And basically, you choose problem three, okay, and at this point, your solver is 170 00:08:55,853 --> 00:08:59,920 run for problem three, okay? You can choose, you know, let's say, 171 00:08:59,920 --> 00:09:04,474 problem zero or you can choose you can choose you know, four and six, and what 172 00:09:04,474 --> 00:09:08,500 you get is essentially a the, the, your summary is going to be executed on any 173 00:09:08,500 --> 00:09:16,137 one of these things. Okay, when you put zero, you test 174 00:09:16,137 --> 00:09:18,810 everything obviously. so the output. 175 00:09:18,810 --> 00:09:20,760 Okay? So, one of these flag that I discussed is 176 00:09:20,760 --> 00:09:22,490 the optimality flag. Okay? 177 00:09:22,490 --> 00:09:26,560 So, sometimes we'll have a, a, you will, you will, you, your assignment. 178 00:09:26,560 --> 00:09:30,870 Your solver will be able to prove optimality when you do so, essentially. 179 00:09:30,870 --> 00:09:34,330 You want, you, you, you may want to put the value 1 over there. 180 00:09:34,330 --> 00:09:37,762 It's basically telling us that not only you found the optimal solution, but you 181 00:09:37,762 --> 00:09:40,290 actually proved it. Okay. 182 00:09:40,290 --> 00:09:42,870 So we are not using that information for your grade. 183 00:09:42,870 --> 00:09:46,150 But this is something that is actually very valuable for us to know. 184 00:09:46,150 --> 00:09:49,094 And this will appear in the little boat and saying, you know, and, and tell the 185 00:09:49,094 --> 00:09:53,530 world that you have the optimal solution. And you have proven that it's optimal, 186 00:09:53,530 --> 00:09:56,665 nobody can improve what it is. Okay? 187 00:09:56,665 --> 00:10:01,005 so, in the little boat that's going to be highlighted by the fact that there will 188 00:10:01,005 --> 00:10:05,250 be a bold face on the value of your solution. 189 00:10:05,250 --> 00:10:08,310 So you see the solution here, 19, it's optimal, okay. 190 00:10:08,310 --> 00:10:10,870 Now one of the things that you have to do is you have to be careful if you put this 191 00:10:10,870 --> 00:10:13,688 flag, okay. So if you put, you know, the optimality 192 00:10:13,688 --> 00:10:16,873 flag to 1, and your solution is 10.5, you are not going to good, look you're not 193 00:10:16,873 --> 00:10:23,695 going to look good, okay. Because there will be obviously solution 194 00:10:23,695 --> 00:10:26,560 that have, you know, that may improve it, okay. 195 00:10:26,560 --> 00:10:28,918 Also, you don't want to cheat. You know, you may have the optimal 196 00:10:28,918 --> 00:10:31,770 solution, but you may not have been able to, to prove it, okay, so don't, you 197 00:10:31,770 --> 00:10:36,187 know, don't abuse the system, once again. You know, just you know, put the right 198 00:10:36,187 --> 00:10:39,916 value for the optimality flag. Once again, your grade doesn't depend on 199 00:10:39,916 --> 00:10:42,556 it, but it's a useful information to know, so that people know what is the 200 00:10:42,556 --> 00:10:47,088 best they could ever do. Okay, so typically when we have these 201 00:10:47,088 --> 00:10:51,180 assignments video we'll give you some tips on, you know, things to think about, 202 00:10:51,180 --> 00:10:54,758 okay. So these are some tips on the knapsack 203 00:10:54,758 --> 00:10:58,296 assignment, so if you do dynamic programming one of the big thing you have 204 00:10:58,296 --> 00:11:03,265 to think about is the space, okay? Dynamic programming when it works is very 205 00:11:03,265 --> 00:11:06,352 fast, but it may use a lot of space, so think about the space that a particular 206 00:11:06,352 --> 00:11:09,860 solution is using and whether it's practical. 207 00:11:09,860 --> 00:11:13,770 And think about ways our activity we're using the space if you can, okay? 208 00:11:13,770 --> 00:11:15,970 So, this is a critical issue in dynamic programming. 209 00:11:15,970 --> 00:11:18,030 Think about it when you do the assignment. 210 00:11:18,030 --> 00:11:19,964 Okay. Now, when you do the branch and when you 211 00:11:19,964 --> 00:11:23,600 implement a branch and bound algorithm, space is not an issue, okay. 212 00:11:23,600 --> 00:11:26,711 So what is really an issue, well in general, okay, so it depends also on the 213 00:11:26,711 --> 00:11:30,164 search strategy. But one of the big things in this 214 00:11:30,164 --> 00:11:33,594 particular case is actually bounding, and finding good bounds, and also computing 215 00:11:33,594 --> 00:11:38,600 these bounds very fast, because you will compute them many, many times, okay. 216 00:11:38,600 --> 00:11:41,886 So, if you improve the way to compute a bound, okay, you improve the overall 217 00:11:41,886 --> 00:11:45,480 efficiency of the branch and bound algorithm. 218 00:11:45,480 --> 00:11:48,744 So, in this particular way from the knapsack problem, think about how you can 219 00:11:48,744 --> 00:11:52,212 actually implement, okay, the best value, the best bounding techniques, and how 220 00:11:52,212 --> 00:11:55,640 fast you can actually compute it. Okay? 221 00:11:55,640 --> 00:11:58,690 So, good luck, this is your first assignment, okay? 222 00:11:58,690 --> 00:12:01,810 So, you know, you will experience our infrastructure and you will first 223 00:12:01,810 --> 00:12:04,982 experience, you know, you know, NP-complete on a very simple problems, I 224 00:12:04,982 --> 00:12:07,145 must admit. Okay. 225 00:12:07,145 --> 00:12:07,670 Thank you.