1 00:00:00,012 --> 00:00:05,107 Now we'll see how our understanding of basic properties of random walks on the 2 00:00:05,107 --> 00:00:10,077 non-negative integers unlocks the analysis of Papadimitriou's Algorithm, 3 00:00:10,077 --> 00:00:15,097 showing that randomized local search yields a polynomial time algorithm for 4 00:00:15,097 --> 00:00:18,542 the 2-SAT problem. Let me remind you the details of 5 00:00:18,542 --> 00:00:23,756 Papadimitirou's true set algorithm. consider an instance of two set it has n 6 00:00:23,756 --> 00:00:29,324 bullion variables, the algorithm has two four loops, the outer four loop its only 7 00:00:29,324 --> 00:00:33,374 responsibility is to run log in independent random trials. 8 00:00:33,374 --> 00:00:38,992 In each iteration of this outer four loop we're just going to do randomized local 9 00:00:38,992 --> 00:00:41,422 search. So we begin with an initial random 10 00:00:41,422 --> 00:00:44,394 assignment. We put the [UNKNOWN] independently for 11 00:00:44,394 --> 00:00:48,738 [UNKNOWN] and variable to the time and whether it starts of true or starts of 12 00:00:48,738 --> 00:00:51,407 false. Then we get a budget of 2n^2 local moves 13 00:00:51,407 --> 00:00:55,895 in an attempt transform this initial assignment into one that satisfies In 14 00:00:55,895 --> 00:01:00,277 equities 2n^2 inner iterations we first check if the current assignment is 15 00:01:00,277 --> 00:01:02,806 satisfying. Obviously if it is we're done. 16 00:01:02,806 --> 00:01:07,636 So suppose that the current assignment is not a satisfying assignment then there is 17 00:01:07,636 --> 00:01:12,196 at least 1 possibly many unsatisfied clauses, of them we take 1, arbitrarily. 18 00:01:12,196 --> 00:01:16,461 Its a 2-SAT instance, so this cause involves 2 variables and if we flip the 19 00:01:16,461 --> 00:01:20,498 value of either of those 2 variables we're going to make amends over this 20 00:01:20,498 --> 00:01:23,096 clause. It will flip from one side Satisy to 21 00:01:23,096 --> 00:01:25,782 satisfy. There's a tricky question of how you pick 22 00:01:25,782 --> 00:01:29,795 which of the two variables you flip. Well, we take the easy way out, we just 23 00:01:29,795 --> 00:01:32,772 pick one of the two uniformly [INAUDIBLE] at random. 24 00:01:32,772 --> 00:01:37,126 If at the end of the day, after these 2n^2*LogN local moves, Papadimitriou's 25 00:01:37,126 --> 00:01:41,452 algorithm has failed to find a satisfying assignment, it somewhat arrogantly 26 00:01:41,452 --> 00:01:43,909 declares that therefore one must not exist. 27 00:01:43,909 --> 00:01:47,772 What's clear about this algorithm is that is runs in polynomial time. 28 00:01:47,772 --> 00:01:52,234 After all, we've explicitly bounded the number of steps it's permitted to take. 29 00:01:52,234 --> 00:01:57,095 Secondly, it's clear that if there is no satisfying assignment Then the algorithm 30 00:01:57,095 --> 00:02:02,238 will naturally fail to find one and correctly report that it's unsatisfiable. 31 00:02:02,238 --> 00:02:06,576 What's not at all obvious is the probability that Papadimitriou's 32 00:02:06,576 --> 00:02:11,158 algorithm will find a satisfying assignment in the instances where at 33 00:02:11,158 --> 00:02:15,790 least one such assignment exists. That analysis is the subject of this 34 00:02:15,790 --> 00:02:18,263 video. And again what's remarkable and 35 00:02:18,263 --> 00:02:22,872 surprising about this theorem is that if you think about the utmost natural 36 00:02:22,872 --> 00:02:27,864 measure of progress for Papadimitriou's algorithm, mainly the number of clauses 37 00:02:27,864 --> 00:02:32,337 which are currently satisfied. That number can actually go down when you 38 00:02:32,337 --> 00:02:35,416 make a local step. By flipping the assignment to one 39 00:02:35,416 --> 00:02:39,744 variable, yeah, you fix the one clause, but you might screw up a bunch of others, 40 00:02:39,744 --> 00:02:44,146 leaving you with an assignment that seems worse, in a sense that even fewer clauses 41 00:02:44,146 --> 00:02:47,578 are satisfied than before. Nonetheless, if you define the right 42 00:02:47,578 --> 00:02:51,914 measure of progress, you can argue, by a random waka analogy, that you are making 43 00:02:51,914 --> 00:02:55,868 progress over time, resulting, eventually with a satisfying assignment. 44 00:02:55,868 --> 00:02:59,573 Let's see the details. Each iteration of the outer 4 loop 45 00:02:59,573 --> 00:03:04,071 represents a fresh opportunity to discover a satisfying assignment. 46 00:03:04,071 --> 00:03:09,151 All of these iterations of the outer 4 loop are identical, they just used fresh 47 00:03:09,151 --> 00:03:12,738 random coins. So let's just analyze for now the success 48 00:03:12,738 --> 00:03:15,792 probability of a particular outer iteration. 49 00:03:15,792 --> 00:03:19,899 We're assuming that this instance is satisfiable, so there's, there's at 50 00:03:19,899 --> 00:03:22,453 least, perhaps many satisfying assignments. 51 00:03:22,453 --> 00:03:25,659 Pick 1, an arbitrary one, I don't care which, call it A*. 52 00:03:25,659 --> 00:03:29,999 Lets denote, by A sub t, the assignment that [UNKNOWN] Demetri's algorithm is 53 00:03:29,999 --> 00:03:34,020 considering, after t iterations of the inner four-loop have completed. 54 00:03:34,020 --> 00:03:38,227 So, A sub 0, that's the random initial assignment with which we begin, this 55 00:03:38,227 --> 00:03:42,617 iteration of the outer four-loop. And then, after we've Done t local flips 56 00:03:42,617 --> 00:03:47,682 that results in the assignment a sub t. This is of course a random variable, its 57 00:03:47,682 --> 00:03:51,933 a function of the random choices made by Papadimitriou algorithm. 58 00:03:51,933 --> 00:03:56,923 Now the great idea in this analysis is to use the right progress measure rather 59 00:03:56,923 --> 00:04:00,318 than clauses that the current assignment satisfies. 60 00:04:00,318 --> 00:04:05,070 We're going to count the number of variable assignments that it got right, 61 00:04:05,070 --> 00:04:09,522 that is the number of variable assignments with which it agrees. 62 00:04:09,522 --> 00:04:13,463 With this reference satisfying assignment A star. 63 00:04:13,463 --> 00:04:19,655 Evidently, X sub T is an integer, it's 0 if A assigns every variable the opposite 64 00:04:19,655 --> 00:04:25,176 of what A star does and it's equal to N if A and A star are exactly the same 65 00:04:25,176 --> 00:04:29,122 assignment. Of course, if we ever get to the point 66 00:04:29,122 --> 00:04:34,797 where x sub t=n, and a star is the same a s at, then at is itself a satisfying 67 00:04:34,797 --> 00:04:39,692 assignment and we're good to go. The algorithm halts correctly. 68 00:04:39,692 --> 00:04:44,606 Now we come to the key point in the analysis, now we're going to understand 69 00:04:44,606 --> 00:04:48,581 the sense in which the algorithm on average makes progress. 70 00:04:48,581 --> 00:04:54,013 So suppose we're currently looking at the assignment a sub t and suppose it's not a 71 00:04:54,013 --> 00:04:59,296 satisfying assignment, then there's one or more clauses that are unsatisfied. 72 00:04:59,296 --> 00:05:04,622 The algorithm picks one of those Let's say it's a clause involving the variables 73 00:05:04,622 --> 00:05:08,100 X of I and X of J. For example, if it looks at the clause 74 00:05:08,100 --> 00:05:12,052 not X of 3 or X of 7, then it's involving the variables X of 3. 75 00:05:12,052 --> 00:05:14,911 And X of 7. So here's an easy observation. 76 00:05:14,911 --> 00:05:19,340 Our current assignment A sub T is failing to satisfy this clause. 77 00:05:19,340 --> 00:05:24,504 The reference satisfying assignment, A star, naturally is satisfying this 78 00:05:24,504 --> 00:05:27,693 clause. Therefore A star must give a different 79 00:05:27,693 --> 00:05:32,732 value to one of the 2 variables XI or XJ, than our assignment A sub T does. 80 00:05:32,732 --> 00:05:38,047 It is also possible of course, that A star makes different assignments to both 81 00:05:38,047 --> 00:05:42,652 of the variables, XI and XJ. All we know is that A star satisfies the 82 00:05:42,652 --> 00:05:45,967 clause, whereas our assignment A sub T does not. 83 00:05:45,967 --> 00:05:51,510 And the amazing part is just this simple observation is enough to imply that we're 84 00:05:51,510 --> 00:05:55,812 going to make progress on average. Here's exactly what I mean. 85 00:05:55,812 --> 00:06:00,477 So Papadimitriou's algorithm is going to choose either xi or xj, that choice is 86 00:06:00,477 --> 00:06:04,922 uniform at random, and it's going to flip the value currently assigned to that 87 00:06:04,922 --> 00:06:07,642 variable. Now, the situation in which this is 88 00:06:07,642 --> 00:06:12,456 guaranteed to be good for us Is where we gave the opposite assignment to both xi 89 00:06:12,456 --> 00:06:15,051 and xj, relative to the assignment a star. 90 00:06:15,051 --> 00:06:19,806 If both of our value assignments of these variables was the opposite of a star, 91 00:06:19,806 --> 00:06:24,012 then whichever one we flip, we are going to share one more variable. 92 00:06:24,012 --> 00:06:27,162 In agreement, with a star, than we did before. 93 00:06:27,162 --> 00:06:32,362 That is, in our previous notation, X of T plus one, the number of variables in 94 00:06:32,362 --> 00:06:37,187 which we agree next time step will be guaranteed to be one more than it is 95 00:06:37,187 --> 00:06:40,437 right now. The situation is more subtle when the 96 00:06:40,437 --> 00:06:45,612 current assignment A sub T agrees with the reference satisfying assignment A 97 00:06:45,612 --> 00:06:49,036 star on exactly one of the two variables. XI and XJ. 98 00:06:49,036 --> 00:06:52,924 So, for example maybe the current assignment agrees with A starred Xi but 99 00:06:52,924 --> 00:06:55,784 disagrees with the reference assignment A star on XJ. 100 00:06:55,784 --> 00:07:00,070 Now of course the algorithm has no idea what A star is, this arbitrary satisfying 101 00:07:00,070 --> 00:07:03,975 assignment, so it has no idea which two of the variables XI or XJ it should 102 00:07:03,975 --> 00:07:06,372 flipped, it just picked one of the two at. 103 00:07:06,372 --> 00:07:10,494 Random. Now if we are lucky and we flip the value 104 00:07:10,494 --> 00:07:16,849 of the variable on which a sub t at a* disagree and this is just like case one. 105 00:07:16,849 --> 00:07:23,398 The flip causes the 2 assignments to agree on one more variable than they did 106 00:07:23,398 --> 00:07:27,842 previously as a consequence X sub t+1 is going to. 107 00:07:27,842 --> 00:07:32,189 Going to be one more than x of t. In the unlucky case we wind up flipping 108 00:07:32,189 --> 00:07:37,349 the variable on which we agreed already with the reference assignment a star. 109 00:07:37,349 --> 00:07:42,386 Then once we flip it we're going to disagree on yet another variable with the 110 00:07:42,386 --> 00:07:46,708 reference assignment a star. So in that case x of t+1 is actually 111 00:07:46,708 --> 00:07:48,624 going to be one less. Then x of t. 112 00:07:48,624 --> 00:07:52,779 And now at this point I hope things are starting to look a little familiar 113 00:07:52,779 --> 00:07:57,149 relative to our previous video about random [INAUDIBLE] in the non-negative 114 00:07:57,149 --> 00:07:59,919 integers. There we again had a random variable, 115 00:07:59,919 --> 00:08:04,275 namely the position of the walk and at each time step it either went up by 1 or 116 00:08:04,275 --> 00:08:06,896 down by 1. And with 50/50 probability each. 117 00:08:06,896 --> 00:08:11,391 So the situation here with the x sub ts is clearly not identical to random locks 118 00:08:11,391 --> 00:08:15,962 on non-negative integers, so the next quiz asks you to identify the differences 119 00:08:15,962 --> 00:08:20,558 between the random process governing the x sub ts and your vanilla random lock on 120 00:08:20,558 --> 00:08:24,162 the non-negative integers as studied in the previous video. 121 00:08:24,162 --> 00:08:26,609 Video. So the answer is D. 122 00:08:26,609 --> 00:08:34,178 If we think about 2 random processes, the 1st one is your straight forward random 123 00:08:34,178 --> 00:08:40,017 walk in non-negative integers as discussed in the last video. 124 00:08:40,017 --> 00:08:47,422 The 2nd random process is the evolution of these random variables, the X sub. 125 00:08:47,422 --> 00:08:49,721 T's. The number of variables on which the 126 00:08:49,721 --> 00:08:53,728 current assignment of Papnemeter's algorithm agrees with the reference 127 00:08:53,728 --> 00:08:57,297 satisfying assignment a star, there are exactly 3 differences. 128 00:08:57,297 --> 00:09:00,941 One we already pointed out. In the vanilla random walk in the line 129 00:09:00,941 --> 00:09:05,392 ex, lets your position 0, if you're any bigger position its always 50-50 left or 130 00:09:05,392 --> 00:09:07,763 right. In Papnemeters algorithm you might 131 00:09:07,763 --> 00:09:12,214 sometimes have a 0% probability of moving to the left, a 100% chance of moving to 132 00:09:12,214 --> 00:09:15,051 the right. That's when the algorithm zooms in on an 133 00:09:15,051 --> 00:09:19,826 unsatisfied clause in which the current assignment a-sub t happens to differ with 134 00:09:19,826 --> 00:09:23,117 the reference assignment a-star on both of the two variables. 135 00:09:23,117 --> 00:09:27,262 Then it doesn't matter which of the 2 variables you pick, you're guaranteed to 136 00:09:27,262 --> 00:09:31,357 agree with the reference assignment a-star on one more variable than before. 137 00:09:31,357 --> 00:09:35,502 The second difference is that when we studied random walks, by definition, we 138 00:09:35,502 --> 00:09:39,672 always started at position 0. That would correspond to this first 139 00:09:39,672 --> 00:09:42,347 random variable x naught being equal to 0. 140 00:09:42,347 --> 00:09:45,612 However, in general x naught will not be equal to 0. 141 00:09:45,612 --> 00:09:48,357 In general it'll be strictly bigger than 0. 142 00:09:48,357 --> 00:09:53,195 Why is that? Well, remember x naught is defined as The number of variables that 143 00:09:53,195 --> 00:09:56,997 have the same value in the reference assignment A* and the initial random 144 00:09:56,997 --> 00:09:59,741 assignment A0. So, the only way that X0 is going to be 0 145 00:09:59,741 --> 00:10:03,828 is that you randomly chose an assignment where every variable had the opposite 146 00:10:03,828 --> 00:10:06,832 value as what it is A* and that's very unlikely to happen. 147 00:10:06,832 --> 00:10:10,748 Generally, you start with the random assignment, you're going to get lucky and 148 00:10:10,748 --> 00:10:14,472 some of the values will be correct, will be the same as in A*, so X0 will 149 00:10:14,472 --> 00:10:19,182 generally start bigger Than zero. Here's the third and final difference. 150 00:10:19,182 --> 00:10:24,682 In the random walks that we studied, the process by definition, stopped, exactly 151 00:10:24,682 --> 00:10:29,782 the first time you reach position N. Now Parvati Metri's algorithm, what is 152 00:10:29,782 --> 00:10:33,626 true, is if you ever get to Points that X of T equals N. 153 00:10:33,626 --> 00:10:37,213 Then at that point the algorithm will terminate. 154 00:10:37,213 --> 00:10:42,904 Why? Well, if X of T equals N, that means in the assignment A sub T every single 155 00:10:42,904 --> 00:10:48,342 variable has exactly the same value as in the reference assignment A star. 156 00:10:48,342 --> 00:10:52,327 Now, A stars are satisfying assignment and Papadimitriou's algorithm halts as 157 00:10:52,327 --> 00:10:56,397 soon as it finds a satisfying assignment. So, if x of t=n, you've found a star, 158 00:10:56,397 --> 00:10:59,657 you're good to go, you stop. However, the converse is not true. 159 00:10:59,657 --> 00:11:03,637 Papadimitriou's algorithm might halt even though the current value of x of t is 160 00:11:03,637 --> 00:11:04,954 less than. With an n. 161 00:11:04,954 --> 00:11:09,692 Why is that ? or remember X of t measures your distance from this just 1 satisfying 162 00:11:09,692 --> 00:11:12,591 assignment a*. Nobody said that was the only one. 163 00:11:12,591 --> 00:11:16,603 There might be other satisfying assignments out there in the world as 164 00:11:16,603 --> 00:11:21,305 well and Papadimitriou's algorithm might well stumble up on one of them before it 165 00:11:21,305 --> 00:11:24,101 finds a*. So it holds the first time it finds the 166 00:11:24,101 --> 00:11:28,205 satisfying assignment and that can happen before X of t equal N. 167 00:11:28,205 --> 00:11:32,584 So where does this leave us? On the one hand it does seem to be a strong metaphor 168 00:11:32,584 --> 00:11:36,824 between the distance between the assignment in Papademetris' algorithm and 169 00:11:36,824 --> 00:11:40,834 this reference assignment A star and the position of a random walk on the 170 00:11:40,834 --> 00:11:44,392 non-negative integers. On the other hand, there's kind of a lot 171 00:11:44,392 --> 00:11:47,352 of differences. Some of them are pretty subtle. 172 00:11:47,352 --> 00:11:52,011 So what you gotta be wondering is was that whole analysis last video useful. 173 00:11:52,011 --> 00:11:57,010 Well, what's amazing and what allows us to transfer the random walk analysis over 174 00:11:57,010 --> 00:12:01,619 to that of Apototometri's algorithm is that every single one of these three 175 00:12:01,619 --> 00:12:04,782 discrepancies between the two random processes. 176 00:12:04,782 --> 00:12:08,132 Only helps us, it only means the algorithm is going to terminate even 177 00:12:08,132 --> 00:12:11,512 faster than we might have suspected from the random walk analysis. 178 00:12:11,512 --> 00:12:15,432 So here's another way to think about it. Imagine I force you to run Padamitris 179 00:12:15,432 --> 00:12:18,682 algorithm under an extremely unfortunate set of circumstances. 180 00:12:18,682 --> 00:12:22,792 So first I give you a 2-SAT instance with just 1 satisfying assignment, so there's 181 00:12:22,792 --> 00:12:26,387 a unique satisfying assignment A star, you're looking for a needle in a 182 00:12:26,387 --> 00:12:29,026 haystack. There's no way you can get lucky by 183 00:12:29,026 --> 00:12:31,952 halting early with a different satisfying assignment. 184 00:12:31,952 --> 00:12:36,105 Secondly, I force you to begin from the worst possible initial assignment where 185 00:12:36,105 --> 00:12:40,424 every variable's value initially is the wrong one, is flipped from what it should 186 00:12:40,424 --> 00:12:43,510 be in a star. And finally imagine I further contrive 187 00:12:43,510 --> 00:12:47,861 things so that whenever you pick unsatisfied clause to do a variable flip 188 00:12:47,861 --> 00:12:52,428 I force you to pick a clause where one of the variables is set the same as an a* 189 00:12:52,428 --> 00:12:55,343 and the other variable is set the opposite of a*. 190 00:12:55,343 --> 00:12:59,842 Well for this kind of execution of [UNKNOWN] algorithm, the correspondence 191 00:12:59,842 --> 00:13:04,334 between the evolution, the random variables, the x of t's and a random walk 192 00:13:04,334 --> 00:13:10,154 on the non negative integers Is perfect. The two random processes are identical 193 00:13:10,154 --> 00:13:13,647 under this worse case set of circumstances. 194 00:13:13,647 --> 00:13:18,906 Therefore in the worst case circumstances, the probability of the 195 00:13:18,906 --> 00:13:25,219 Papadimitriou algorithm fails to find a satisfying assignment, that is a star, 196 00:13:25,219 --> 00:13:31,087 within 2n^2 steps is exactly the same as the probability that a random walk 197 00:13:31,087 --> 00:13:35,688 beginning at 0 Fails to reach position n within 2n^2 steps and we analyse that 198 00:13:35,688 --> 00:13:39,659 failure probability in the previous video, we proved its of most one half, 199 00:13:39,659 --> 00:13:42,486 that is the success probability is at least one half. 200 00:13:42,486 --> 00:13:46,059 Now if you look at any other circumstances other than the worst case 201 00:13:46,059 --> 00:13:49,499 circumstances power meters algorithm is most likely to succeed. 202 00:13:49,499 --> 00:13:53,412 So first of all it doesn't start as far away from a satisfying assignment. 203 00:13:53,412 --> 00:13:57,474 Second of all this other satisfying assignments you might be able to find and 204 00:13:57,474 --> 00:14:01,930 thirdly sometimes its guaranteed Need to make progress, as a point to have, as 205 00:14:01,930 --> 00:14:06,249 opposed to only having a 50/50 chance. So the success probability under any 206 00:14:06,249 --> 00:14:10,946 circumstances upon nature's algorithm is at least 50% in a given iteration of this 207 00:14:10,946 --> 00:14:13,762 outer form. And now we're home free. 208 00:14:13,762 --> 00:14:20,351 The probability that a given iteration of the outer for loop fails is at most 1/2, 209 00:14:20,351 --> 00:14:26,915 the probability that each of the log base 2N iterations of the outer for loop fail 210 00:14:26,915 --> 00:14:33,094 is then at most 1/2^Log base 2 of N, also known as 1/N, so the overall success 211 00:14:33,094 --> 00:14:37,748 probability of the algorithm is at least 1-1/N as claimed.