1 00:00:00,012 --> 00:00:05,201 Understanding why Papadimitrious's randomized local search algorithm for two 2 00:00:05,201 --> 00:00:10,319 set works requires understanding some basic properties of random walks on the 3 00:00:10,319 --> 00:00:13,927 non-negative integers. It's a very fun classic topic. 4 00:00:13,927 --> 00:00:17,432 In this video I'll provide the relevant background. 5 00:00:17,432 --> 00:00:22,430 In this video, we're going to focus on a simple randomized process. 6 00:00:22,430 --> 00:00:28,150 We're not going to discuss in this video how this connects to Papadimitriou's 7 00:00:28,150 --> 00:00:32,201 algorithm. But in the next video, we will, trust me. 8 00:00:32,201 --> 00:00:36,448 The setup is as follows. Perhaps you just took a super difficult 9 00:00:36,448 --> 00:00:39,288 exam. Maybe it was in algorithm class, for 10 00:00:39,288 --> 00:00:42,276 example. You stagger out of the exam room and 11 00:00:42,276 --> 00:00:46,820 you're just in a complete daze. You have no idea where you are or were 12 00:00:46,820 --> 00:00:49,240 your going. You just stumble along. 13 00:00:49,240 --> 00:00:53,882 At each time step, you take a step to the left with 50% probability. 14 00:00:53,882 --> 00:00:57,905 Where a step to the right with 50% probability, you have no idea where 15 00:00:57,905 --> 00:01:01,060 you're going. We measure your position in terms of the 16 00:01:01,060 --> 00:01:05,542 number of steps you've taken from a dead end at the end of the street, which we 17 00:01:05,542 --> 00:01:08,786 call position 0. So by randomly walking left or right at 18 00:01:08,786 --> 00:01:12,820 each time step, your position either goes up by 1 or it goes down by one. 19 00:01:12,820 --> 00:01:17,548 Each has a 50 50 chance of happening. So one exception is if you found your way 20 00:01:17,548 --> 00:01:20,821 to position 0. Again that's a dead end, there's no way 21 00:01:20,821 --> 00:01:23,843 to go left. So with 100% probability, you stagger 22 00:01:23,843 --> 00:01:27,141 back to the right. You go back to position one at the next 23 00:01:27,141 --> 00:01:30,230 time step. We assume that your exam room is located 24 00:01:30,230 --> 00:01:34,432 at this position 0 at the dead end. In other words, at time 0 you're at 25 00:01:34,432 --> 00:01:37,244 position 0. There's a lot of interesting questions 26 00:01:37,244 --> 00:01:40,273 you can ask about random walks on the non-negative integers. 27 00:01:40,273 --> 00:01:42,496 Here's the one we're going to be interested in. 28 00:01:42,496 --> 00:01:46,294 Imagine your dorm room is n steps away from the exam room, that is, your dorm 29 00:01:46,294 --> 00:01:50,354 room is at position [INAUDIBLE] n. Now if you had your wits about you, you 30 00:01:50,354 --> 00:01:53,648 could go from your exam room to your dorm room in n steps. 31 00:01:53,648 --> 00:01:57,849 You just zip right down the line. But you don't have your wits about you. 32 00:01:57,849 --> 00:02:00,826 You're in this post-exam daze, staggering around. 33 00:02:00,826 --> 00:02:05,411 So doing your random walk, on average how many steps does it take you before you 34 00:02:05,411 --> 00:02:09,136 make it home? To position N. To illustrate a sample trajectory, 35 00:02:09,136 --> 00:02:12,201 imagine that your dorm room was only three steps away. 36 00:02:12,201 --> 00:02:15,267 That's spot 3. So in time 1, we know that for sure, you 37 00:02:15,267 --> 00:02:19,201 move from position 0 to position 1. And now maybe you get lucky and you 38 00:02:19,201 --> 00:02:23,458 stumble a step to the right, in the next time step, so you're at position 2. 39 00:02:23,458 --> 00:02:27,615 You're only one step away from home. But now, unfortunately, you stagger 40 00:02:27,615 --> 00:02:31,824 backward, back to position one. Maybe now you lurch forward back to 41 00:02:31,824 --> 00:02:36,700 position two again and this time your momentum carries you forward on home to 42 00:02:36,700 --> 00:02:39,968 position 3. That was a trajectory in which it took 43 00:02:39,968 --> 00:02:43,530 you 5 steps to get home to your dorm room at position three. 44 00:02:43,530 --> 00:02:48,962 You can imagine a trajectory which is faster if you just zipped straight there. 45 00:02:48,962 --> 00:02:53,508 There, you can certainly also imagine trajectories which are slower if you 46 00:02:53,508 --> 00:02:58,263 stumbled back left a couple times more before making it all the way to the right 47 00:02:58,263 --> 00:03:01,863 to position three. So let's state this statistic precisely 48 00:03:01,863 --> 00:03:06,122 and check in with your intuition about how it grows as a function of n. 49 00:03:06,122 --> 00:03:11,495 So, for a non-negative integer, n, I'm going to use T sub n to denote the number 50 00:03:11,495 --> 00:03:17,317 of steps this random walk takes before it reaches, for the first time, position n. 51 00:03:17,317 --> 00:03:22,823 So, T sub n is a random variable in the sense that it's value depends on exactly 52 00:03:22,823 --> 00:03:27,177 which steps you took on the results of your random coin flips. 53 00:03:27,177 --> 00:03:32,579 On the other hand, once you unveil the answers to all of the random coin flips 54 00:03:32,579 --> 00:03:37,392 whether you go left Left to right, at each time step you can compute T, just 55 00:03:37,392 --> 00:03:41,743 like we did in the sample trajectory on the last slide.So the non-trivial 56 00:03:41,743 --> 00:03:46,392 question I want you to think about on this quiz is, how does the expected value 57 00:03:46,392 --> 00:03:51,367 of T saban/g grow as a function of the position And that you're waiting to 58 00:03:51,367 --> 00:03:54,947 reach. I'm not expecting to devise a proof to 59 00:03:54,947 --> 00:04:00,927 the asnswer, indeed that's what we're going to be doing for most of the rest of 60 00:04:00,927 --> 00:04:05,097 this video. Just asking you to take your best guess. 61 00:04:05,097 --> 00:04:08,492 And so the correct answer is B, theta n^2. 62 00:04:08,492 --> 00:04:12,267 Depending on how much facility you have with discrete probability, you may or may 63 00:04:12,267 --> 00:04:15,267 not have been able to do a back-of-the-envelope calculation that 64 00:04:15,267 --> 00:04:18,667 would suggest this should be the answer. So for example, if you take some 65 00:04:18,667 --> 00:04:22,217 Bernoulli random variables and do some calculations with their standard 66 00:04:22,217 --> 00:04:24,892 deviation, you might expect the answer to be theta(n^2). 67 00:04:24,892 --> 00:04:28,217 In fact what's even cooler and what we're going to prove is it's not just 68 00:04:28,217 --> 00:04:32,687 theta(n^2), the expected value of The variant random variable T sub N is 69 00:04:32,687 --> 00:04:36,212 exactly N squared for every non-negative integer N. 70 00:04:36,212 --> 00:04:41,487 For the most part in these courses, I've endeavored to give you proofs which on 71 00:04:41,487 --> 00:04:46,512 the one hand aren't too long, I know you're a busy person and I don't want to 72 00:04:46,512 --> 00:04:50,837 waste your time, but also on the other hand teach you something. 73 00:04:50,837 --> 00:04:55,737 So I hope you'll forgive me if it's not clear what this proof teaches you. 74 00:04:55,737 --> 00:05:01,169 This is just going to be the slicked-up [INAUDIBLE] proof from the book that the 75 00:05:01,169 --> 00:05:06,145 expected value of T sub n, number of steps needed to reach the position n on a 76 00:05:06,145 --> 00:05:10,875 random walk is exactly m squared. The really nice idea behind this slick 77 00:05:10,875 --> 00:05:16,239 proof is to solve a more general problem. So we're going to calculate the expected 78 00:05:16,239 --> 00:05:20,971 number of steps, not just to get from position 0 to position n on a random 79 00:05:20,971 --> 00:05:23,317 walk. But more generally from every 80 00:05:23,317 --> 00:05:26,655 intermediate position I to the position N on a random walk. 81 00:05:26,655 --> 00:05:30,936 The reason this is a good idea, this gives us N+1 different statistics we're 82 00:05:30,936 --> 00:05:34,212 trying to calculate. And it's easy to relate the values of 83 00:05:34,212 --> 00:05:37,140 these differents things we're trying to calculate. 84 00:05:37,140 --> 00:05:41,712 From these relationships we'll be able to solve for all of them simultaneously. 85 00:05:41,712 --> 00:05:47,509 So, let's have some notation to make this precise for a non-negative integer i, by 86 00:05:47,509 --> 00:05:52,932 Z sub i, I mean the number of random walk steps that you take if I start you at i 87 00:05:52,932 --> 00:05:56,184 until you reach position n for the first time. 88 00:05:56,184 --> 00:06:01,137 Notice by the definitions, Z sub zero is exactly the same thing as t sub n. 89 00:06:01,137 --> 00:06:05,017 The number of steps you need to reach n starting from 0. 90 00:06:05,017 --> 00:06:10,628 So, are there any Zi's whose expectations are easy to compute? Well, sure, if we 91 00:06:10,628 --> 00:06:14,721 took i=n, what is z sub n. That's the, number of steps, on average, 92 00:06:14,721 --> 00:06:17,550 you need to get from n to n, for the first time. 93 00:06:17,550 --> 00:06:21,141 Well that's going to be zero. For the other values of i, we're not 94 00:06:21,141 --> 00:06:25,109 going to solve for the expected value of z sub i explicitly just yet. 95 00:06:25,109 --> 00:06:29,362 Rather, we're going to relate the expectations of different zi's to each 96 00:06:29,362 --> 00:06:32,544 other. To see how this works lets start with i 97 00:06:32,544 --> 00:06:37,052 equals zero that is Z0 the random variable that we really care about. 98 00:06:37,052 --> 00:06:42,149 Now we don't know what the expected value of Z0 is that's is we're trying to 99 00:06:42,149 --> 00:06:47,348 compute but we do know that every walk starting at 0 and you get n begins with 100 00:06:47,348 --> 00:06:50,522 the step from 0 to 1 and then is followed by. 101 00:06:50,522 --> 00:06:54,491 A walk from 1 to N. Therefore, the expected length of one of 102 00:06:54,491 --> 00:06:59,509 these walks is just 1, that's for that first hop to get from 0 to 1 plus the 103 00:06:59,509 --> 00:07:03,120 expected value of a random walk from position 1 to N. 104 00:07:03,120 --> 00:07:07,482 That's a quantity also known as the expected value of Z sub 1. 105 00:07:07,482 --> 00:07:12,407 So that's kind of exciting, how we were able to avoid explicitly computing the 106 00:07:12,407 --> 00:07:16,772 expected value of z sub 0, instead relating it to the expectation of a 107 00:07:16,772 --> 00:07:21,232 different random variable z sub 1. But if you think about it we can play 108 00:07:21,232 --> 00:07:25,337 exactly the same game with the intermediate values of pi as well. 109 00:07:25,337 --> 00:07:30,362 We can relate the expected value of z sub i to the expected values of z sub i-1 and 110 00:07:30,362 --> 00:07:33,663 z sub i+1. To make this fully precise, I'm going to 111 00:07:33,663 --> 00:07:36,923 bust out the definition of conditional expectation. 112 00:07:36,923 --> 00:07:41,342 If you want to review it, you can go back to part 1, we need a conditional 113 00:07:41,342 --> 00:07:44,921 probability to analyze, [UNKNOWN] contraction algorithm or you 114 00:07:44,921 --> 00:07:47,243 can just look it up on Wikipedia or whatever. 115 00:07:47,243 --> 00:07:51,202 If you are not feeling in the mood to recall what conditional expectation is, 116 00:07:51,202 --> 00:07:54,669 actually think the computations them about to do or sufficiently intuitive, 117 00:07:54,669 --> 00:07:58,962 you will find them eminently plausible even without the former justification. 118 00:07:58,962 --> 00:08:03,508 So what's the expected value of Z sub i, the average number of steps you need to 119 00:08:03,508 --> 00:08:06,763 hit n for the first time if I start you out at position i. 120 00:08:06,763 --> 00:08:10,416 Well, let's just condition on what happened in the first step. 121 00:08:10,416 --> 00:08:14,482 There are only 2 possibilities. 50% probability you go left to i-1. 122 00:08:14,482 --> 00:08:18,674 The rest of what happens is a random walk from i-1 and you just count the steps 123 00:08:18,674 --> 00:08:22,807 till you get the end for the first time. The other 50% of the time where you go 124 00:08:22,807 --> 00:08:27,117 right and the rest of the process is just a random walk starting from i+1 and you 125 00:08:27,117 --> 00:08:30,097 count the steps till you get the end for the first time. 126 00:08:30,097 --> 00:08:33,894 If you want to be totally formal about it, you would just expand that the 127 00:08:33,894 --> 00:08:37,952 expected value of z sub i conditioning on the first step, so you'll have the 128 00:08:37,952 --> 00:08:42,435 probability when you go left times the Conditional expectation of Z sub I given 129 00:08:42,435 --> 00:08:46,605 that you go left in the first step and a similar term for when you go right. 130 00:08:46,605 --> 00:08:50,725 As we discussed, we know the probability that you go left is exactly 1/2. 131 00:08:50,725 --> 00:08:53,517 The probability that you go right is exactly 1/2. 132 00:08:53,517 --> 00:08:57,835 The conditional expected value of your random walk, given that you went left 133 00:08:57,835 --> 00:09:01,546 first, well that's just 1. That's the step you took to get to I-1, 134 00:09:01,546 --> 00:09:05,138 plus the expected length of a random walk to N from I-1. 135 00:09:05,138 --> 00:09:10,290 Similarly, if you're lucky enough to go right, then the expected random walk is 136 00:09:10,290 --> 00:09:15,301 just 1, that's for the step to go right, plus the expected value of the rest of 137 00:09:15,301 --> 00:09:19,492 the random walk, also known as the expected value of Z sub (i+1). 138 00:09:19,492 --> 00:09:24,165 Once the dust clears, we discover that the expected value of Z sub i, the 139 00:09:24,165 --> 00:09:27,612 average number of steps from position i, is just 1. 140 00:09:27,612 --> 00:09:32,655 Plus the average of the expectations of Z sub I minus 1, and Z sub I plus 1. 141 00:09:32,655 --> 00:09:37,675 If we multiply both sides of this equality by 2 and rearrange, we get that 142 00:09:37,675 --> 00:09:42,643 the difference between the expected values of Z sub I and Z sub I plus 1 is 143 00:09:42,643 --> 00:09:47,638 equal to 2 more than the difference between the expected values of Z sub I 144 00:09:47,638 --> 00:09:51,457 minus 1 and Z sub I. So what's going to interpretation of this 145 00:09:51,457 --> 00:09:55,977 equation? Well first let's just notice that the expected value of z sub i, of 146 00:09:55,977 --> 00:10:00,072 that number is definitely decreasing as you take i bigger and bigger. 147 00:10:00,072 --> 00:10:04,607 As you start closer and closer to your destination n, certainly the number of 148 00:10:04,607 --> 00:10:08,257 steps you need is going down. So without reason both sides of this 149 00:10:08,257 --> 00:10:12,079 equation are going to be positive. Write the expected value of Z sub i is 150 00:10:12,079 --> 00:10:16,009 going to be bigger than z sub i+1. You don't need as many steps if you start 151 00:10:16,009 --> 00:10:19,773 further to the right at i+1. This equation is saying more, its saying, 152 00:10:19,773 --> 00:10:22,956 consider a consecutive pair of conceivable starting point. 153 00:10:22,956 --> 00:10:26,305 Clearly you have an advantage by starting one position closer. 154 00:10:26,305 --> 00:10:30,657 This equation is saying that as you slide this pair of consecutive starting points 155 00:10:30,657 --> 00:10:34,369 closer to the destination the advantage gets amplified. 156 00:10:34,369 --> 00:10:40,081 So, so however much savings you got by starting 1 position closer at 1 instead 157 00:10:40,081 --> 00:10:45,345 of I-1, you get that same advantage plus two more if you get to start at I+1 158 00:10:45,345 --> 00:10:49,540 relative to starting at I. So why was it useful to relate all of 159 00:10:49,540 --> 00:10:54,563 these various expectations to each other? Well now that we have this big systems of 160 00:10:54,563 --> 00:10:59,511 constraints, we can solve simultaneously for what all of these expectations have 161 00:10:59,511 --> 00:11:02,441 to be. In particular, one expectation we really 162 00:11:02,441 --> 00:11:06,092 care about, e of z naught. So to see how we do this, let's just 163 00:11:06,092 --> 00:11:10,402 write down what the difference is between success and expectations have to be. 164 00:11:10,402 --> 00:11:14,572 So the first thing we know and this is one of our edge cases, is that whatever 165 00:11:14,572 --> 00:11:18,742 the expected value of Z not is it has to be one more than the expected value of 166 00:11:18,742 --> 00:11:21,092 Z1. That is the difference between these two 167 00:11:21,092 --> 00:11:24,517 expectations equals 1. We concluded the previous slide by noting 168 00:11:24,517 --> 00:11:29,091 that if you slide a pair of consecutive positions, one position further toward 169 00:11:29,091 --> 00:11:33,882 the destination n, then the advantage you get by starting one position closer goes 170 00:11:33,882 --> 00:11:36,920 up by two. So, if your advantage is one, by starting 171 00:11:36,920 --> 00:11:41,405 at z1 instead of z9 Then your advantage is going to be three by starting at Z two 172 00:11:41,405 --> 00:11:44,421 instead of Z one. That is, the expected value of Z one 173 00:11:44,421 --> 00:11:48,541 minus the expected number of steps from two is going to be equal to three. 174 00:11:48,541 --> 00:11:51,631 So we can just apply that equation over and over again. 175 00:11:51,631 --> 00:11:56,036 We bump up the indices by one, and the difference between the expectation gets 176 00:11:56,036 --> 00:12:00,015 bumped up by two. So at the end of the day, relative to our 177 00:12:00,015 --> 00:12:05,600 very first equation we've bumped up both indexes by n-1, so we've bumped up the 178 00:12:05,600 --> 00:12:10,707 right hand side by 2n-2, so we have that the expected value of zn-1 minus the 179 00:12:10,707 --> 00:12:15,101 expected value of zn=2n-1. Massive cancellation is often the sign 180 00:12:15,101 --> 00:12:18,282 you're on the right track so that's what we're going to see here. 181 00:12:18,282 --> 00:12:22,210 So all of the expected values for z1 throught zn-1 appears once positively, 182 00:12:22,210 --> 00:12:24,644 once negatively so they all are going to drop out. 183 00:12:24,644 --> 00:12:28,442 And actually gets even better than that, the expected value of z sub n. 184 00:12:28,442 --> 00:12:31,339 That's just 0. Remember that would be how longer random 185 00:12:31,339 --> 00:12:33,969 walk takes to get from end to end, also known as 0. 186 00:12:33,969 --> 00:12:38,074 So if we add up all these equations we magically get exactly what he heard about 187 00:12:38,074 --> 00:12:41,887 in left hand side the expected value of z0 of a walks turning at 0 and the n 188 00:12:41,887 --> 00:12:45,702 that's also known as T sub n, the expectation we're trying to analyse. 189 00:12:45,702 --> 00:12:50,455 So what is the right hand side sum 1 easy way to think about it is just to group 190 00:12:50,455 --> 00:12:54,926 the first row with the last row the second row with the second to the last 191 00:12:54,926 --> 00:12:58,105 row and so on. For example of n is equal to 100 we're 192 00:12:58,105 --> 00:13:01,742 going to group 1 with 199, 3 with 197, 5 with 195, and so on. 193 00:13:01,742 --> 00:13:06,392 So each of these groupings is going to gives us 2N, and if N is even there's 194 00:13:06,392 --> 00:13:09,752 going to be N/2 of these pairs, giving us a total of N^2. 195 00:13:09,752 --> 00:13:14,452 If N is odd, the story is the same you just get N-1/2 groups of size 2N, plus 196 00:13:14,452 --> 00:13:17,317 the median group is going to have the value of N. 197 00:13:17,317 --> 00:13:21,306 Again, you get N^2. So that completes this very pretty if 198 00:13:21,306 --> 00:13:26,766 some what inscrutable proof that the expected number of steps you need before 199 00:13:26,766 --> 00:13:30,384 you reach n for the first time is exactly n squared. 200 00:13:30,384 --> 00:13:35,819 In the analysis of Papadimitriou's algorithm we're not actually going to use 201 00:13:35,819 --> 00:13:41,200 this claim about the expectation rather we're going to use an easy corollary 202 00:13:41,200 --> 00:13:45,012 which gives an upper bound of the probablility that. 203 00:13:45,012 --> 00:13:48,394 This exceeds its expectation by more than a factor of 2. 204 00:13:48,394 --> 00:13:53,166 Specifically, we're going to use the fact that the probability that a random walk 205 00:13:53,166 --> 00:13:58,038 needs strictly more than 2*n^2 steps tor each position n for the first time is no 206 00:13:58,038 --> 00:14:01,177 more than 50%. This is a special case of a simple but 207 00:14:01,177 --> 00:14:04,192 useful inequality known as Markov's Inequality. 208 00:14:04,192 --> 00:14:08,897 In proof let's denote by p the probability of interest, the probability 209 00:14:08,897 --> 00:14:12,603 that the random variable Tn is strictly bigger than 2n^. 210 00:14:12,603 --> 00:14:17,771 Essentially the reason the corollary is true is that if the probability that Tn 211 00:14:17,771 --> 00:14:23,110 was bigger than 2n^ was strictly bigger than 1/2, well then the expectation would 212 00:14:23,110 --> 00:14:26,649 have to be bigger than n squared, but it's not it's eqauls to n squared, we 213 00:14:26,649 --> 00:14:29,972 just computed it. So a little more formally, let's start 214 00:14:29,972 --> 00:14:34,107 from the expected value of t sub n. Now we know this is n^2, we just computed 215 00:14:34,107 --> 00:14:37,817 it, but let's also write out the definition of this expected value. 216 00:14:37,817 --> 00:14:42,252 So that's just a sum over the values that tn might take on, let's go ahead and just 217 00:14:42,252 --> 00:14:46,687 use all, all negative integers k from 0 to infinity, and then it's k times the 218 00:14:46,687 --> 00:14:49,412 probability that t sub n takes on the value k. 219 00:14:49,412 --> 00:14:51,815 I'm going to break this sum into two parts. 220 00:14:51,815 --> 00:14:56,514 The parts where k takes on a value that's at most 2n^2, and then the parts of the 221 00:14:56,514 --> 00:14:59,231 sum where k takes on a value bigger than 2n^2. 222 00:14:59,231 --> 00:15:03,657 So now let me just do some very crude lower bounds of this right-hand side. 223 00:15:03,657 --> 00:15:07,742 This first sum, well, whatever it is, it's going to be non-negative. 224 00:15:07,742 --> 00:15:11,851 Because ks are non negative and probabilities are non negative. 225 00:15:11,851 --> 00:15:16,795 And the second sum, again I'm feeling very generous, let's just lower bound k 226 00:15:16,795 --> 00:15:20,482 in the sum by 2n^2, it's always going to be bigger than that. 227 00:15:20,482 --> 00:15:25,209 After the dust clears the first sum has disappeared that's what we're just 228 00:15:25,209 --> 00:15:29,239 lower-bounding by 0. In the second sum we're lowering K by 2n 229 00:15:29,239 --> 00:15:33,114 squared at that point. We can take the 2n squared outside the 230 00:15:33,114 --> 00:15:38,129 sum and at that point the sum is merely the total probability masked on which Tn 231 00:15:38,129 --> 00:15:41,052 takes on a value strictly bigger than 2n^2+1. 232 00:15:41,052 --> 00:15:44,355 By definition we are calling this probability of P. 233 00:15:44,355 --> 00:15:49,324 This is the probability of interest. And rearranging we do see that, in fact, 234 00:15:49,324 --> 00:15:53,725 P has to be at most one-half [INAUDIBLE] So that completes the claim of the 235 00:15:53,725 --> 00:15:56,458 corollary. It's really just a simple consequence of 236 00:15:56,458 --> 00:16:00,481 the hard work which was proving that the expected value of this random log was 237 00:16:00,481 --> 00:16:02,562 n^2. We'll plug this corollary into the 238 00:16:02,562 --> 00:16:04,898 analyisis of Papademetriou's algorithm next.