1 00:00:03,580 --> 00:00:09,203 So this brings us up to what we talked about right at the end of class last time 2 00:00:09,203 --> 00:00:14,965 which was looking at an example where you let's say have a sequentially consistent 3 00:00:14,965 --> 00:00:18,217 processor. And, we take a piece of code, that we had 4 00:00:18,217 --> 00:00:21,730 talked about at the beginning of class last time, and ask ourselves. 5 00:00:21,730 --> 00:00:25,610 Does this still work. And what we're going to do is instead of 6 00:00:25,610 --> 00:00:30,543 having one producer and one consumer, we're going to have one producer and two 7 00:00:30,543 --> 00:00:35,147 consumers, or multiple consumers. So as shown here, we have one producer. 8 00:00:35,147 --> 00:00:38,435 It has a register where it keeps the tail pointer. 9 00:00:38,435 --> 00:00:41,987 There's, there's two different pointers here in memory. 10 00:00:41,987 --> 00:00:45,341 The actual data storage of the queue, two consumers. 11 00:00:45,341 --> 00:00:49,287 They have their own register sets, 'because they're two different 12 00:00:49,287 --> 00:00:52,444 processors, or two different complete, threads. 13 00:00:52,444 --> 00:00:55,612 So they have registers which they cache effectively. 14 00:00:55,612 --> 00:01:00,297 It's not a cache, but they effectively have to load, load it into their register 15 00:01:00,297 --> 00:01:02,966 space. And then they actually have where they 16 00:01:02,966 --> 00:01:07,680 will load the actual value that they pop off the ends of the queue. 17 00:01:07,680 --> 00:01:11,600 So someone I think said this at the end of class here. 18 00:01:11,600 --> 00:01:17,190 But if you're going to be executing the producer, this piece of producer code, 19 00:01:17,190 --> 00:01:23,070 and you're going to execute this consumer code does anyone see a problem here of 20 00:01:23,070 --> 00:01:28,515 something that can happen that we probably don't want for a first in, first 21 00:01:28,515 --> 00:01:33,597 out queue with multiple readers. Yeah so let's, let's walk through that. 22 00:01:33,597 --> 00:01:38,796 Let's say two consumers, our execute, we, we interweave two 23 00:01:38,796 --> 00:01:45,729 consumers executing the same instruction, from one consumer thread and the other 24 00:01:45,729 --> 00:01:51,910 consumer thread every other cycle. So we have, the first consumer will read 25 00:01:51,910 --> 00:01:56,924 the head pointer into the head. The second consumer will read the head 26 00:01:56,924 --> 00:02:03,505 pointer into its own head register. This consumer here will read the tail 27 00:02:03,505 --> 00:02:08,707 pointer into, the tail register. The second consumer will do that same 28 00:02:08,707 --> 00:02:12,525 thing. So all of a sudden the two consumers read 29 00:02:12,525 --> 00:02:16,970 the same heads and the same tail. And this is sequentially consistent 30 00:02:16,970 --> 00:02:21,543 because we've not done run, done no reordering between within the thread. 31 00:02:21,543 --> 00:02:26,170 we've just done some valid interlead. Now they both check this. 32 00:02:26,170 --> 00:02:29,633 This is the, the case which is basically saying is there anything in the queue. 33 00:02:29,633 --> 00:02:32,520 Does the head equal the tail? If the head equals the tail there's 34 00:02:32,520 --> 00:02:35,184 nothing in the queue. If there's a distance between the head 35 00:02:35,184 --> 00:02:39,064 and the tail that means there's some entries in the queue So they both fall 36 00:02:39,064 --> 00:02:42,980 through an they say, well yeah. There's, there's stuff in the queue. 37 00:02:42,980 --> 00:02:45,874 The next thing they do is they go read from the head. 38 00:02:45,874 --> 00:02:48,332 This is the value that they are going to use. 39 00:02:48,332 --> 00:02:50,900 Well they both just read the same head pointer. 40 00:02:53,140 --> 00:02:58,289 Because we are going to execute from one thread and we're going to load that value 41 00:02:58,289 --> 00:03:02,216 into another thread. So all of a sudden we just read the same 42 00:03:02,216 --> 00:03:07,430 value twice and one is the one thread and one is the other thread but it was the 43 00:03:07,430 --> 00:03:12,322 same, same value that we get out of reading from the, the pointer that points 44 00:03:12,322 --> 00:03:15,300 to the, the, the head if you will. Now. 45 00:03:15,300 --> 00:03:20,105 In a perfect world, that would be the only bad thing that could happen in this 46 00:03:20,105 --> 00:03:24,542 case because we would incorrect, they'd both incorrect the headpointers, then, 47 00:03:24,542 --> 00:03:29,656 then they do stores here and effectively what would happen is they would write the 48 00:03:29,656 --> 00:03:34,831 same head location into the, the, or the same pointer if you will into, into the 49 00:03:34,831 --> 00:03:37,610 head. That's not so bad. 50 00:03:37,610 --> 00:03:41,251 But effectively what we did is we enqueued one thing into the queue, and 51 00:03:41,251 --> 00:03:45,399 then we got two things out of the queue. This breaks our semantics of what a queue 52 00:03:45,399 --> 00:03:50,298 should actually do. Something that could happen that's much 53 00:03:50,298 --> 00:03:53,649 worse here. Is that one of the threads, let's say 54 00:03:53,649 --> 00:03:58,557 another valid interweaving, is the one thread just stalls for a long time in 55 00:03:58,557 --> 00:04:03,910 here. And, like a billion cycles, it just falls 56 00:04:03,910 --> 00:04:08,182 for a billion cycles. It takes really bad cachmus or it traps 57 00:04:08,182 --> 00:04:11,987 into the operating system, or something bad happens to it. 58 00:04:11,987 --> 00:04:17,160 What can happen is before the update at the headpointer happens, 59 00:04:17,160 --> 00:04:19,780 or what could, what could have happened here 60 00:04:19,780 --> 00:04:23,056 is you could have something really strange happen. 61 00:04:23,056 --> 00:04:27,970 Or, or, you could basically have the other thread execute, and it could pop a 62 00:04:27,970 --> 00:04:32,163 whole bunch of values off the head of the queue in the meantime. 63 00:04:32,163 --> 00:04:35,242 So let's say there was 100 things in the queue. 64 00:04:35,242 --> 00:04:38,840 We both execute into here at the same time. 65 00:04:38,840 --> 00:04:42,540 One of the threads get stalled here for a long period of time. 66 00:04:42,540 --> 00:04:45,345 The other thread pops 100 things off the queue. 67 00:04:45,345 --> 00:04:49,582 But then finally, this one here goes and updates the head pointer again. 68 00:04:49,582 --> 00:04:54,237 So what's going to happen, is effectively you are basically going to reset the 69 00:04:54,237 --> 00:04:56,922 queue and put 100 things back onto the queue. 70 00:04:56,922 --> 00:05:01,160 And, it's probably going to be garbage in memory, in the queue at that time. 71 00:05:03,180 --> 00:05:10,464 So this brings up the idea that we may not want to be concurrently executing 72 00:05:10,464 --> 00:05:14,974 even the sequentially, even the strict sequentially consistent model. 73 00:05:14,974 --> 00:05:20,213 This block of code and another threads piece of code which is executing the 74 00:05:20,213 --> 00:05:25,121 same, same thing, the, the two consumers. So we're going to call this a critical 75 00:05:25,121 --> 00:05:28,503 section. if you take an operating system's class 76 00:05:28,503 --> 00:05:33,477 you will seen this, seen this a bunch. And the idea here is that you want to 77 00:05:33,477 --> 00:05:36,860 atomically execute from all the consumers possible. 78 00:05:36,860 --> 00:05:40,560 This block of code here. So no two consumers could begin to 79 00:05:40,560 --> 00:05:43,320 execute that block of code at the same time. 80 00:05:44,700 --> 00:05:51,096 And one way to do that is you have a lock on that piece of code. 81 00:05:51,096 --> 00:05:58,224 So you grab a lock, you can go execute that piece of code, and then you release 82 00:05:58,224 --> 00:06:00,600 the lock. Questions so far? 83 00:06:03,360 --> 00:06:11,865 Okay, so a little bit more general notion of the same idea of a lock is called a 84 00:06:11,865 --> 00:06:16,543 semaphore. And if you've taken !!! class you've 85 00:06:16,543 --> 00:06:23,347 probably seen these strange nomenclatures of P and V semaphores. 86 00:06:23,347 --> 00:06:31,533 And if you want to go and look up where it actually comes from, it's Dutch. 87 00:06:31,533 --> 00:06:37,639 And P largely means attempt to acquire the semaphore. 88 00:06:37,639 --> 00:06:41,712 But, in, in, in Dutch it means I don't speak. 89 00:06:41,712 --> 00:06:45,243 Does anyone here speak Dutch? By chance? 90 00:06:45,243 --> 00:06:49,317 Anyone speak anything close to Dutch [LAUGH]? 91 00:06:49,317 --> 00:06:50,947 You do? Little bit? 92 00:06:50,947 --> 00:06:54,930 Okay, how do we, how do we, how do we say that? 93 00:06:54,930 --> 00:07:02,784 What, what is close to Dutch actually? German's close to Dutch? 94 00:07:02,784 --> 00:07:03,669 Yeah. Okay. 95 00:07:03,669 --> 00:07:08,094 I guess it's. so I don't, I don't speak Dutch or 96 00:07:08,094 --> 00:07:15,260 German, but it means try to reduce. In a semaphore you can have multiple, 97 00:07:15,260 --> 00:07:19,630 people trying to grab a lock at the same time and multiple people trying to 98 00:07:19,630 --> 00:07:22,217 caching to aquire that lock at the same time. 99 00:07:22,217 --> 00:07:24,805 So an example, or aquire for the same time. 100 00:07:24,805 --> 00:07:29,117 The example that we gave in the last class is that you have multiple users 101 00:07:29,117 --> 00:07:33,717 trying to use one I/O device, and let's say that there are two queues in the I/O 102 00:07:33,717 --> 00:07:36,989 device. You have 100 different processors or 103 00:07:36,989 --> 00:07:39,792 threads trying to use that one I/O device. 104 00:07:39,792 --> 00:07:45,197 You can actually have strictly only two acquire the I/O device, but not more than 105 00:07:45,197 --> 00:07:48,067 two. In a Mutex, you're only allowed to have 106 00:07:48,067 --> 00:07:53,406 one thing be able to acquire the critical section or acquire the lock at a time. 107 00:07:53,406 --> 00:07:57,276 And, hence, we have a little bit more general term for this. 108 00:07:57,276 --> 00:08:00,780 A P semaphore basically, decrements a counter. 109 00:08:00,780 --> 00:08:06,227 So we start off with a counter S, here. We put some number into it, 110 00:08:06,227 --> 00:08:12,178 and then we do P on that, it'll actually decrease the counter, and 111 00:08:12,178 --> 00:08:18,580 if the counter was greater than zero, we can go execute our Kregel section. 112 00:08:19,840 --> 00:08:23,689 Then we have the release for the V semaphore. 113 00:08:23,689 --> 00:08:30,191 And that will actually increment this counter by one and it'll wake up some 114 00:08:30,191 --> 00:08:35,410 other process if there is some other process waiting on that. 115 00:08:35,410 --> 00:08:42,340 if, if, if, I said dropped below, one it'll go wake up another process [COUGH]. 116 00:08:42,340 --> 00:08:46,222 And, one of the important things here is that these, both these P and these V 117 00:08:46,222 --> 00:08:48,725 semaphores must be atomic, relative to each other. 118 00:08:48,725 --> 00:08:52,658 So however you go, thinking about going to implement them, they must, they must 119 00:08:52,658 --> 00:08:56,131 be relative to each other. And this is, with respect to everything 120 00:08:56,131 --> 00:08:58,736 happening, everything else happening on your system. 121 00:08:58,736 --> 00:09:01,597 Otherwise, really, really bad things will start to happen. 122 00:09:01,597 --> 00:09:05,220 You'll actually, have multiple people grabbing the lock. 123 00:09:05,220 --> 00:09:10,595 And how you would typically use this is inside of your thread or your process. 124 00:09:10,595 --> 00:09:16,246 You're going to put your critical section in, put the acquire here, and the release 125 00:09:16,246 --> 00:09:20,681 there. And one of the important things that, 126 00:09:20,681 --> 00:09:25,458 here to note is these semaphores. You can have the value S here determine 127 00:09:25,458 --> 00:09:30,512 how many the, the initial value of S rather determines how many different 128 00:09:30,512 --> 00:09:35,704 threads or processes are able to enter the curricular section concurrently. 129 00:09:35,704 --> 00:09:41,520 If S is initially such one that means one only one thread is allowed to go in there 130 00:09:41,520 --> 00:09:46,296 and that's what's called a Mutex. so it's mutually exclusive, only one 131 00:09:46,296 --> 00:09:49,966 thread can go into a curricular section at that time. 132 00:09:49,966 --> 00:09:50,520