1
00:00:03,580 --> 00:00:09,203
So this brings us up to what we talked 
about right at the end of class last time 

2
00:00:09,203 --> 00:00:14,965
which was looking at an example where you 
let's say have a sequentially consistent 

3
00:00:14,965 --> 00:00:18,217
processor. 
And, we take a piece of code, that we had 

4
00:00:18,217 --> 00:00:21,730
talked about at the beginning of class 
last time, and ask ourselves. 

5
00:00:21,730 --> 00:00:25,610
Does this still work. 
And what we're going to do is instead of 

6
00:00:25,610 --> 00:00:30,543
having one producer and one consumer, 
we're going to have one producer and two 

7
00:00:30,543 --> 00:00:35,147
consumers, or multiple consumers. 
So as shown here, we have one producer. 

8
00:00:35,147 --> 00:00:38,435
It has a register where it keeps the tail 
pointer. 

9
00:00:38,435 --> 00:00:41,987
There's, there's two different pointers 
here in memory. 

10
00:00:41,987 --> 00:00:45,341
The actual data storage of the queue, 
two consumers. 

11
00:00:45,341 --> 00:00:49,287
They have their own register sets, 
'because they're two different 

12
00:00:49,287 --> 00:00:52,444
processors, or two different complete, 
threads. 

13
00:00:52,444 --> 00:00:55,612
So they have registers which they cache 
effectively. 

14
00:00:55,612 --> 00:01:00,297
It's not a cache, but they effectively 
have to load, load it into their register 

15
00:01:00,297 --> 00:01:02,966
space. 
And then they actually have where they 

16
00:01:02,966 --> 00:01:07,680
will load the actual value that they pop 
off the ends of the queue. 

17
00:01:07,680 --> 00:01:11,600
So someone I think said this at the end 
of class here. 

18
00:01:11,600 --> 00:01:17,190
But if you're going to be executing the 
producer, this piece of producer code, 

19
00:01:17,190 --> 00:01:23,070
and you're going to execute this consumer 
code does anyone see a problem here of 

20
00:01:23,070 --> 00:01:28,515
something that can happen that we 
probably don't want for a first in, first 

21
00:01:28,515 --> 00:01:33,597
out queue with multiple readers. 
Yeah so let's, let's walk through that. 

22
00:01:33,597 --> 00:01:38,796
Let's say two consumers, 
our execute, we, we interweave two 

23
00:01:38,796 --> 00:01:45,729
consumers executing the same instruction, 
from one consumer thread and the other 

24
00:01:45,729 --> 00:01:51,910
consumer thread every other cycle. 
So we have, the first consumer will read 

25
00:01:51,910 --> 00:01:56,924
the head pointer into the head. 
The second consumer will read the head 

26
00:01:56,924 --> 00:02:03,505
pointer into its own head register. 
This consumer here will read the tail 

27
00:02:03,505 --> 00:02:08,707
pointer into, the tail register. 
The second consumer will do that same 

28
00:02:08,707 --> 00:02:12,525
thing. 
So all of a sudden the two consumers read 

29
00:02:12,525 --> 00:02:16,970
the same heads and the same tail. 
And this is sequentially consistent 

30
00:02:16,970 --> 00:02:21,543
because we've not done run, done no 
reordering between within the thread. 

31
00:02:21,543 --> 00:02:26,170
we've just done some valid interlead. 
Now they both check this. 

32
00:02:26,170 --> 00:02:29,633
This is the, the case which is basically 
saying is there anything in the queue. 

33
00:02:29,633 --> 00:02:32,520
Does the head equal the tail? 
If the head equals the tail there's 

34
00:02:32,520 --> 00:02:35,184
nothing in the queue. 
If there's a distance between the head 

35
00:02:35,184 --> 00:02:39,064
and the tail that means there's some 
entries in the queue So they both fall 

36
00:02:39,064 --> 00:02:42,980
through an they say, well yeah. 
There's, there's stuff in the queue. 

37
00:02:42,980 --> 00:02:45,874
The next thing they do is they go read 
from the head. 

38
00:02:45,874 --> 00:02:48,332
This is the value that they are going to 
use. 

39
00:02:48,332 --> 00:02:50,900
Well they both just read the same head 
pointer. 

40
00:02:53,140 --> 00:02:58,289
Because we are going to execute from one 
thread and we're going to load that value 

41
00:02:58,289 --> 00:03:02,216
into another thread. 
So all of a sudden we just read the same 

42
00:03:02,216 --> 00:03:07,430
value twice and one is the one thread and 
one is the other thread but it was the 

43
00:03:07,430 --> 00:03:12,322
same, same value that we get out of 
reading from the, the pointer that points 

44
00:03:12,322 --> 00:03:15,300
to the, the, the head if you will. 
Now. 

45
00:03:15,300 --> 00:03:20,105
In a perfect world, that would be the 
only bad thing that could happen in this 

46
00:03:20,105 --> 00:03:24,542
case because we would incorrect, they'd 
both incorrect the headpointers, then, 

47
00:03:24,542 --> 00:03:29,656
then they do stores here and effectively 
what would happen is they would write the 

48
00:03:29,656 --> 00:03:34,831
same head location into the, the, or the 
same pointer if you will into, into the 

49
00:03:34,831 --> 00:03:37,610
head. 
That's not so bad. 

50
00:03:37,610 --> 00:03:41,251
But effectively what we did is we 
enqueued one thing into the queue, and 

51
00:03:41,251 --> 00:03:45,399
then we got two things out of the queue. 
This breaks our semantics of what a queue 

52
00:03:45,399 --> 00:03:50,298
should actually do. 
Something that could happen that's much 

53
00:03:50,298 --> 00:03:53,649
worse here. 
Is that one of the threads, let's say 

54
00:03:53,649 --> 00:03:58,557
another valid interweaving, is the one 
thread just stalls for a long time in 

55
00:03:58,557 --> 00:04:03,910
here. 
And, like a billion cycles, it just falls 

56
00:04:03,910 --> 00:04:08,182
for a billion cycles. 
It takes really bad cachmus or it traps 

57
00:04:08,182 --> 00:04:11,987
into the operating system, 
or something bad happens to it. 

58
00:04:11,987 --> 00:04:17,160
What can happen is before the update at 
the headpointer happens, 

59
00:04:17,160 --> 00:04:19,780
or 
what could, what could have happened here 

60
00:04:19,780 --> 00:04:23,056
is you could have something really 
strange happen. 

61
00:04:23,056 --> 00:04:27,970
Or, or, you could basically have the 
other thread execute, and it could pop a 

62
00:04:27,970 --> 00:04:32,163
whole bunch of values off the head of the 
queue in the meantime. 

63
00:04:32,163 --> 00:04:35,242
So let's say there was 100 things in the 
queue. 

64
00:04:35,242 --> 00:04:38,840
We both execute into here at the same 
time. 

65
00:04:38,840 --> 00:04:42,540
One of the threads get stalled here for a 
long period of time. 

66
00:04:42,540 --> 00:04:45,345
The other thread pops 100 things off the 
queue. 

67
00:04:45,345 --> 00:04:49,582
But then finally, this one here goes and 
updates the head pointer again. 

68
00:04:49,582 --> 00:04:54,237
So what's going to happen, is effectively 
you are basically going to reset the 

69
00:04:54,237 --> 00:04:56,922
queue and put 100 things back onto the 
queue. 

70
00:04:56,922 --> 00:05:01,160
And, it's probably going to be garbage in 
memory, in the queue at that time. 

71
00:05:03,180 --> 00:05:10,464
So this brings up the idea that we may 
not want to be concurrently executing 

72
00:05:10,464 --> 00:05:14,974
even the sequentially, even the strict 
sequentially consistent model. 

73
00:05:14,974 --> 00:05:20,213
This block of code and another threads 
piece of code which is executing the 

74
00:05:20,213 --> 00:05:25,121
same, same thing, the, the two consumers. 
So we're going to call this a critical 

75
00:05:25,121 --> 00:05:28,503
section. 
if you take an operating system's class 

76
00:05:28,503 --> 00:05:33,477
you will seen this, seen this a bunch. 
And the idea here is that you want to 

77
00:05:33,477 --> 00:05:36,860
atomically execute from all the consumers 
possible. 

78
00:05:36,860 --> 00:05:40,560
This block of code here. 
So no two consumers could begin to 

79
00:05:40,560 --> 00:05:43,320
execute that block of code at the same 
time. 

80
00:05:44,700 --> 00:05:51,096
And one way to do that is you have a lock 
on that piece of code. 

81
00:05:51,096 --> 00:05:58,224
So you grab a lock, you can go execute 
that piece of code, and then you release 

82
00:05:58,224 --> 00:06:00,600
the lock. 
Questions so far? 

83
00:06:03,360 --> 00:06:11,865
Okay, so a little bit more general notion 
of the same idea of a lock is called a 

84
00:06:11,865 --> 00:06:16,543
semaphore. 
And if you've taken !!! class you've 

85
00:06:16,543 --> 00:06:23,347
probably seen these strange nomenclatures 
of P and V semaphores. 

86
00:06:23,347 --> 00:06:31,533
And if you want to go and look up where 
it actually comes from, it's Dutch. 

87
00:06:31,533 --> 00:06:37,639
And P largely means attempt to acquire 
the semaphore. 

88
00:06:37,639 --> 00:06:41,712
But, in, in, in Dutch it means I don't 
speak. 

89
00:06:41,712 --> 00:06:45,243
Does anyone here speak Dutch? 
By chance? 

90
00:06:45,243 --> 00:06:49,317
Anyone speak anything close to Dutch 
[LAUGH]? 

91
00:06:49,317 --> 00:06:50,947
You do? 
Little bit? 

92
00:06:50,947 --> 00:06:54,930
Okay, how do we, how do we, how do we say 
that? 

93
00:06:54,930 --> 00:07:02,784
What, what is close to Dutch actually? 
German's close to Dutch? 

94
00:07:02,784 --> 00:07:03,669
Yeah. 
Okay. 

95
00:07:03,669 --> 00:07:08,094
I guess it's. 
so I don't, I don't speak Dutch or 

96
00:07:08,094 --> 00:07:15,260
German, but it means try to reduce. 
In a semaphore you can have multiple, 

97
00:07:15,260 --> 00:07:19,630
people trying to grab a lock at the same 
time and multiple people trying to 

98
00:07:19,630 --> 00:07:22,217
caching to aquire that lock at the same 
time. 

99
00:07:22,217 --> 00:07:24,805
So an example, or aquire for the same 
time. 

100
00:07:24,805 --> 00:07:29,117
The example that we gave in the last 
class is that you have multiple users 

101
00:07:29,117 --> 00:07:33,717
trying to use one I/O device, and let's 
say that there are two queues in the I/O 

102
00:07:33,717 --> 00:07:36,989
device. 
You have 100 different processors or 

103
00:07:36,989 --> 00:07:39,792
threads trying to use that one I/O 
device. 

104
00:07:39,792 --> 00:07:45,197
You can actually have strictly only two 
acquire the I/O device, but not more than 

105
00:07:45,197 --> 00:07:48,067
two. 
In a Mutex, you're only allowed to have 

106
00:07:48,067 --> 00:07:53,406
one thing be able to acquire the critical 
section or acquire the lock at a time. 

107
00:07:53,406 --> 00:07:57,276
And, hence, we have a little bit more 
general term for this. 

108
00:07:57,276 --> 00:08:00,780
A P semaphore basically, decrements a 
counter. 

109
00:08:00,780 --> 00:08:06,227
So we start off with a counter S, here. 
We put some number into it, 

110
00:08:06,227 --> 00:08:12,178
and then we do P on that, 
it'll actually decrease the counter, and 

111
00:08:12,178 --> 00:08:18,580
if the counter was greater than zero, 
we can go execute our Kregel section. 

112
00:08:19,840 --> 00:08:23,689
Then we have the release for the V 
semaphore. 

113
00:08:23,689 --> 00:08:30,191
And that will actually increment this 
counter by one and it'll wake up some 

114
00:08:30,191 --> 00:08:35,410
other process if there is some other 
process waiting on that. 

115
00:08:35,410 --> 00:08:42,340
if, if, if, I said dropped below, one 
it'll go wake up another process [COUGH]. 

116
00:08:42,340 --> 00:08:46,222
And, one of the important things here is 
that these, both these P and these V 

117
00:08:46,222 --> 00:08:48,725
semaphores must be atomic, relative to 
each other. 

118
00:08:48,725 --> 00:08:52,658
So however you go, thinking about going 
to implement them, they must, they must 

119
00:08:52,658 --> 00:08:56,131
be relative to each other. 
And this is, with respect to everything 

120
00:08:56,131 --> 00:08:58,736
happening, 
everything else happening on your system. 

121
00:08:58,736 --> 00:09:01,597
Otherwise, really, really bad things will 
start to happen. 

122
00:09:01,597 --> 00:09:05,220
You'll actually, have multiple people 
grabbing the lock. 

123
00:09:05,220 --> 00:09:10,595
And how you would typically use this is 
inside of your thread or your process. 

124
00:09:10,595 --> 00:09:16,246
You're going to put your critical section 
in, put the acquire here, and the release 

125
00:09:16,246 --> 00:09:20,681
there. 
And one of the important things that, 

126
00:09:20,681 --> 00:09:25,458
here to note is these semaphores. 
You can have the value S here determine 

127
00:09:25,458 --> 00:09:30,512
how many the, the initial value of S 
rather determines how many different 

128
00:09:30,512 --> 00:09:35,704
threads or processes are able to enter 
the curricular section concurrently. 

129
00:09:35,704 --> 00:09:41,520
If S is initially such one that means one 
only one thread is allowed to go in there 

130
00:09:41,520 --> 00:09:46,296
and that's what's called a Mutex. 
so it's mutually exclusive, only one 

131
00:09:46,296 --> 00:09:49,966
thread can go into a curricular section 
at that time. 

132
00:09:49,966 --> 00:09:50,520