1
00:00:04,180 --> 00:00:07,210
Okay. 
So now, let's briefly touch on routing 

2
00:00:07,210 --> 00:00:10,384
protocols. 
There's lots of different routing 

3
00:00:10,384 --> 00:00:15,291
protocols, protocols out there. 
And roughly let's, let's split them into 

4
00:00:15,291 --> 00:00:19,836
two different categories. 
Oblivious routing protocols where the 

5
00:00:19,836 --> 00:00:25,720
routing path through the network is 
independent of the state of the network. 

6
00:00:25,720 --> 00:00:30,181
And the second one is adaptive routing. 
So these are sort of smarter networks. 

7
00:00:30,181 --> 00:00:33,368
They look for 
they somehow look at the state of the 

8
00:00:33,368 --> 00:00:37,598
network, and make some decisions. 
And if you build a good adaptive routing 

9
00:00:37,598 --> 00:00:42,060
network, what you're going to try and do 
is look for the hot spots in the routing 

10
00:00:42,060 --> 00:00:46,174
network, and route around them. 
And this is where that path independence, 

11
00:00:46,174 --> 00:00:49,027
or multi path 
networks comm, commit. 

12
00:00:49,027 --> 00:00:53,587
If you only have one route from any point 
to any other given point, the routing is 

13
00:00:53,587 --> 00:00:57,035
basically deterministic. 
You can't, can't go route in the wrong, 

14
00:00:57,035 --> 00:01:01,372
you can't go route someplace else because 
you'll never be able to get to your 

15
00:01:01,372 --> 00:01:04,585
destination. 
Where this becomes interesting is if you, 

16
00:01:04,585 --> 00:01:08,326
if you look at just the basic mesh 
network. Our basic mesh network has made 

17
00:01:08,326 --> 00:01:12,316
it from points between, or many different 
routes between any two points. So, if we 

18
00:01:12,316 --> 00:01:16,456
have let's say bre, four by four grid you 
can just sort of go across the top and 

19
00:01:16,456 --> 00:01:20,596
then along the side, let's say if you're 
going from the top to the bottom or you 

20
00:01:20,596 --> 00:01:23,390
can sort of zigzag along the middle or go 
the other way. 

21
00:01:23,390 --> 00:01:27,280
Go Y and then X, or X and then Y, just 
lots of different choices along, along 

22
00:01:27,280 --> 00:01:31,654
there. 
So that's where we start to get to decide 

23
00:01:31,654 --> 00:01:38,362
about different routing networks. under 
oblivious newtorks, there's, there's two 

24
00:01:38,362 --> 00:01:42,947
sort of sub points here. 
One is deterministic which basically 

25
00:01:42,947 --> 00:01:47,094
says, do the same thing all the time, 
versus a non-deterministic design. 

26
00:01:47,094 --> 00:01:51,359
And, you might say, what is, what is a 
non-deterministic oblivious routing 

27
00:01:51,359 --> 00:01:54,380
algorithm? 
Well, it's typically a randomized routing 

28
00:01:54,380 --> 00:01:57,461
algorithm. 
And these can actually do slightly better 

29
00:01:57,461 --> 00:02:00,541
than the deterministic oblivious routing 
algorithms. 

30
00:02:00,541 --> 00:02:05,103
So let's take an example of this 
non-deterministic, but still doesn't look 

31
00:02:05,103 --> 00:02:09,545
at the state of the network. So, let's, 
an example of this is something where 

32
00:02:09,545 --> 00:02:14,311
you, let's say, choose some node in the 
system at random and you route there 

33
00:02:14,311 --> 00:02:17,181
first. 
And then, once you've routed there, you 

34
00:02:17,181 --> 00:02:21,835
route X and then Y to the destination. 
I would say, why would you want to do 

35
00:02:21,835 --> 00:02:24,090
this? 
Well, it's actually sort of scatters your 

36
00:02:24,090 --> 00:02:27,882
traffic around the network and actually 
route in different directions and 

37
00:02:27,882 --> 00:02:32,084
potentially could increase the delivered 
aggregate bandwidth through your network 

38
00:02:32,084 --> 00:02:36,030
by doing this randomized choice at the 
beginning where you choose some other 

39
00:02:36,030 --> 00:02:38,387
node. 
Which might be routing a wrong, the wrong 

40
00:02:38,387 --> 00:02:42,230
direction first, but then you'll have 
more aggregate balance, let's say to a 

41
00:02:42,230 --> 00:02:44,126
destination. 
So, you can actually have 

42
00:02:44,126 --> 00:02:49,216
non-deterministic oblivious routing. 
[COUGH] Adaptive routing, I'm going to 

43
00:02:49,216 --> 00:02:54,705
sort of not talk about in this class 
because there's a, it's a very active 

44
00:02:54,705 --> 00:03:00,937
research area and I recommend you take 
588 if you want to learn more about that 

45
00:03:00,937 --> 00:03:07,529
because that's, that's a very broad topic 
where you try to look for congestions 

46
00:03:07,529 --> 00:03:12,891
network and route around the congestion. 
And if you try to do that, you have to 

47
00:03:12,891 --> 00:03:18,879
make sure you're network doesn't deadlock 
or have other bad conditions that happen. 

48
00:03:18,879 --> 00:03:24,728
And to some extend usually these networks 
trade off let's say ease of analysis at 

49
00:03:24,728 --> 00:03:31,014
least for performance. 
I wanted to talk about one quick 

50
00:03:31,014 --> 00:03:36,980
deterministic oblivious routing algorithm 
on a mesh. 

51
00:03:36,980 --> 00:03:59,606
So let's, let's draw a mesh here. 
[SOUND] Okay. 

52
00:03:59,606 --> 00:04:23,920
So, we have our sixteen node, four array, 
two cube here. 

53
00:04:23,920 --> 00:04:30,320
And we wan to route let's say from A to 
B. 

54
00:04:35,220 --> 00:04:45,185
The most basic, or one of the more basic 
routing algorithms that we're going to be 

55
00:04:45,185 --> 00:04:52,355
looking at here we're going to call 
dimension ordered routing. 

56
00:04:52,355 --> 00:05:01,955
[COUGH] And it's called dimension ordered 
routing because there's actually two 

57
00:05:01,955 --> 00:05:07,069
dimensions here. 
We'll label this as let's say X 

58
00:05:07,069 --> 00:05:11,440
increasing, 
we'll label this as Y increasing. 

59
00:05:11,440 --> 00:05:16,160
And, [COUGH] and dimension order routing 
says that we order the dimensions in 

60
00:05:16,160 --> 00:05:19,286
which we route. 
So, we always route it in one of the 

61
00:05:19,286 --> 00:05:23,060
dimensions before we go to route in the 
other dimension. 

62
00:05:23,060 --> 00:05:30,703
So, if we want to get from here to here, 
and let's say we have X then Y is our 

63
00:05:30,703 --> 00:05:35,468
routing, 
we're going to route this way, from A to 

64
00:05:35,468 --> 00:05:42,814
B first. So, now we've matched our X 
location to where B is, and then we're 

65
00:05:42,814 --> 00:05:48,760
going to route down. 
So, this is X and Y. 

66
00:05:48,760 --> 00:05:51,430
This is oblivious. 
It doesn't matter about the traffic. 

67
00:05:51,430 --> 00:05:55,560
And it has some bad problems that show up 
if you do this. 

68
00:05:55,560 --> 00:06:01,250
[COUGH] If we have let's say C and D, 
and it's trying to route from C to D, 

69
00:06:01,250 --> 00:06:04,672
it needs to share this link with traffic 
going from A to B. 

70
00:06:04,672 --> 00:06:08,734
And it can't route around it. 
It can't like, I don't use those links or 

71
00:06:08,734 --> 00:06:11,866
something else. 
It's just not allowed in our oblivious 

72
00:06:11,866 --> 00:06:14,883
routing algorithm, 
and definitely not allowed in our 

73
00:06:14,883 --> 00:06:17,204
dimension ordered. 
Now, dimension ordered, 

74
00:06:17,204 --> 00:06:21,671
there's networks that are, let's say, X 
and then Y and you can also have, let's 

75
00:06:21,671 --> 00:06:25,326
say, Y and then X. 
You can, this also generalizes to three 

76
00:06:25,326 --> 00:06:30,780
dimensional n-dimensional sort of spaces. 
You do, let's say X, then Y, then Z. 

77
00:06:30,780 --> 00:06:36,255
Or W, then X, then Y and then Z or 
something like that in a four dimensional 

78
00:06:36,255 --> 00:06:43,096
cube. 
[COUGH] But one of the good properties of 

79
00:06:43,096 --> 00:06:46,640
dimension order routing is it's not going 
to deadlock. 

80
00:06:46,640 --> 00:06:49,396
And we'll talk about that in a few 
slides. 

81
00:06:49,396 --> 00:06:52,677
But the network itself is never going to 
deadlock. 

82
00:06:52,677 --> 00:06:56,549
That doesn't mean you can't get message 
dependent deadlock. 

83
00:06:56,549 --> 00:07:01,602
But if you wormhole through the network 
here, you can actually come up with a 

84
00:07:01,602 --> 00:07:06,196
proof that shows that [COUGH] you will 
not ever deadlock this network. 

85
00:07:06,196 --> 00:07:10,068
And the, the rough sketch of the proof 
look goes like this. 

86
00:07:10,068 --> 00:07:14,740
[COUGH] By only using X links first and 
then using Y links, 

87
00:07:14,740 --> 00:07:21,857
[COUGH] you only ever acquire an X link, 
and then acquire a Y link, 

88
00:07:21,857 --> 00:07:25,971
and then release an X link, 
and then release a Y link. 

89
00:07:25,971 --> 00:07:32,206
So, you'll never actually grab a resource 
that you have needed, you have needed 

90
00:07:32,206 --> 00:07:36,080
before. 
And everyone else is doing that same 

91
00:07:36,080 --> 00:07:40,423
algorithm so they will never try to grab 
a resource in order that is, is, is 

92
00:07:40,423 --> 00:07:43,500
faulty also. 
So, if we go back to our sort of dining 

93
00:07:43,500 --> 00:07:47,904
philosophers problem here, you're 
guaranteed that you're going to acquire 

94
00:07:47,904 --> 00:07:50,498
resource one and then acquire resource 
two. 

95
00:07:50,498 --> 00:07:55,445
And everyone's doing that same algorithm 
so they will always acquire resource one 

96
00:07:55,445 --> 00:07:59,970
or block, waiting to acquire resource one 
and then you acquire resource two. 

97
00:07:59,970 --> 00:08:04,074
But you'll never see a cycle. 
So, because of that, you'll next acquire 

98
00:08:04,074 --> 00:08:07,636
X, and then Y, and then X in this 
dimension order of routing. 

99
00:08:07,636 --> 00:08:10,835
So, you'll never get a cycle in your 
dependency graph. 

100
00:08:10,835 --> 00:08:14,699
That's the rough sketch. 
And the, the, the full sketch is a little 

101
00:08:14,699 --> 00:08:18,260
more complicated. 
Usually, you have these two dimensional 

102
00:08:18,260 --> 00:08:21,641
mesh networks without end around, or 
without a taurus. 

103
00:08:21,641 --> 00:08:25,806
You have to sort of prove it in this 
direction and in that direction. 

104
00:08:25,806 --> 00:08:30,334
And then, sort of it, decomposes into 
sort of X from west to east, and X from 

105
00:08:30,334 --> 00:08:34,137
east to west, and vice versa. 
But, that's the rough sketch of the 

106
00:08:34,137 --> 00:08:37,510
proof. 
But, I just want to get across that. 

107
00:08:37,510 --> 00:08:41,076
It's a basic routing algorithm, but 
there's lots of other choices here. 

108
00:08:41,076 --> 00:08:44,898
as I sort of said there was this 
non-deterministic one, there's adaptive 

109
00:08:44,898 --> 00:08:47,904
routing algorithms. 
So, a good adaptive one, sometimes called 

110
00:08:47,904 --> 00:08:51,878
hot potato routing, is that if you, you 
don't have any buffering in the network, 

111
00:08:51,878 --> 00:08:56,158
so you can't buffer along the way. If you 
get to a node which has a link which is 

112
00:08:56,158 --> 00:08:59,980
congested, you just choose randomly some 
other direction, and you route that 

113
00:08:59,980 --> 00:09:02,595
direction. 
[COUGH] And the, the thinking here is 

114
00:09:02,595 --> 00:09:07,147
that some point you will actually be able 
to choose again and route towards the 

115
00:09:07,147 --> 00:09:11,529
right direction and slowly you'll move 
your, your traffic or your packet will 

116
00:09:11,529 --> 00:09:15,000
move to the end point. 
But, it allows you to sort of go around 

117
00:09:15,000 --> 00:09:18,756
traffic congestion points. 
But, I don't want to go into too much 

118
00:09:18,756 --> 00:09:22,520
detail on this because this is a whole 
lecture in itself. 

119
00:09:22,520 --> 00:09:26,160
Okay. 
So, let's talk about flow control. 

120
00:09:26,160 --> 00:09:31,540
This is a aspect of networking that I 
really enjoy, actually. 

121
00:09:32,680 --> 00:09:36,660
So, flow control is making sure that you 
don't drop data. 

122
00:09:36,660 --> 00:09:42,120
Making sure that you have back pressure 
in your networks so that you don't lose, 

123
00:09:42,120 --> 00:09:46,352
lose data in your network. 
And, this flow control can either be 

124
00:09:46,352 --> 00:09:51,415
local on a link by link or hop by hop 
basis, or it could be round trip. 

125
00:09:51,415 --> 00:09:56,186
So a good example of round-trip flow 
control is going back to something like 

126
00:09:56,186 --> 00:09:58,691
TCP/IP. 
You actually want to rate-limit the 

127
00:09:58,691 --> 00:10:03,282
round-trip flow control so you don't 
flood the network if a given link is, is 

128
00:10:03,282 --> 00:10:06,264
too small. 
So there's actually a protocol in there 

129
00:10:06,264 --> 00:10:09,245
which will rate-limit the round-trip flow 
control. 

130
00:10:09,245 --> 00:10:13,897
Or a good example, actually, on a on-chip 
network, so something like a mini-core, 

131
00:10:13,897 --> 00:10:18,733
[COUGH] is if you're trying to guarantee 
that you do not overrun a buffer in the 

132
00:10:18,733 --> 00:10:22,365
memory controller. 
But lots of different nodes are trying to 

133
00:10:22,365 --> 00:10:26,542
route to the memory controller. 
You can actually have some sort of 

134
00:10:26,542 --> 00:10:30,737
counter, which effectively is a round 
trip flow control to the far away end 

135
00:10:30,737 --> 00:10:34,688
point node and say, okay, you are allowed 
to have five outstanding memory 

136
00:10:34,688 --> 00:10:38,969
transactions to the memory controller. 
But if you try to send a sixth, you need 

137
00:10:38,969 --> 00:10:45,541
to wait for the app to come back. 
So, this is a end to end or long distance 

138
00:10:45,541 --> 00:10:50,484
flow control, but each of the hops along 
network themselves may have localized 

139
00:10:50,484 --> 00:10:53,906
full control also. 
So, you can actually layer these two 

140
00:10:53,906 --> 00:10:57,939
things together. 
And typically, sort of, for on chip 

141
00:10:57,939 --> 00:11:02,770
networks, we would like some properties 
and even in computer system networks, 

142
00:11:02,770 --> 00:11:05,644
we typically like the networks to be 
resilient. 

143
00:11:05,644 --> 00:11:08,518
so we don't want sort of data to be 
dropped. 

144
00:11:08,518 --> 00:11:11,269
But that's not, not necessarily 
guaranteed. 

145
00:11:11,269 --> 00:11:15,244
People have built networks where links 
can drop, can drop data. 

146
00:11:15,244 --> 00:11:20,259
So that's another option is, you can drop 
the data and try to retransmit it if you 

147
00:11:20,259 --> 00:11:23,500
want reliable transmission from one point 
to another. 

148
00:11:25,500 --> 00:11:31,540
Okay. 
So, let's look at a quick example here of 

149
00:11:31,540 --> 00:11:37,558
flow control in our networks. This is not 
in your handouts. so you just, pay 

150
00:11:37,558 --> 00:11:42,749
attention. so the first type of flow 
control we're going to look at is 

151
00:11:42,749 --> 00:11:46,227
basically a stall wire, 
or a big stop sign. 

152
00:11:46,227 --> 00:11:50,736
So here or is what we have, is we have a 
sender, with a Q. 

153
00:11:50,736 --> 00:11:55,095
So Q sub sender and Q sub R, or Q sub 
receiver. 

154
00:11:55,095 --> 00:11:58,853
And we've named sort of all the points 
along here. 

155
00:11:58,853 --> 00:12:04,714
let's say, in this network we have a 
register in between the sender and the 

156
00:12:04,714 --> 00:12:09,449
receiver and this is in our link. 
This is sort of our link cost. 

157
00:12:09,449 --> 00:12:16,000
Takes a cycle and we pipeline the wire. 
[COUGH] Okay. So now, 

158
00:12:18,500 --> 00:12:24,598
data arrives here into D, which is our 
sort of receive point into the receiver. 

159
00:12:24,598 --> 00:12:28,976
And at some point it decides, I can't 
take any more data. 

160
00:12:28,976 --> 00:12:33,745
I can't, can't read that data. 
I need to go process some of it. 

161
00:12:33,745 --> 00:12:38,120
So, I need to stall the network. 
So it says, stop. 

162
00:12:38,120 --> 00:12:42,148
So it says stop here. 
And in a perfect world, this goes through 

163
00:12:42,148 --> 00:12:45,982
some combinational logic. 
Just completely, just like in your 

164
00:12:45,982 --> 00:12:50,856
pipeline stalls this point in the pipe 
stalls. This flip flop stalls that Q, 

165
00:12:50,856 --> 00:12:54,300
stalls the, going this way. 
Stalls A, stalls the sender, 

166
00:12:54,300 --> 00:12:57,160
tells it not to actually send any more 
data. 

167
00:12:58,500 --> 00:13:01,568
So it's a stall. 
We stop in D. 

168
00:13:01,568 --> 00:13:06,080
C stops, B stops, A stops 
instantaneously. 

169
00:13:06,080 --> 00:13:10,828
At some point, D unstalls because it's 
going to go read data more again and the 

170
00:13:10,828 --> 00:13:13,077
network keep flowing again. 
So, sorry. 

171
00:13:13,077 --> 00:13:15,700
What this is showing is, these are 
packets. 

172
00:13:15,700 --> 00:13:19,386
These are not FITS. 
These are complete packets getting sent, 

173
00:13:19,386 --> 00:13:23,197
or complete FLITS getting sent. 
This is our flow control unit. 

174
00:13:23,197 --> 00:13:27,758
So we're looking at each one of these A, 
B's, and C's here, and D's are, are 

175
00:13:27,758 --> 00:13:33,774
uniquely flow controllable units. 
So that's on/off flow control for 

176
00:13:33,774 --> 00:13:38,660
combinational stall single. Now, the 
problem with this is, it took us a cycle 

177
00:13:38,660 --> 00:13:43,741
to send data this wire because we had a 
flip flp going that way, but our stall 

178
00:13:43,741 --> 00:13:48,637
wire runs all the way back. 
well, that's not, that's not good from a 

179
00:13:48,637 --> 00:13:52,010
cycle time perspective. 
because if you look at the combinational 

180
00:13:52,010 --> 00:13:55,946
sort of path now, you're going to see 
well, maybe you have an output of this 

181
00:13:55,946 --> 00:14:00,219
which says this data available that goes 
through some logic in D and that 

182
00:14:00,219 --> 00:14:03,592
generates a stall signal. 
And then, has to propagate physically a 

183
00:14:03,592 --> 00:14:07,460
long distance away and go into lots of 
other logic. 

184
00:14:07,460 --> 00:14:09,823
And after having designed a couple of 
these sorts of networks, 

185
00:14:09,823 --> 00:14:11,501
you're never going to be able to build 
that. 

186
00:14:11,501 --> 00:14:13,255
That's, 
if you want to have a high performance 

187
00:14:13,255 --> 00:14:15,970
network, 
that time is, is, is, is too long. 

188
00:14:15,970 --> 00:14:20,914
because if your cycle time is anywhere 
near the length of wire, of the delay of 

189
00:14:20,914 --> 00:14:25,940
length wire you have, you're not going to 
be able to build them. 

190
00:14:25,940 --> 00:14:28,275
Okay. 
So now, we start to look at, we, we 

191
00:14:28,275 --> 00:14:34,261
switch to something and say, let's try to 
solve that and put a pipeline flop in 

192
00:14:34,261 --> 00:14:41,220
here on the receive on the return path of 
the stall wire, of the stall signal. 

193
00:14:41,220 --> 00:14:43,406
Okay. 
So, what, what happens? 

194
00:14:43,406 --> 00:14:49,966
Well, if we look at this for packet zero, 
one, two, three, at some point d decides 

195
00:14:49,966 --> 00:14:51,800
to stall. 
But, 

196
00:14:51,800 --> 00:15:04,600
[COUGH] that is not able to stop let's 
say, R from sending the cycle before. 

197
00:15:04,600 --> 00:15:07,571
Because we now have a flip flop in this 
path. 

198
00:15:07,571 --> 00:15:12,128
So, it doesn't know about that. 
Likewise, it doesn't know, it can't stop 

199
00:15:12,128 --> 00:15:15,100
B and A from moving forward one cycle 
either. 

200
00:15:16,760 --> 00:15:20,964
So, if we look at the pipeline diagram, 
what that really means is these arrows 

201
00:15:20,964 --> 00:15:25,059
here, we can't stall this on a dime. 
We have to wait and we actually are not 

202
00:15:25,059 --> 00:15:29,263
able to stall until the next packet. 
Likewise, we're not able to stall B and A 

203
00:15:29,263 --> 00:15:32,048
until later. 
And this can even be worse if you have 

204
00:15:32,048 --> 00:15:36,700
multiple pipeline registers going 
backwards here, which is pretty common. 

205
00:15:36,700 --> 00:15:45,402
Now, the implication of this is, 
you were not able to stall C at all. 

206
00:15:45,402 --> 00:15:51,640
So, like it or not, the data's coming 
this direction from our D, 

207
00:15:51,640 --> 00:15:54,293
Q sub R. 
Huh. 

208
00:15:54,293 --> 00:15:57,740
So, what do you go about doing at this 
point? 

209
00:15:59,140 --> 00:16:02,680
So, this is a tough one. 
you could either drop the data. 

210
00:16:02,680 --> 00:16:07,153
That's probably not very advantageous 
because you're just going to drop random 

211
00:16:07,153 --> 00:16:10,880
data in your network. 
Networks like to actually deliver data. 

212
00:16:12,020 --> 00:16:17,781
A better thing to do is to have extra 
buffering in Q sub R, 

213
00:16:17,781 --> 00:16:23,328
and we'll call this skid buffering. 
So that the data, 

214
00:16:23,328 --> 00:16:26,839
it's called skid buffering because it's 
kind of like in a car. 

215
00:16:26,839 --> 00:16:30,894
When you hit the stop pedal or you hit 
the, hit the, the brake pedal, 

216
00:16:30,894 --> 00:16:34,888
you need some runway to, to skid into 
before you can actually stop. 

217
00:16:34,888 --> 00:16:38,701
So, you hit the brake, 
and you're a sort of skid to a stop and 

218
00:16:38,701 --> 00:16:43,119
you need that, that extra area. 
So, we budget that into this Q here and 

219
00:16:43,119 --> 00:16:47,598
you can actually build a network 
build a flow control system where you 

220
00:16:47,598 --> 00:16:54,260
have extra space in Q sub R to skid into. 
And then when you unstall, 

221
00:16:55,700 --> 00:17:02,279
you'll see actually that there's 
basically that should be a D, 

222
00:17:02,279 --> 00:17:04,492
sorry. 
it takes a cycle to unstall also. 

223
00:17:04,492 --> 00:17:08,286
So that's, that's kind of not good. 
Conveniently, you have something to feed 

224
00:17:08,286 --> 00:17:11,342
at that point because we have that extra 
value in Q sub R. 

225
00:17:11,342 --> 00:17:15,453
But there are cases that you can come up 
to with, where you stall and then not 

226
00:17:15,453 --> 00:17:17,560
stall. 
And then stall and then not stall. 

227
00:17:17,560 --> 00:17:21,618
Where you can actually accidentally 
introduce bubbles with this type of flow 

228
00:17:21,618 --> 00:17:23,989
control. 
And I don't have a, a diagram of that 

229
00:17:23,989 --> 00:17:26,255
here. 
But that would show up as basically a 

230
00:17:26,255 --> 00:17:29,259
bubble throughout the entire pipe and no 
data being read. 

231
00:17:29,259 --> 00:17:33,474
But that's, you can construct that with a 
sort of on/off flow control pipe line 

232
00:17:33,474 --> 00:17:37,408
flow control. 
[COUGH] So, you could also think about 

233
00:17:37,408 --> 00:17:41,243
having on/off flow control of partial 
pipeline stall control. 

234
00:17:41,243 --> 00:17:45,769
So what I mean by that is, instead of 
feeding up into this register here, 

235
00:17:45,769 --> 00:17:50,484
now we need two skid buffer entries over 
there because we can't stop B flowing 

236
00:17:50,484 --> 00:17:55,262
into here or C flowing into there. 
So we can't stop any of this stuff and we 

237
00:17:55,262 --> 00:18:01,867
need two entries now in that skid buffer. 
And we're running out of time, but I 

238
00:18:01,867 --> 00:18:06,116
wanted to try to finish this flow control 
notion up here. 

239
00:18:06,116 --> 00:18:10,162
A good solution to this is called 
Credit-Based flow control. 

240
00:18:10,162 --> 00:18:17,708
So, in credit-based flow control, 
you put a counter in the center and the 

241
00:18:17,708 --> 00:18:22,580
counter counts how many entries there are 
in the receive FIFO. 

242
00:18:24,100 --> 00:18:27,600
And what it does is it starts sending 
data. 

243
00:18:29,260 --> 00:18:33,819
And whenever data is removed from this 
FIFO, a credit gets sent back. 

244
00:18:33,819 --> 00:18:34,959
And it can be, 
if, 

245
00:18:34,959 --> 00:18:40,060
like it can be pipelines multiple levels 
deep here on the return, point. 

246
00:18:40,060 --> 00:18:44,720
When the credit comes back, you increment 
the counter. But when you send a word, 

247
00:18:44,720 --> 00:18:48,723
you decrement the counter. 
And if the counter ever reaches zero, you 

248
00:18:48,723 --> 00:18:54,696
stop sending. 
This is actually ends up being superior 

249
00:18:54,696 --> 00:18:59,711
than this order design because you don't 
have funny stutters on the receive side 

250
00:18:59,711 --> 00:19:03,178
when you go to build a credit-based flow 
control system. 

251
00:19:03,178 --> 00:19:08,131
That's a little bit beyond, showing that 
proof is a little beyond this course. 

252
00:19:08,131 --> 00:19:13,207
But I did want to say here that when you 
compute this, you need to compute how 

253
00:19:13,207 --> 00:19:17,603
large the credit needs to be and how big 
this receive FIFO needs to be. 

254
00:19:17,603 --> 00:19:23,500
And they need to be sized appropriately. 
[COUGH] If you size one of them wrong, 

255
00:19:24,580 --> 00:19:29,294
in this case, if you size one of them 
wrong in the on/off flow control, 

256
00:19:29,294 --> 00:19:32,056
you're actually going to end up losing 
data. 

257
00:19:32,056 --> 00:19:36,636
If you skid into your skid buffer and you 
need two skid entry slots, 

258
00:19:36,636 --> 00:19:40,745
and you only have one skid entry slot, 
you're going to lose data. 

259
00:19:40,745 --> 00:19:45,720
In the credit-based flow control system, 
the counter just reaches zero. 

260
00:19:45,720 --> 00:19:49,868
So, just impact your bandwidth, but 
you're not going to lose data. 

261
00:19:49,868 --> 00:19:54,664
Now, having said that, you should, if you 
want high performance through this 

262
00:19:54,664 --> 00:19:59,590
network, you should size this and this 
counter to be the same. And you should 

263
00:19:59,590 --> 00:20:04,192
size it big to take into account the 
entire round trip latency of this, this 

264
00:20:04,192 --> 00:20:08,262
communication. 
lets stop here for today. 

265
00:20:08,262 --> 00:20:11,296
I'll briefly finish up deadlock next 
time. 

266
00:20:11,296 --> 00:20:17,219
And then, we'll move on to building large 
mini core systems or large multi-core 

267
00:20:17,219 --> 00:20:21,120
systems, and coherence protocols for them 
next lecture.