Okay. So now, let's briefly touch on routing protocols. There are lots of different routing protocols out there, and roughly we can split them into two categories. The first is oblivious routing, where the routing path through the network is independent of the state of the network. The second is adaptive routing. These are the smarter networks: they somehow look at the state of the network and make decisions based on it. If you build a good adaptive routing network, what you try to do is look for the hot spots in the network and route around them.

This is where path independence, or multi-path networks, comes in. If you only have one route from any point to any other given point, the routing is basically deterministic; you can't route someplace else, because you'd never reach your destination. Where this becomes interesting is with something like our basic mesh network, which has many different routes between any two points. In a four-by-four grid, going from the top to the bottom, you can go across the top and then down the side, or you can zigzag through the middle, or go the other way: Y and then X, or X and then Y. There are lots of different choices, and that's where we get to decide between different routing algorithms.

Under oblivious networks, there are two sub-categories. One is deterministic, which basically says do the same thing all the time, versus a non-deterministic design. You might ask, what is a non-deterministic oblivious routing algorithm? Well, it's typically a randomized routing algorithm, and these can actually do slightly better than the deterministic oblivious routing algorithms. So let's take an example that is non-deterministic but still doesn't look at the state of the network: you choose some node in the system at random and route there first, and then, once you've routed there, you route X and then Y to the destination. You might say, why would you want to do this? Well, it scatters your traffic around the network. Traffic routes in different directions, and that can increase the delivered aggregate bandwidth of your network. The randomized choice at the beginning might mean routing in the wrong direction first, but then you have better-balanced aggregate traffic on the way to the destination. So you can have non-deterministic oblivious routing.

Adaptive routing I'm mostly not going to talk about in this class, because it's a very active research area; I recommend you take 588 if you want to learn more about it. It's a broad topic where you try to find congestion in the network and route around it, and if you do that, you have to make sure your network doesn't deadlock or hit other bad conditions. To some extent these networks trade off ease of analysis for performance.
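To make that randomized-but-oblivious scheme concrete, here is a minimal Python sketch of what is essentially Valiant's algorithm: route X-then-Y to a uniformly random intermediate node, then X-then-Y to the real destination. The mesh size, coordinates, and function names are my own illustration, not something from the lecture.

```python
import random

def xy_leg(src, dst):
    """One X-then-Y leg on a 2D mesh; returns the list of nodes visited."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:                  # correct the X coordinate first...
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:                  # ...then correct Y
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def randomized_oblivious_route(src, dst, width=4, height=4):
    """Detour through a random intermediate node. Still oblivious, since
    the choice never consults network state, but it scatters traffic
    across links instead of piling it onto one deterministic path."""
    mid = (random.randrange(width), random.randrange(height))
    return xy_leg(src, mid) + xy_leg(mid, dst)[1:]  # drop duplicated midpoint

print(randomized_oblivious_route((0, 0), (3, 3)))
```

Two runs will usually print two different paths; that per-packet randomness is exactly what spreads the load.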
I wanted to talk about one quick deterministic oblivious routing algorithm on a mesh, so let's draw a mesh here. Okay. We have our sixteen-node 4-ary 2-cube, and we want to route, let's say, from A to B.

The most basic, or one of the more basic, routing algorithms we're going to look at here is called dimension-ordered routing. It's called that because there are two dimensions here; we'll label this direction as X increasing and this one as Y increasing. Dimension-ordered routing says that we order the dimensions in which we route: we always finish routing in one dimension before we route in the other. So if we want to get from A to B, and X-then-Y is our routing order, we route this way first, until we've matched our X location to B's, and then we route down. That's X then Y. This is oblivious: it doesn't care about the traffic. And it has some bad problems that show up. If we have, say, C and D, and we're trying to route from C to D, that traffic needs to share this link with the traffic going from A to B, and it can't route around it. It can't decide, "I won't use those links, I'll go somewhere else." That's just not allowed in an oblivious routing algorithm, and definitely not in dimension-ordered routing. Now, there are networks that route X and then Y, and you can also route Y and then X. This also generalizes to three-dimensional and n-dimensional spaces: X, then Y, then Z, or W, then X, then Y, then Z in a four-dimensional cube.

One of the good properties of dimension-ordered routing is that it's not going to deadlock, and we'll talk about deadlock in a few slides. The network itself is never going to deadlock. That doesn't mean you can't get message-dependent deadlock, but if you use wormhole routing through this network, you can come up with a proof that the network itself will never deadlock. The rough sketch of the proof goes like this. By only using X links first and then Y links, you only ever acquire an X link, then acquire a Y link, then release the X link, then release the Y link. You never grab a resource out of order with respect to one you already hold, and since everyone follows the same algorithm, no one else grabs resources in a conflicting order either. Going back to our dining philosophers problem: you're guaranteed to acquire resource one and then acquire resource two, and everyone follows that same order; they acquire resource one, or block waiting for it, and only then acquire resource two. You'll never see a cycle. Because X links are always acquired before Y links under this dimension ordering, you never get a cycle in your dependency graph. That's the rough sketch; the full proof is a little more complicated. For these two-dimensional mesh networks without end-around connections, that is, without a torus, you have to prove it in each direction, and it decomposes into X from west to east, X from east to west, and likewise for Y. But that's the rough sketch of the proof. I just want to get across that this is a basic routing algorithm, but there are lots of other choices here.
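Here is a small Python sketch of dimension-ordered X-then-Y routing, plus a check that two flows can be forced onto the same link. The node coordinates for A, B, C, and D are hypothetical stand-ins for the ones on the board; the routing itself follows the algorithm as described.

```python
def dor_xy(src, dst):
    """Dimension-ordered (X then Y) routing on a 2D mesh: fully match
    the X coordinate, then the Y coordinate. Deterministic and
    oblivious; the path never depends on traffic."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def links(path):
    """The set of directed links a path occupies."""
    return set(zip(path, path[1:]))

# Hypothetical placements: two flows whose X-then-Y paths collide.
a_to_b = dor_xy((0, 3), (3, 0))
c_to_d = dor_xy((1, 3), (3, 1))
print(links(a_to_b) & links(c_to_d))  # non-empty: shared links, contention
```

Because the algorithm is oblivious, nothing in `dor_xy` can notice that overlap and steer around it; both flows simply queue up on the shared links.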
As I sort of said, there's that non-deterministic option, and there are also adaptive routing algorithms. A good adaptive one, sometimes called hot potato routing, works like this: there's no buffering in the network, so you can't buffer along the way. If you get to a node where the link you want is congested, you just choose some other direction at random and route that way. The thinking is that at some point you'll get to choose again and route toward the right direction, and slowly your packet will make its way to the endpoint. It lets you go around traffic congestion points. But I don't want to go into too much detail on this, because it's a whole lecture in itself.

Okay. So, let's talk about flow control. This is an aspect of networking that I really enjoy, actually. Flow control is making sure that you don't drop data: making sure you have back pressure in your network so that you don't lose data. This flow control can either be local, on a link-by-link or hop-by-hop basis, or it can be round trip. A good example of round-trip flow control is something like TCP/IP: there's a protocol in there that rate-limits the sender over the round trip, so you don't flood the network if a given link is too small. Another good example, on an on-chip network, something like a many-core, is when you're trying to guarantee that you don't overrun a buffer in the memory controller while lots of different nodes are trying to route to it. You can have a counter, which is effectively round-trip flow control to that far-away endpoint node, that says: okay, you're allowed to have five outstanding memory transactions to the memory controller, and if you try to send a sixth, you need to wait for an acknowledgment to come back. That's end-to-end, or long-distance, flow control, but each of the hops along the network may have localized flow control too, so you can layer these two things together. Typically, for on-chip networks, and even in computer-system networks, we'd like the networks to be resilient: we don't want data to be dropped. But that's not necessarily guaranteed. People have built networks where links can drop data. So that's another option: you drop the data and retransmit it if you want reliable transmission from one point to another.
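That memory-controller example amounts to a simple end-to-end window. Here's a minimal sketch of the idea; the class and method names are my own, and the limit of five just matches the lecture's example.

```python
class OutstandingLimiter:
    """End-to-end flow control by counting outstanding transactions:
    a node may have at most `limit` requests in flight to a far-away
    endpoint (e.g. a memory controller). The next request must wait
    until an acknowledgment for an earlier one comes back."""

    def __init__(self, limit=5):
        self.limit = limit
        self.in_flight = 0

    def try_send(self):
        """Return True if we may issue a request now, else False (wait)."""
        if self.in_flight < self.limit:
            self.in_flight += 1
            return True
        return False

    def ack_received(self):
        """An acknowledgment came back; a slot at the endpoint freed up."""
        assert self.in_flight > 0
        self.in_flight -= 1

mc = OutstandingLimiter(limit=5)
print([mc.try_send() for _ in range(6)])  # [True x 5, False]: sixth waits
mc.ack_received()
print(mc.try_send())                      # True: it can go once an ack returns
```

The per-hop links underneath can still run their own local flow control; this counter just bounds what the endpoint's buffer ever has to absorb.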
Okay. So, let's look at a quick example of flow control in our networks. This is not in your handouts, so pay attention. The first type of flow control we're going to look at is basically a stall wire, a big stop sign. Here's what we have: a sender with a queue, Q sub sender, and a receiver with a queue, Q sub receiver, and we've named all the points along the way. In this network there's a register between the sender and the receiver, in the link. That's our link cost: it takes a cycle, and we pipeline the wire. So now data arrives at D, the receive point into the receiver. At some point the receiver decides, I can't take any more data; I need to go process some of it, so I need to stall the network. It says stop, and in a perfect world, that stop goes through some combinational logic, just like a stall in your pipeline: this point in the pipe stalls, this flip-flop stalls, the queue stalls, and going back the other way it stalls A and tells the sender not to send any more data. So it's a stall: we stop at D, and C, B, and A stop instantaneously. At some point, D unstalls because the receiver goes back to reading data, and the network keeps flowing again. And, sorry, what this diagram is showing is complete packets, or complete flits, getting sent; a flit is our flow control unit. Each of these A's, B's, C's, and D's is a uniquely flow-controllable unit. So that's on/off flow control with a combinational stall signal.

Now, the problem with this is that it took us a cycle to send data across this wire, because we had a flip-flop in the forward path, but our stall wire runs combinationally all the way back. That's not good from a cycle-time perspective. Look at the combinational path now: an output here says data is available, that goes through some logic in D, which generates a stall signal, which then has to propagate physically a long distance and feed into lots of other logic. Having designed a couple of these sorts of networks, I can tell you you're never going to be able to build that. If you want a high-performance network, that path is too long; if your cycle time is anywhere near the delay of that length of wire, you can't build it.

So we switch to something that tries to solve that: we put a pipeline flop on the return path of the stall signal. What happens? Well, look at packets zero, one, two, three. At some point D decides to stall, but that is not able to stop, let's say, the sender from sending the cycle before, because there's now a flip-flop in the stall path; the sender doesn't know about the stall yet. Likewise, it can't stop B and A from moving forward one cycle either. If we look at the pipeline diagram, these arrows mean we can't stall on a dime: we have to wait, and we're not able to stall until the next packet, and not able to stall B and A until later still. This can be even worse if you have multiple pipeline registers on the return path, which is pretty common. The implication is that you were not able to stall C at all. So, like it or not, that data is coming into D and Q sub R. Huh. What do you do at this point? This is a tough one. You could drop the data, but that's probably not very advantageous: you're just going to drop random data in your network, and networks like to actually deliver data. A better thing to do is to have extra buffering in Q sub R, and we'll call this skid buffering. It's called that because it's like a car: when you hit the brake pedal, you need some runway to skid into before you actually stop. So we budget that into this queue, and you build a flow-control system where there's extra space in Q sub R to skid into.

And then when you unstall, you'll see that, actually, that should be a D there, sorry, it takes a cycle to unstall as well. That's not great. Conveniently, you have something to feed the receiver at that point, because of that extra value sitting in Q sub R. But there are cases you can construct where you stall, then unstall, then stall, then unstall, and accidentally introduce bubbles with this type of flow control. I don't have a diagram of that here, but it would show up as a bubble through the entire pipe with no data being read. You can construct that with this on/off, pipelined flow control. You could also think about on/off flow control with only a partial pipeline stall. What I mean by that is, instead of feeding into this register here, we now need two skid-buffer entries over there, because we can't stop B flowing into here or C flowing into there. We can't stop any of this stuff, so we need two entries in that skid buffer.
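Here's a toy cycle-level model of that sizing question, under my own assumptions about register placement (exact counts can shift by a cycle depending on where the flops sit; the trend is the point): the deeper the stall signal's return path, the more flits land after the receiver says stop, and that is exactly the skid buffering you must budget.

```python
from collections import deque

def skid_slots_needed(link_regs, stall_regs, cycles=30):
    """Count flits that arrive at the receiver AFTER it asserts stall.
    link_regs  = pipeline registers on the data path (sender -> receiver)
    stall_regs = pipeline registers on the stall path (receiver -> sender)
    """
    data_pipe = deque([False] * link_regs)    # flits in flight forward
    stall_pipe = deque([False] * stall_regs)  # stall signal in flight back
    stalled, skid = False, 0
    for cycle in range(cycles):
        if data_pipe.popleft() and stalled:
            skid += 1                         # this flit landed after "stop"
        sender_sees_stall = stall_pipe.popleft()
        data_pipe.append(not sender_sees_stall)  # launch unless told to stop
        stalled = stalled or cycle == 10         # receiver stalls at cycle 10
        stall_pipe.append(stalled)
    return skid

print(skid_slots_needed(link_regs=1, stall_regs=1))  # -> 1 skid entry
print(skid_slots_needed(link_regs=1, stall_regs=2))  # -> 2 skid entries
```

Budget fewer skid entries than this round trip implies and flits arrive with nowhere to go, which is exactly the data-loss case described next.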
We're running out of time, but I wanted to finish up this flow-control notion. A good solution to all of this is called credit-based flow control. In credit-based flow control, you put a counter at the sending side, and the counter tracks how many entries there are in the receive FIFO. The sender starts sending data, and whenever data is removed from that FIFO, a credit gets sent back; the return path can be pipelined multiple levels deep. When a credit comes back, you increment the counter; when you send a word, you decrement the counter; and if the counter ever reaches zero, you stop sending. This actually ends up being superior to the on/off design, because you don't get those funny stutters on the receive side when you build a credit-based flow-control system; showing that proof is a little beyond this course. But I did want to say that when you design this, you need to compute how large the credit counter needs to be and how big the receive FIFO needs to be, and they need to be sized appropriately. If you size one of them wrong in the on/off flow control, you're actually going to lose data: if you skid into your skid buffer and you need two skid entry slots but only have one, you lose data. In the credit-based flow control system, the counter just reaches zero, so it only impacts your bandwidth; you're not going to lose data. Now, having said that, if you want high performance through this network, you should size the FIFO and the counter to match, and size them large enough to cover the entire round-trip latency of the communication; I'll leave a small sketch of such a credit counter below. Let's stop here for today. I'll briefly finish up deadlock next time, and then we'll move on to building large many-core systems, or large multi-core systems, and coherence protocols for them, next lecture.
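Here is the promised sketch of a credit-based link, as a minimal Python model (my own illustration; the names and specific depths are assumptions, and forward link latency isn't modeled):

```python
from collections import deque

class CreditLink:
    """Credit-based flow control: the sender's counter starts at the
    receive-FIFO depth. Sending a flit decrements it; a credit, sent
    back when the receiver drains a flit, increments it. At zero the
    sender waits, so the FIFO can never overflow: undersizing costs
    bandwidth, never data."""

    def __init__(self, fifo_depth=4, credit_delay=2):
        self.credits = fifo_depth                     # one credit per slot
        self.fifo_depth = fifo_depth
        self.rx_fifo = deque()
        self.credit_pipe = deque([0] * credit_delay)  # credits in flight back

    def cycle(self, sender_has_data, receiver_drains):
        self.credits += self.credit_pipe.popleft()    # returning credits land
        freed = 0
        if receiver_drains and self.rx_fifo:
            self.rx_fifo.popleft()
            freed = 1                                 # draining returns a credit
        self.credit_pipe.append(freed)
        if sender_has_data and self.credits > 0:
            self.credits -= 1                         # reserve a slot, then send
            self.rx_fifo.append("flit")
        assert len(self.rx_fifo) <= self.fifo_depth   # invariant: no overflow

link = CreditLink(fifo_depth=4, credit_delay=2)
for t in range(16):
    link.cycle(sender_has_data=True, receiver_drains=True)
```

With `fifo_depth` smaller than the credit round trip, the counter bottoms out and the sender idles some cycles; sized to cover the round trip, the link streams at full rate, which is the sizing rule from the lecture.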