1 00:00:03,680 --> 00:00:07,600 Okay, so that's high level programming models. 2 00:00:08,900 --> 00:00:14,371 Or our, our second high level programming model, which is not, shared memory, 3 00:00:14,371 --> 00:00:19,660 and the contrast with shared memory, of messaging or message passing. 4 00:00:19,660 --> 00:00:25,462 And today we're going to start talking a little bit about the gory details on how 5 00:00:25,462 --> 00:00:30,782 to build interesting interconnects. Before we get started, I wanted to 6 00:00:30,782 --> 00:00:34,124 introduce, sort of the agenda that we are going to talk about. 7 00:00:34,124 --> 00:00:38,343 We are not going to cover all of this today. We're, this is, today's lecture 8 00:00:38,343 --> 00:00:41,740 and next lecture we will be talking about interconnect design. 9 00:00:42,900 --> 00:00:47,850 So a couple of different, things that you have to worry about in interconnect 10 00:00:47,850 --> 00:00:52,652 design. First one is what's called switching. 11 00:00:52,652 --> 00:00:58,853 So when I say switching, what I'm trying to say by that is, there's different ways 12 00:00:58,853 --> 00:01:04,650 to organize the communication. And you can think about his 13 00:01:04,650 --> 00:01:09,994 probably the best example is the old telephone system versus the modern day 14 00:01:09,994 --> 00:01:13,370 Internet. So in the old telephone system, there's 15 00:01:13,370 --> 00:01:18,855 switching, the way it worked is, you try to make a phone call and the first you 16 00:01:18,855 --> 00:01:24,200 did is you picked up the, the receiver on the phone, and you sort of turn the 17 00:01:24,200 --> 00:01:29,474 crank, and there is some switch board operator in your local town who would 18 00:01:29,474 --> 00:01:34,678 hear the bell ring, and take the plug, plug the wire in and pick up and say 19 00:01:34,678 --> 00:01:41,632 hello, where, who you trying to call. And you would say I'm trying to call such 20 00:01:41,632 --> 00:01:47,665 and such in next town over. Well the switchboard operator would say, 21 00:01:47,665 --> 00:01:52,805 okay that's good, I'm going to take a wire from, the wire that runs from your 22 00:01:52,805 --> 00:01:57,133 house to this location, plug one wire in there and then we'll go 23 00:01:57,133 --> 00:02:01,800 and plug the other wire into a wire which runs to the next town over. 24 00:02:02,940 --> 00:02:07,138 So then you'd be connected to the next town over, and the switchboard operator 25 00:02:07,138 --> 00:02:11,337 in the next town over, and you'd say, oh I'd like to talk to the same person 26 00:02:11,337 --> 00:02:14,675 you're trying to talk to. And they would say, okay that's great. 27 00:02:14,675 --> 00:02:18,927 I'm going to patch a wire from, my patch board from, the out of town wire, to the 28 00:02:18,927 --> 00:02:23,018 destination location, and I'll turn a little crank to ring the, the ringer on 29 00:02:23,018 --> 00:02:26,712 the, on the destination phone. Okay, so that's one idea. 30 00:02:26,712 --> 00:02:31,877 And that is actually called circuit switch networks, because you're building 31 00:02:31,877 --> 00:02:35,321 a circuit. You're building a wire all the way from 32 00:02:35,321 --> 00:02:38,420 one location, all the way to another location. 33 00:02:40,100 --> 00:02:44,991 Alternatively we can do what we do in the network internet today. 34 00:02:44,991 --> 00:02:50,674 In the internet today there is, we don't generate, we don't have a wire directly 35 00:02:50,674 --> 00:02:56,501 from one location patched together the, you'd actually patch wires from one place 36 00:02:56,501 --> 00:03:00,602 to another place. in contrast, in our switching systems 37 00:03:00,602 --> 00:03:05,434 you'll actually packetize the data. And then we'll hand the packets from 38 00:03:05,434 --> 00:03:09,251 different locations, and people and resources can be used for 39 00:03:09,251 --> 00:03:12,629 other things. So, it's much more akin to something like 40 00:03:12,629 --> 00:03:15,883 a mail system. So, in a mail system, you take the data 41 00:03:15,883 --> 00:03:18,072 you want to send. You fold it in half, 42 00:03:18,072 --> 00:03:21,075 you put it in an envelope. You put a stamp on it, 43 00:03:21,075 --> 00:03:25,068 and you, you set it going. The road that goes from your house to the 44 00:03:25,068 --> 00:03:29,058 post office is not reserved, for that one piece of mail, until it gets all the way 45 00:03:29,058 --> 00:03:31,866 to the destination, like in a circuit switched, to apology. 46 00:03:31,866 --> 00:03:35,709 Instead in a packet switched, you are going to some of the message, and you are 47 00:03:35,709 --> 00:03:39,551 to put a stamp on the physical envelope, and the, your postal carriers may come 48 00:03:39,551 --> 00:03:41,866 pick it up. They are going to take that, take it to 49 00:03:41,866 --> 00:03:45,708 some place else, and then there is taken to some place else, take to some place 50 00:03:45,708 --> 00:03:49,501 else, but you can set another piece of mail on that same link, or other people 51 00:03:49,501 --> 00:03:52,950 can be using the same roads to send another packets, or other messages. 52 00:03:52,950 --> 00:03:58,326 So switching is the type of network we have here, and how we connect together 53 00:03:58,326 --> 00:04:02,097 the different networks and how we do the switching. 54 00:04:02,097 --> 00:04:07,194 And there's a couple other things in the middle between circuit switched and 55 00:04:07,194 --> 00:04:12,274 packet switched networks. Okay, so next thing we are talking about 56 00:04:12,274 --> 00:04:16,348 interconnect design is topology. So this, how do we physically connect 57 00:04:16,348 --> 00:04:20,421 things together, in our world. How do we run wires between the different 58 00:04:20,421 --> 00:04:23,864 nodes in the system? Do we run wire between every person in 59 00:04:23,864 --> 00:04:27,822 the classroom, I run a wire between myself and every other person, and 60 00:04:27,822 --> 00:04:30,576 everyone runs with a wire between everyone else. 61 00:04:30,576 --> 00:04:33,617 Or do we build a nod in the middle of everything to, 62 00:04:33,617 --> 00:04:38,092 or do we really connect to our nearest neighbours, and then, we have to send it 63 00:04:38,092 --> 00:04:42,395 to them in native sens of X people. So there's a lot of different apologies 64 00:04:42,395 --> 00:04:46,565 for given size graph. [COUGH]. 65 00:04:46,565 --> 00:04:53,372 Routing, routing is figuring out what path to take through the network to get 66 00:04:53,372 --> 00:04:56,417 from one point to another. So, we can actually build a nearest 67 00:04:56,417 --> 00:05:00,390 neighbor network in this classroom, or we run wires between all of our neighbours, 68 00:05:00,390 --> 00:05:04,415 but not lets say, we all, each, each of us connects to three different people, 69 00:05:04,415 --> 00:05:08,234 but not to four, and there's going to be multiple paths from any one point to 70 00:05:08,234 --> 00:05:10,762 another point. So we need to come up with a routing 71 00:05:10,762 --> 00:05:14,426 decision to figure out how to get from that one point to another point, 72 00:05:14,426 --> 00:05:18,296 and that affects our interconnect design. And then finally flow control, and 73 00:05:18,296 --> 00:05:21,960 there's two types of flow control. There is local flow control, which is 74 00:05:21,960 --> 00:05:28,364 communication from one node, to the next the next node over, and making sure that 75 00:05:28,364 --> 00:05:34,372 you don't lose data on that local link. But its also, how do you rate limit, 76 00:05:34,372 --> 00:05:41,093 round trip throughout the entire network. and we are going to study that next time 77 00:05:41,093 --> 00:05:43,075 in more depth. Okay. 78 00:05:43,075 --> 00:05:46,822 So let's move on to some fun pictures here. 79 00:05:46,822 --> 00:05:52,583 High quality pictures. So first thing we're going to look at 80 00:05:52,583 --> 00:05:57,094 here is the anatomy of a message. . 81 00:05:57,094 --> 00:06:04,820 A message is our fundamental primitive of some piece of data we want to send. 82 00:06:04,820 --> 00:06:11,030 At the top here, we show a message which has, let's say, some number of bytes in 83 00:06:11,030 --> 00:06:14,136 it. These delimiting points here do not 84 00:06:14,136 --> 00:06:17,878 delimit a byte, they delimit some chunk of data. 85 00:06:17,878 --> 00:06:24,407 And we're actually going to fragment this message into smaller pieces which we're 86 00:06:24,407 --> 00:06:28,789 going to call a packet. Now I know that I want to make a big 87 00:06:28,789 --> 00:06:31,548 difference between a message and a packet. 88 00:06:31,548 --> 00:06:36,933 A packet is a piece of data that we're going to be sending through the network, 89 00:06:36,933 --> 00:06:42,581 and the network natively understands this packet and the packet is routable through 90 00:06:42,581 --> 00:06:45,340 the network, so it has routing information. 91 00:06:49,200 --> 00:06:53,680 So, a good example of this is, if you are trying to use MPI, 92 00:06:53,680 --> 00:06:58,697 and you want to send 1000 words, and let's say the maximum message you can 93 00:06:58,697 --> 00:07:03,777 and, receive, or maximum package you can send on a network is 100 bytes. 94 00:07:03,777 --> 00:07:08,678 You're going to packetize the data, packetize the message into 100 95 00:07:08,678 --> 00:07:13,843 individually roundable packets. So let's look inside of one of these 96 00:07:13,843 --> 00:07:16,588 packets. Typically, because the packet is 97 00:07:16,588 --> 00:07:19,126 roundable, it needs to know where its going. 98 00:07:19,126 --> 00:07:21,723 It might need to know where its coming from. 99 00:07:21,723 --> 00:07:26,031 So at the beginning of a packet, you will typically have something like a source. 100 00:07:26,031 --> 00:07:30,635 You might have a destination, or its going to be you definitely need to have a 101 00:07:30,635 --> 00:07:36,555 destination, you might have a source. You might have a length. 102 00:07:36,555 --> 00:07:40,200 If your, network allows for variable length packets. 103 00:07:41,260 --> 00:07:46,175 [COUGH], and we call this the header. And we call the rest of the packet the 104 00:07:46,175 --> 00:07:49,790 payload. [COUGH] And we're going to introduce an 105 00:07:49,790 --> 00:07:58,540 interesting term here. Flit Now this is a term actually that was 106 00:07:58,540 --> 00:08:05,305 coined by, I believe Bill Dally, who is now a Stanford professor who, he did a 107 00:08:05,305 --> 00:08:11,713 lot of work in message passing and other types of parallel computers. 108 00:08:11,713 --> 00:08:17,766 A flit is a flow control digit. It's kind of like a bit, but it's the, 109 00:08:17,766 --> 00:08:25,920 the, flow control ball unit and. The reason we bring this up, sometimes a 110 00:08:25,920 --> 00:08:28,479 flit is actually equal to the whole packet. 111 00:08:28,479 --> 00:08:33,061 Sometimes it's a smaller piece of the packet, depending on how you build your 112 00:08:33,061 --> 00:08:34,013 network. [COUGH]. 113 00:08:34,013 --> 00:08:40,752 But a flow control digit is what you were flow controlling on from one node to the 114 00:08:40,752 --> 00:08:46,207 next node. So this is what you track the flow 115 00:08:46,207 --> 00:08:51,075 control on. So, it's very possible that your, you 116 00:08:51,075 --> 00:08:56,681 don't actually track, sort of send and receiving of messages on 117 00:08:56,681 --> 00:09:02,358 the byte level, or on every single cycle, but instead you do it in bigger chunks. 118 00:09:02,358 --> 00:09:08,107 So a good example of this is, you have a network, which is one byte wide, the link 119 00:09:08,107 --> 00:09:13,640 is one byte wide, but you are, so, the minimum thing you can send is a 32 bit 120 00:09:13,640 --> 00:09:18,942 word. As the minimum piece of data that is full 121 00:09:18,942 --> 00:09:21,906 control, then you need to always send let's say, 122 00:09:21,906 --> 00:09:27,162 four bytes. The full control will be on the order of 123 00:09:27,162 --> 00:09:34,935 the flit, which is the four byte unit. But the link is narrower than that. 124 00:09:34,935 --> 00:09:39,891 So you don't actually, you can't actually stop in the middle of the message, and 125 00:09:39,891 --> 00:09:41,584 say. Oh, I only got 126 00:09:41,584 --> 00:09:45,160 or you can't stop in the middle of a flit and say stop. 127 00:09:45,160 --> 00:09:49,237 you gave me three bytes, and I can't take the fourth right now. 128 00:09:49,237 --> 00:09:50,492 No. It's not allowed. 129 00:09:50,492 --> 00:09:55,326 It's, it's flow control based. So the flow control says this is the 130 00:09:55,326 --> 00:09:59,218 minimal unit for flow control. So there's four bytes, that is what you 131 00:09:59,218 --> 00:10:02,265 are allowed to send. And you need to send in chunks of 132 00:10:02,265 --> 00:10:05,400 basically four bytes. And that's our flit size. 133 00:10:05,400 --> 00:10:12,940 No we can split inside a flit, and we can actually call this one a phit, 134 00:10:12,940 --> 00:10:18,477 or a physical transfer digit. And a phit is what I was talking about 135 00:10:18,477 --> 00:10:24,262 with, you had, if you had for instance, four bytes that you're trying to 136 00:10:24,262 --> 00:10:28,643 transmit, and your flow control on those four bytes. 137 00:10:28,643 --> 00:10:33,446 Each of those bytes is a phit, or the physical transfer that you 138 00:10:33,446 --> 00:10:35,720 transfer in one cy-, in one clock. cycle. 139 00:10:37,620 --> 00:10:40,719 [COUGH]. Many times the phit and the flit will be 140 00:10:40,719 --> 00:10:44,981 the same, if you have what, if you have wide networks, but if you have very 141 00:10:44,981 --> 00:10:48,340 narrow networks, sometimes these will not be matched. 142 00:10:49,700 --> 00:10:59,450 Okay, it's just a nomenclature so far. Our first topic we want to discuss is 143 00:10:59,450 --> 00:11:02,765 switching, how to get from point A to point B. 144 00:11:02,765 --> 00:11:04,268 Oh, okay. It's not routing, 145 00:11:04,268 --> 00:11:08,720 it's it's rather the model of how to connect different locations together, 146 00:11:11,240 --> 00:11:13,897 and we're going to talk mostly about three here. 147 00:11:13,897 --> 00:11:17,994 We already talked about circuit switched. Circuit switched is like the old 148 00:11:17,994 --> 00:11:21,316 telephone network. You pick up the phone and somehow there's 149 00:11:21,316 --> 00:11:25,690 a wire patched all the way from where you pick up the phone to the destination 150 00:11:25,690 --> 00:11:28,403 location, and you reserve that location the whole 151 00:11:28,403 --> 00:11:33,040 time. Packet switched networks, 152 00:11:33,040 --> 00:11:36,844 or what is sometimes called store and forward networks or probably more 153 00:11:36,844 --> 00:11:39,980 commonly called store and forward networks, 154 00:11:39,980 --> 00:11:43,726 are things like the Internet, where you'll actually generate a packet, 155 00:11:43,726 --> 00:11:46,278 hand that to someone else, and they'll store it. 156 00:11:46,278 --> 00:11:50,458 And at some point, when the link is free, they'll forward it onto the next hop. 157 00:11:50,458 --> 00:11:54,096 And when the next link is free, they'll store it on to the next hop, 158 00:11:54,096 --> 00:11:57,560 and it'll continue until it gets to the destination. 159 00:11:57,560 --> 00:12:02,420 160 00:12:02,420 --> 00:12:10,460 Thus in contrast to cut through networks, which are sometimes called wormhole 161 00:12:10,460 --> 00:12:14,260 networks. Cut through networks still have 162 00:12:14,260 --> 00:12:19,049 packetization, but they'll actually worm through the 163 00:12:19,049 --> 00:12:22,136 network. So instead of having to send the data 164 00:12:22,136 --> 00:12:24,753 from one location, to the next hop over, 165 00:12:24,753 --> 00:12:30,120 and it waits for the entire message to receive before it sends it on to the next 166 00:12:30,120 --> 00:12:33,073 node. A wormhole network will actually allow, 167 00:12:33,073 --> 00:12:38,575 or a, a cut-through network will actually start to send the beginning portion of a 168 00:12:38,575 --> 00:12:43,876 message from one node to the next node along the hops before the tail has been 169 00:12:43,876 --> 00:12:47,930 received. Hence it's actually worms through the 170 00:12:47,930 --> 00:12:50,780 network.. You can actually have a message sort of 171 00:12:50,780 --> 00:12:53,900 streamed through a multiple different nodes, 172 00:12:53,900 --> 00:12:57,571 and this is where we have as cut-through networks. 173 00:12:57,571 --> 00:13:03,586 So I just want to introduce those, those three ideas, and think about them as we 174 00:13:03,586 --> 00:13:08,350 go on to build networks. we'll talk about cut-through networks, 175 00:13:08,350 --> 00:13:20,340 when we get to, rounding. So before we break for today 176 00:13:20,340 --> 00:13:23,405 I want to just introduce a couple different topology. 177 00:13:23,405 --> 00:13:28,032 And just flash and come up with different topologies, where we'll pick up next 178 00:13:28,032 --> 00:13:30,404 time. But in our networks, we talked about 179 00:13:30,404 --> 00:13:33,006 buses. Multiple entities on one shared medium. 180 00:13:33,006 --> 00:13:37,634 You can think about building a segmented bus, where you actually have flip flops 181 00:13:37,634 --> 00:13:41,856 along the communication path here. And this would allow, for instance, this 182 00:13:41,856 --> 00:13:45,731 node, maybe, to communicate with that node at the same time as three 183 00:13:45,731 --> 00:13:51,140 communicating with four. sometimes this is called pipeline bus. 184 00:13:52,300 --> 00:13:58,641 You can have rings, which are like the segmented bus, but 185 00:13:58,641 --> 00:14:05,116 they connect the two ends together. You could even implement rings in a way 186 00:14:05,116 --> 00:14:09,684 that minimizes wire length. We'll talk about that next time. 187 00:14:09,684 --> 00:14:14,500 You could have crazier things like 2D meshes, 2D toruses, 188 00:14:14,500 --> 00:14:22,865 if you have cubes and hyper cubes. You can have fully connected topologies. 189 00:14:22,865 --> 00:14:28,384 You can have things called omega networks, which are multi stage networks, 190 00:14:28,384 --> 00:14:34,122 where you can basically communicate from anyplace to any other little place in 191 00:14:34,122 --> 00:14:37,100 multiple stages, in multiple clock cycles. 192 00:14:37,100 --> 00:14:41,675 You can build, trees, or, things called fat trees, where the 193 00:14:41,675 --> 00:14:46,020 lengths at the top of the tree get fatter or wider. 194 00:14:46,020 --> 00:14:51,084 And we'll talk about this next time, but the generalized case of this meshes, you 195 00:14:51,084 --> 00:14:56,148 end up with things that are called k-ary n-cubes where you can very concisely with 196 00:14:56,148 --> 00:15:00,480 two numbers, describe a network topology. Okay, let's stop here for today.