1 00:00:03,064 --> 00:00:06,098 In-order issue. Out of order, or, excuse me. 2 00:00:06,098 --> 00:00:11,070 In-order front end, out-of-order issue, out-of-order writeback, 3 00:00:11,070 --> 00:00:16,057 and in-order commit. So the middle portion of the pipe here is 4 00:00:16,057 --> 00:00:20,091 all out of order. And then finally, we commit in order and 5 00:00:20,091 --> 00:00:26,010 we fetch in order. So this has all of those structures that 6 00:00:26,010 --> 00:00:29,029 we had before. It's sort of the union of everything. 7 00:00:29,029 --> 00:00:32,550 We have the issue queue. We have the finished store buffer, reorder 8 00:00:32,550 --> 00:00:36,768 buffer, physical register file, the scoreboard, and architectural register 9 00:00:36,768 --> 00:00:40,611 file. This requires us to have everything. 10 00:00:40,611 --> 00:00:46,824 And we can start thinking about what this does to performance. 11 00:00:46,824 --> 00:00:53,718 So I'm gonna push through here because I only have two more slides and 12 00:00:53,718 --> 00:00:56,064 the state. We get lost otherwise. 13 00:00:56,064 --> 00:00:57,032 Okay. So. 14 00:00:57,096 --> 00:01:01,077 Let's see some interesting things happening here. 15 00:01:01,077 --> 00:01:06,084 So we have out-of-order issue. So we can see this add here issuing before 16 00:01:06,084 --> 00:01:10,059 this multiply. That's pretty cool. 17 00:01:12,148 --> 00:01:23,609 Ignore this bottom for a second here. We have that same problem sort of showing 18 00:01:23,609 --> 00:01:27,068 up here. We have this write happening, even in an 19 00:01:27,068 --> 00:01:33,070 out-of-order issue processor. This should be able to get pulled 20 00:01:33,070 --> 00:01:37,998 back, as I said before. But it doesn't, because at this point 21 00:01:37,998 --> 00:01:42,382 you'd have a write hazard, you'd have a hazard on the writeback of the 22 00:01:42,382 --> 00:01:45,085 register file.
Something similar: if you try to issue 23 00:01:45,085 --> 00:01:49,005 here, you'd be issuing two instructions at the same time. 24 00:01:49,005 --> 00:01:53,010 So this starts to be a problem. So it actually ends up being pulled out. 25 00:01:53,010 --> 00:01:57,066 Interestingly enough, the performance of this is not a whole lot better than what 26 00:01:57,066 --> 00:02:01,264 we had before. If you sort of cut it here at fifteen cycles, the commit gets 27 00:02:01,264 --> 00:02:04,085 pushed out far because you have to commit in order. 28 00:02:04,085 --> 00:02:08,653 But this fixes a lot of problems that we had in the in-order fetch, out-of-order, 29 00:02:08,653 --> 00:02:12,139 out-of-order, out-of-order, because we can have precise exceptions at 30 00:02:12,139 --> 00:02:15,997 the end of the pipe. We can have out-of-order issue, out-of- 31 00:02:15,997 --> 00:02:20,722 order execute. We can still do in-order fetch because 32 00:02:20,722 --> 00:02:30,676 that's kind of the semantics of programs. So let's take a look at, let's say 33 00:02:30,676 --> 00:02:36,069 we had the ability to do double issue but not double execute. 34 00:02:37,013 --> 00:02:40,093 How does this change? We actually, if I change this to this 35 00:02:40,093 --> 00:02:44,027 diagram, and as you can see here, we actually have I 36 00:02:44,027 --> 00:02:49,044 here and I here in the same time period. So we pull that back one, and 37 00:02:49,044 --> 00:02:54,091 you would think that would actually help. But, you know, so we don't have a 38 00:02:54,091 --> 00:02:57,088 write conflict, everything is still kind of okay here. 39 00:02:57,088 --> 00:03:00,046 But the commit still happens at the same time. 40 00:03:00,046 --> 00:03:03,015 So that's not always as good as you think. 41 00:03:03,015 --> 00:03:11,856 In reality, what you want to start thinking about is having out of order and width.
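
[Editor's sketch] The point that pulling an instruction earlier does not move the commit can be illustrated with a small model of in-order commit. This is a minimal sketch, not the pipeline from the lecture's diagram: the function name `commit_cycles`, the rule that an instruction commits no earlier than the cycle after its writeback, and the example cycle numbers are all assumptions made for illustration.

```python
def commit_cycles(writeback_cycles, commit_width=1):
    """Return the commit cycle of each instruction, in program order.

    writeback_cycles[i] is the (hypothetical) cycle in which instruction i
    writes back. Assumed rules: an instruction commits no earlier than the
    cycle after its writeback, never before an older instruction, and at
    most commit_width instructions commit per cycle.
    """
    commits = []
    for wb in writeback_cycles:
        earliest = wb + 1
        if commits:
            # In-order commit: cannot commit before the previous instruction.
            earliest = max(earliest, commits[-1])
        # Respect the commit width by sliding to the next free cycle.
        while commits.count(earliest) >= commit_width:
            earliest += 1
        commits.append(earliest)
    return commits


# Instruction 0 is slow (writes back in cycle 10); the younger ones finish early.
print(commit_cycles([10, 5, 6, 12]))   # [11, 12, 13, 14]
# Pulling instruction 1 back by one cycle changes nothing at commit.
print(commit_cycles([10, 4, 6, 12]))   # [11, 12, 13, 14]
```

In this model the younger instructions wait behind the slow oldest one, so an earlier issue or writeback only helps if the commit point itself was the instruction's own writeback.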
42 00:03:11,856 --> 00:03:17,913 So here we have an out-of-order, two-wide superscalar: 43 00:03:17,913 --> 00:03:25,846 what we showed before, in-order fetch, out-of-order issue, out-of-order writeback, and 44 00:03:25,846 --> 00:03:29,637 in-order commit. And what we can see is we are actually fetching 45 00:03:29,637 --> 00:03:33,241 two instructions at a time, decoding two instructions at a time, 46 00:03:33,241 --> 00:03:37,509 issuing two instructions at a time, and this can actually help a little bit, 47 00:03:37,657 --> 00:03:40,709 but you still have problems. So here we're just going to have, 48 00:03:40,709 --> 00:03:45,167 we'll be able to issue two, but we're not going to have two ALUs; we're 49 00:03:45,167 --> 00:03:50,033 going to have the same sort of back end of the pipe, and what we start to get limited 50 00:03:50,033 --> 00:03:53,720 by is we end up with sort of execution resource bottlenecks here. 51 00:03:53,720 --> 00:03:58,069 So next time we're gonna start talking about how to add multiple ALUs, 52 00:03:58,069 --> 00:04:02,144 and you can sort of pull this earlier. And maybe you can have two multiplies or 53 00:04:02,144 --> 00:04:05,637 something like that and try to remove some of those complexities. 54 00:04:05,637 --> 00:04:09,098 But what's nice about this is, if you have double issue, 55 00:04:09,098 --> 00:04:13,878 out-of-order writeback, you know, these add instructions that are not 56 00:04:13,878 --> 00:04:16,857 dependent on these muls at all can just happen. 57 00:04:16,857 --> 00:04:21,033 And that's really nice. Okay, we're going to stop here for today.
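
[Editor's sketch] The interaction between issue width and execution resources described above can be sketched with a toy out-of-order issue model. Everything here is an assumption for illustration rather than the lecture's actual pipeline: the function name `schedule`, the unit names `"mul"` and `"alu"`, the latencies, fully pipelined units that each accept one new instruction per cycle, and oldest-first selection.

```python
def schedule(instrs, issue_width, units):
    """Return the issue cycle of each instruction under out-of-order issue.

    instrs: list of (unit_kind, latency, deps) in program order, where deps
            are indices of older instructions whose results are needed.
    units:  dict mapping unit_kind to how many instructions of that kind
            can start per cycle (assumed fully pipelined units).
    """
    for kind, _, _ in instrs:
        if units.get(kind, 0) < 1:
            raise ValueError(f"no execution unit for {kind!r}")
    issue = [None] * len(instrs)
    cycle = 0
    while any(c is None for c in issue):
        free = dict(units)          # unit slots available this cycle
        slots = issue_width         # issue slots available this cycle
        for i, (kind, _, deps) in enumerate(instrs):  # oldest first
            if slots == 0:
                break
            if issue[i] is not None:
                continue
            # Ready once every producer has issued and its latency has elapsed.
            ready = all(
                issue[d] is not None and issue[d] + instrs[d][1] <= cycle
                for d in deps
            )
            if ready and free[kind] > 0:
                issue[i] = cycle
                free[kind] -= 1
                slots -= 1
        cycle += 1
    return issue


# Two dependent multiplies followed by two dependent adds (hypothetical program).
program = [("mul", 4, []), ("mul", 4, [0]), ("alu", 1, []), ("alu", 1, [2])]
print(schedule(program, issue_width=1, units={"mul": 1, "alu": 1}))  # [0, 4, 1, 2]
print(schedule(program, issue_width=2, units={"mul": 1, "alu": 1}))  # [0, 4, 0, 1]

# Two independent multiplies: one multiplier is the bottleneck even at width 2.
pair = [("mul", 4, []), ("mul", 4, [])]
print(schedule(pair, issue_width=2, units={"mul": 1}))  # [0, 1]
print(schedule(pair, issue_width=2, units={"mul": 2}))  # [0, 0]
```

In this model the adds that do not depend on the multiplies issue right away, while the second multiply in the independent pair waits until a second multiplier is added.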