1 00:00:03,068 --> 00:00:04,090 Okay. 2 00:00:04,090 --> 00:00:10,043 We're almost to the end here of control hazards. 3 00:00:10,043 --> 00:00:16,057 Let's talk about why an instruction may not be dispatched every cycle. 4 00:00:17,053 --> 00:00:24,081 Well, let's think about forwarding and 5 00:00:24,081 --> 00:00:30,080 full bypassing. This is sometimes really expensive to add. 6 00:00:30,098 --> 00:00:35,013 If you are going to bypass every location in your pipe, that may be expensive. 7 00:00:35,013 --> 00:00:39,030 So, we may still want to stall in certain cases. 8 00:00:39,084 --> 00:00:45,783 A good example of this is, if you go look at a modern-day machine, something like your Core 9 00:00:45,783 --> 00:00:49,541 i7, they actually don't bypass between all the different functional 10 00:00:49,541 --> 00:00:53,387 units, from all the different locations, because they can execute about six 11 00:00:53,387 --> 00:00:58,027 instructions per cycle. And they have many stages in the 12 00:00:58,027 --> 00:01:02,017 depth of their pipe, so they'd basically have to be bypassing from a hundred 13 00:01:02,017 --> 00:01:05,020 different places for every source operand. 14 00:01:05,020 --> 00:01:11,001 So, what you typically will do is figure out which common bypasses, 15 00:01:11,001 --> 00:01:16,081 or common forwarding paths, are needed, and you'll build 16 00:01:16,081 --> 00:01:19,035 those. And then some of the infrequently used 17 00:01:19,035 --> 00:01:25,008 ones, you just won't build. This will help with your cycle time, but 18 00:01:25,008 --> 00:01:33,718 hurt your CPI. Loads typically have a 19 00:01:33,718 --> 00:01:38,172 2-cycle latency. So we talked about this when we were 20 00:01:38,172 --> 00:01:42,808 talking about load-to-use, and the instruction after the load. 
21 00:01:42,808 --> 00:01:48,113 Cannot necessarily use the result, definitely cannot use the result, because 22 00:01:48,113 --> 00:01:52,994 in our five-stage MIPS pipeline the result is not computed until the memory 23 00:01:52,994 --> 00:01:58,270 stage, so if you are in the execute stage you would not have been able to get it, even 24 00:01:58,270 --> 00:02:05,012 if you had bypassing out of the end of the load pipe, or 25 00:02:05,012 --> 00:02:10,750 out of the end of the memory stage. And one interesting thing is that the MIPS 26 00:02:10,750 --> 00:02:17,086 I architecture actually defines load delay slots, very 27 00:02:17,086 --> 00:02:24,195 similar to what we had discussed with branch 28 00:02:24,195 --> 00:02:29,361 delay slots. So MIPS I had load delay slots, which were 29 00:02:29,361 --> 00:02:35,791 software-visible slots that you had to fill, and that could solve, basically, this 30 00:02:35,791 --> 00:02:39,323 pipelining hazard. And the compiler would have to schedule 31 00:02:39,323 --> 00:02:43,017 some non-dependent instruction, an instruction which was not 32 00:02:43,017 --> 00:02:45,976 dependent on the load, into that spot. 33 00:02:45,976 --> 00:02:52,705 This was ultimately removed from the ISA and stalling was put back in, because as you 34 00:02:52,705 --> 00:02:56,875 went to different pipeline lengths and different micro-architectures this 35 00:02:56,875 --> 00:02:59,648 started to be onerous. And this is really one of the big problems 36 00:02:59,648 --> 00:03:02,979 with both load delay slots and branch delay slots: they are not very 37 00:03:02,979 --> 00:03:07,978 micro-architecture independent. 
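As a rough illustration of the load-use hazard and of what a filled load delay slot buys you, here is a minimal Python sketch. It assumes a five-stage pipeline with full bypassing and a one-cycle load-use penalty; the `Instr` tuple, the register names, and the helper functions are hypothetical, made up for this example.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this sketch."""
    op: str                  # mnemonic, e.g. "lw", "add", "sub"
    dest: Optional[str]      # destination register, if any
    srcs: Tuple[str, ...]    # source registers


def load_use_stalls(program: Sequence[Instr]) -> int:
    """Count one bubble for each instruction that reads the destination
    of the load immediately before it (assumed one-cycle penalty)."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        if prev.op == "lw" and prev.dest in cur.srcs:
            stalls += 1
    return stalls


def total_cycles(program: Sequence[Instr], depth: int = 5) -> int:
    """Cycles to drain the pipe: fill time, one cycle per further
    instruction, plus any load-use bubbles."""
    return depth + len(program) - 1 + load_use_stalls(program)


# The add uses r1 right after the load: the hardware has to stall.
naive = [
    Instr("lw", "r1", ("r2",)),
    Instr("add", "r3", ("r1", "r4")),
    Instr("sub", "r5", ("r6", "r7")),
]

# The independent sub is scheduled into the slot after the load instead.
scheduled = [
    Instr("lw", "r1", ("r2",)),
    Instr("sub", "r5", ("r6", "r7")),
    Instr("add", "r3", ("r1", "r4")),
]

print(load_use_stalls(naive), total_cycles(naive))
print(load_use_stalls(scheduled), total_cycles(scheduled))
```

Under these assumptions the reordered sequence does the same work with no bubble. That reordering is the job a load delay slot hands to the compiler, and the job an interlock makes optional.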
So as you change to different 38 00:03:07,978 --> 00:03:13,458 micro-architectures, if you had, let's say, a pipeline length of five and it went to 39 00:03:13,458 --> 00:03:18,972 four, all of a sudden maybe you didn't need that branch delay slot, or 40 00:03:18,972 --> 00:03:21,294 something like that. And 41 00:03:21,294 --> 00:03:27,822 what I wanted to point out here is that this idea really is encapsulated in 42 00:03:27,822 --> 00:03:32,480 the name MIPS. It stands for Microprocessor without 43 00:03:32,480 --> 00:03:38,340 Interlocked Pipeline Stages. So, they really did not want to have 44 00:03:38,340 --> 00:03:46,900 interlocking here on something like the load-to-use, and later, in MIPS II, 45 00:03:46,900 --> 00:03:52,830 the load delay slot was removed and pipeline interlocks were reintroduced. 46 00:03:52,830 --> 00:03:58,967 So, you know, we can all find mistakes that we have made and 47 00:03:58,967 --> 00:04:05,719 change them, but in the original MIPS I ISA they had load delay slots. 48 00:04:05,719 --> 00:04:14,293 Another good reason why CPI might be greater than one is we have conditional 49 00:04:14,293 --> 00:04:20,075 branches, which can cause bubbles. So this was all the control hazards we've 50 00:04:20,075 --> 00:04:25,543 been talking about up to this point, and you may have to kill the instructions if 51 00:04:25,543 --> 00:04:31,022 you don't have some sort of delay slots. Now, what I wanted to point out here when we 52 00:04:31,022 --> 00:04:36,396 talk about CPI, and this is the note at the bottom of the slide, is that you 53 00:04:36,396 --> 00:04:47,048 really want to think about CPI from the perspective of a useful CPI, instead of 54 00:04:48,030 --> 00:04:52,078 how many instructions are executing. 
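To make the difference between raw CPI and useful CPI concrete, here is a small Python sketch. The cycle and instruction counts are invented for illustration and the function names are hypothetical; the only assumption is that no-ops inserted to fill slots do no useful work.

```python
def raw_cpi(cycles: int, instructions: int) -> float:
    """CPI as the machine would count it: every instruction, no-ops included."""
    return cycles / instructions


def useful_cpi(cycles: int, instructions: int, nops: int) -> float:
    """CPI over useful instructions only: filler no-ops are left out of the count."""
    return cycles / (instructions - nops)


# Invented numbers: 100 instructions go down the pipe in 120 cycles,
# and 20 of them are no-ops the compiler used to fill delay slots.
cycles, instructions, nops = 120, 100, 20

print(raw_cpi(cycles, instructions))           # looks better than it is
print(useful_cpi(cycles, instructions, nops))  # what the program actually pays
```

With these made-up numbers the raw figure is 1.2 while the useful figure is 1.5, so counting the no-ops makes the machine look faster than it is for the real work.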
So if you are adding no-ops to your 55 00:04:52,078 --> 00:04:57,045 program, and the no-ops are not doing anything useful, that 56 00:04:57,045 --> 00:05:01,082 should not go into your useful CPI calculation. 57 00:05:02,084 --> 00:05:07,087 Your machine might count those as valid instructions going down the pipe, because 58 00:05:07,087 --> 00:05:09,092 they were software-visible instructions, but that's not a good 59 00:05:09,092 --> 00:05:13,016 solution. When you are computing CPI, you 60 00:05:13,016 --> 00:05:18,002 should always be thinking about useful CPI, the CPI that actually counts work towards the 61 00:05:18,002 --> 00:05:26,230 end goal of the program. A couple of other control hazards that we need 62 00:05:26,230 --> 00:05:33,482 to talk about in this course are other things that can change the control flow 63 00:05:33,482 --> 00:05:39,072 of your program. And those largely fall into two 64 00:05:39,072 --> 00:05:43,036 different cases here: exceptions and interrupts. 65 00:05:43,036 --> 00:05:48,016 And they're related. Let's talk about what an exception is. 66 00:05:48,016 --> 00:05:52,838 So an exception is something where you have an instruction, and the instruction 67 00:05:52,838 --> 00:05:59,719 does some operation which is invalid or against the intended use of 68 00:05:59,719 --> 00:06:05,051 the machine. So, a couple of good 69 00:06:05,051 --> 00:06:11,048 examples. One is divide by zero: you take some value and divide it by zero. 70 00:06:11,048 --> 00:06:15,044 Well, on most computer architectures this is ill-defined or undefined. 71 00:06:15,044 --> 00:06:20,061 So you'll actually get an exception, which is a divide-by-zero error. 72 00:06:20,061 --> 00:06:23,025 And you can go try this out. You can go log in to your computers and 73 00:06:23,025 --> 00:06:26,075 go run a little C program. 
Take some number, divide it by zero, and 74 00:06:26,075 --> 00:06:30,060 you're going to get a divide-by-zero error if you're running on Linux, 75 00:06:30,060 --> 00:06:33,076 and you'll get something similar if you're running on Windows. 76 00:06:33,076 --> 00:06:40,002 Another good example of an exception is something like a memory fault: 77 00:06:40,002 --> 00:06:41,072 you're trying to access memory you're not allowed to access. 78 00:06:41,072 --> 00:06:45,065 There are also underflow and overflow exceptions in certain architectures, 79 00:06:45,065 --> 00:06:48,066 if, say, number precision goes out of whack. 80 00:06:48,066 --> 00:06:53,038 If a floating-point number becomes too large or too small, and the 81 00:06:53,038 --> 00:06:58,048 floating-point arithmetic can't handle the precision, you'll sometimes get overflow 82 00:06:58,048 --> 00:07:02,017 and underflow exceptions. And then interrupts are external things 83 00:07:02,017 --> 00:07:04,082 happening. So, something like a timer 84 00:07:04,082 --> 00:07:08,098 tick going off, or an I/O device trying to wake up your processor, or do something to 85 00:07:08,098 --> 00:07:11,000 your processor. And why these are important, why these are 86 00:07:11,000 --> 00:07:15,059 control hazards, is that these are unexpected things coming into the 87 00:07:15,059 --> 00:07:17,080 instruction stream, and they're going to change 88 00:07:17,080 --> 00:07:22,063 the subsequent instructions that are executing. So it really is a control 89 00:07:22,063 --> 00:07:25,083 hazard: it's changing the program control flow. 90 00:07:26,043 --> 00:07:30,059 And we're going to be talking a lot more about exceptions and interrupts later in 91 00:07:30,059 --> 00:07:34,086 this course, but I just wanted to get this idea across in this review, that 92 00:07:34,086 --> 00:07:41,031 exceptions and interrupts are different types of control flow hazards.
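To see why an exception behaves like a control hazard, here is a minimal Python sketch of a toy machine. Everything in it is an assumption for illustration: the three-instruction "ISA" (`li`, `add`, `div`), the `epc` register name, and the separate handler stream are made up, and the handler simply ends the run rather than resuming the program.

```python
def run(main, handler, regs):
    """Run a toy program. A divide by zero does not complete; instead
    fetch is redirected to the handler, so the instructions that follow
    the divide in `main` never execute.

    Returns a trace of (stream name, pc) for each completed instruction.
    This sketch assumes the handler itself never divides by zero.
    """
    trace = []
    stream, code, pc = "main", main, 0
    while pc < len(code):
        op, dest, *args = code[pc]
        if op == "div" and regs[args[1]] == 0:
            regs["epc"] = pc                       # remember the faulting pc
            stream, code, pc = "handler", handler, 0
            continue
        if op == "li":
            regs[dest] = args[0]
        elif op == "add":
            regs[dest] = regs[args[0]] + regs[args[1]]
        elif op == "div":
            regs[dest] = regs[args[0]] // regs[args[1]]
        else:
            raise ValueError(f"unknown op: {op}")
        trace.append((stream, pc))
        pc += 1
    return trace


main = [
    ("li", "r1", 8),
    ("li", "r2", 0),
    ("div", "r3", "r1", "r2"),   # raises the exception
    ("add", "r4", "r1", "r1"),   # never reached in this run
]
handler = [("li", "r3", -1)]     # hypothetical handler: flag the error

regs = {}
trace = run(main, handler, regs)
print(trace)
print(regs)
```

In the trace, the instruction stream jumps from `main` to `handler` at the divide, and the `add` after it never runs. That unexpected change in the subsequent instructions is exactly the control-flow change described above.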