Why is branch prediction important? Let's talk a little bit about motivation. Then we're going to move on and start talking about branch prediction and the two things we need to predict when we're predicting branches. The first thing that jumps to mind is: is the branch taken or not, the outcome of the branch. But that's only half the story. Today we're also going to talk about figuring out where you actually go when you take a branch, the target. When I say branch, we're going to loosely put all forms of control flow into this. So it's not just a conditional branch; it's branches, jumps. You might even think about trying to predict something like an interrupt, because that changes the control flow of your program. But most people try not to predict their interrupts, even though it's hypothetically possible.

So let's start by talking about why branch prediction, what the big motivation is. As I said, longer and more complex pipelines require us to have relatively good accuracy in figuring out when we take a branch and when we don't. Here we have our in-order fetch, out-of-order issue, out-of-order execute, in-order commit pipeline. A couple of things to note: we added this extra issue stage, and we also added this issue queue here in the front, or instruction buffer, or issue window, depending on which book you read. Instructions pile up in this structure. And if you don't figure out whether the branch is taken until, let's say, the execute stage, then you're going to have more instructions to kill when you take a branch mispredict. So when you go to these out-of-order processors, even with a seemingly short, seemingly easy pipeline, more instructions can get queued up in some of these structures, especially if you have a queue.
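To make the two predictions concrete, here is a toy sketch, not from the lecture, of a front end that predicts both the direction (via 2-bit saturating counters) and the target (via a branch target buffer). All names and table sizes are my own illustrative choices, not any real design:

```python
# Toy sketch of the two predictions a front end makes for each branch:
# (1) direction (taken / not taken) via 2-bit saturating counters,
# (2) target address via a branch target buffer (BTB).
# Table size and names are illustrative, not from any real processor.

NUM_ENTRIES = 1024                    # entries in the counter table (assumed)

counters = [1] * NUM_ENTRIES          # 2-bit counters: 0,1 = not taken; 2,3 = taken
btb = {}                              # maps branch PC -> last seen target

def index(pc):
    return (pc >> 2) % NUM_ENTRIES    # low PC bits index the table (word-aligned)

def predict(pc):
    """Return (predicted_taken, next_pc) for the fetch stage."""
    taken = counters[index(pc)] >= 2
    if taken and pc in btb:
        return True, btb[pc]          # predicted taken: fetch from the BTB target
    return False, pc + 4              # otherwise speculate the fall-through, PC + 4

def update(pc, taken, target):
    """Train the predictor once the branch actually resolves."""
    i = index(pc)
    if taken:
        counters[i] = min(3, counters[i] + 1)
        btb[pc] = target
    else:
        counters[i] = max(0, counters[i] - 1)

# A loop branch at PC 0x100 that keeps being taken trains toward "taken":
for _ in range(3):
    update(0x100, taken=True, target=0x80)
print(predict(0x100))   # -> (True, 128): predicted taken, fetch from 0x80
```

The point of the sketch is just the split: the counters answer "taken or not?", and the BTB answers "if taken, where?". We'll see real versions of both structures shortly.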
So this effectively lengthens the front of your pipeline, and it means that if you mispredict, if you fetch the wrong instructions relatively often, you're just going to be out in the weeds: you'll be killing lots of instructions and doing extra work you didn't really want to do. Also, if you wait all the way until the end of the pipe in these out-of-order processors to resolve your branch, that makes life even worse, because it makes your mispredict penalty even longer. Most people don't actually do that. You might say, "I don't want to kill the instructions until I know the branch commits," and that was the simplistic example we had when we were talking about these out-of-order processors: we waited all the way until the end of the pipe and then cleaned things out. You can wait until the end of the pipe to fully clean things out, but you want to redirect the fetch, the PC at the front of the pipe, as quickly as possible, because you don't want to be fetching off into the weeds, just wasting cycles.

Going back to our superpipelining lecture from before, we can look at the branch mispredict penalty for some real processors, the Pentium III and the Pentium 4. In the Pentium 4, you have twenty-odd cycles of branch mispredict penalty. That can be pretty painful if you mispredict often, because branches are frequent, and the penalty is quite high if you don't have the correct subsequent instructions behind the branch. Now, we talked about some techniques. You could just stall and wait, so you don't actually predict the branch.
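To get a feel for how much a twenty-odd-cycle penalty costs, here is a back-of-the-envelope calculation. The branch frequency and mispredict rate below are assumed purely for illustration; only the penalty is the Pentium 4-class figure from above:

```python
# Back-of-the-envelope cost of branch mispredicts.
# branch_freq and mispredict_rate are assumed for illustration.

branch_freq = 0.20        # fraction of instructions that are branches (assumed)
mispredict_rate = 0.05    # predictor misses 5% of branches (assumed)
penalty = 20              # Pentium 4-class mispredict penalty, in cycles

# Extra cycles per instruction due to mispredicts:
cpi_penalty = branch_freq * mispredict_rate * penalty
print(round(cpi_penalty, 3))   # -> 0.2
```

On a machine with a base CPI of 1, that assumed 0.2 cycles per instruction is a 20% slowdown from mispredicts alone, which is why even a 95%-accurate predictor can still hurt on a deep pipe.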
But then, if you have to wait for every branch to get to, let's say, the twentieth stage of the pipe before you go and fetch the subsequent instruction, that's pretty painful. So we talked about speculating the next PC, PC plus four in a MIPS-style architecture, or any architecture where each instruction is 32 bits long. But that doesn't really help you when you think, with high probability, that the branch is going to be taken, that control flow is going to change. So you need to start thinking about how to actually deal with that in a pipeline. Up to this point we've only talked about speculating the fall-through case. We talked briefly about speculating the non-fall-through case, but we didn't say how you could possibly do that. Today we're going to talk about the hardware to do that.

Also making life worse is going wide. If we have, let's say, a dual-issue processor, then when you go to kill instructions, you're killing twice as many instructions in flight in the pipe if you mispredict the branch. Showing that from our pipeline-diagram perspective, this is just recapping an example from a previous lecture: here we have a fetch for this branch, and we're fetching two instructions per cycle. So even with a relatively short pipeline, you end up with one, two, three, four, five, six, seven dead instructions on a mispredict. What this really comes down to is that the number of instructions killed is approximately the pipeline width multiplied by the branch penalty: width times the number of cycles before you can resolve the branch. If you can shorten the time it takes to resolve the branch, that's good. Or if you can make the processor narrower, that may be good.
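The width-times-depth rule of thumb can be written out directly. This is a rough model with numbers chosen for illustration; the lecture's diagram counts seven rather than eight, plausibly because the branch itself occupies one slot of its own fetch group, but that off-by-one detail is my assumption:

```python
# Rough model of wasted work on a mispredict:
# dead instructions ~= pipeline width x cycles until the branch resolves.
# Purely illustrative numbers.

def dead_instructions(width, cycles_to_resolve):
    """Upper bound on wrong-path instructions fetched before redirect."""
    return width * cycles_to_resolve

print(dead_instructions(2, 4))    # 2-wide, resolves 4 cycles after fetch -> 8
print(dead_instructions(3, 20))   # 3-wide, Pentium 4-class depth -> 60
```

Both levers in the formula show up in real designs: resolving branches earlier in the pipe shrinks `cycles_to_resolve`, while narrowing the machine shrinks `width`, but as the next point says, we usually don't want to give up width.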
It's good in the sense that fewer instructions get killed. But we like to execute multiple instructions at a time, because that improves our performance. So this is really the motivation for thinking about putting something useful into this time, and also for trying to reduce the probability that we start fetching incorrect instructions at all.