Now we actually start to talk about how to build superscalar processors: things that exploit ILP, instruction-level parallelism, things that run at really high clock frequency, or have multiple cores and advanced techniques. So, before we start talking about superscalar, there's a piece of nomenclature we need to introduce that goes hand in hand with the data hazard talk that we had before. I didn't introduce it there, so I need to introduce it now before we go into actual superscalars. So, let's consider some example instructions here, where you take register i and register j, perform an operation on them, and put the result into register k. And we're going to look at different types of dependencies, and we're going to name them. So, the basic dependency here is a read-after-write hazard, or read-after-write dependency. In this example here, if time goes down, this operation here is going to store into register three, and then this instruction here is going to read from r3. So you need to temporally make sure this happens before that, because you need the value here. Okay. We talked about this in our data hazards discussion; this is the most classic data hazard, a read-after-write hazard. Let's look at something a little bit less intuitive. Let's look at a hazard where this instruction here reads register one, and the next instruction writes register one. Okay, that should be no problem. That sounds great. Well, today we're going to give an example of a pipeline where this is a problem, or could potentially be a problem. So, you know, if you do everything in order and your instructions are slowly flowing down the pipe and you're only executing one instruction at a time, you're going to execute this, and then that, and you're going to read register one here. And then a whole lot of time later you're going to write register one, so nothing, nothing bad happens.
But if you start to execute instructions out of order, or if you start to execute multiple instructions at the same time, you're going to run into some problems. We're going to name this a write-after-read hazard. What that means is, we have a write that, temporally in the program order, is happening after a read of that same register. This is usually called an antidependence. And we need to maintain these when we go to execute our programs out of order and sort of throw everything into a big bucket and try to pull out instructions. Okay, output dependencies. This is actually something that you could possibly even think of happening on a simple, in-order processor core, if you write back to the register file from different stages. So, let's take an example: if you have a multiplier, like in your lab, which is going to write at the end of the pipe and has a very, very high latency, and then you have an instruction like an add which, let's say, tries to write to the register file early, you might actually write this instruction's result to register three before that instruction's result. If, let's say, this instruction here is a long-latency operation, so it could be something like a multiply, and this is something like an add. So, we're going to call that a write-after-write dependency. And we need to maintain the order here, that this gets written first and then this gets written, because if you flip those two results, or interchange those two results, the next thing that goes to read r3 can get the wrong value. So that's pretty important. That's called an output dependence. Okay. So, last question: is there such a thing as a read-after-read dependence, or a read-after-read hazard? No; two reads of the same register can happen in either order, so reads never conflict with each other. So, superscalar processors. So far we've been limited to processors that can only get a clocks per instruction greater than or equal to one.
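All three dependence types named so far can be classified with the same trick of comparing read and write sets. Again a sketch with a made-up `(dest, sources)` encoding, not a real ISA:

```python
# Classify the dependences from an earlier instruction to a later one,
# using a made-up (destination, source registers) encoding.

def classify(earlier, later):
    dest_e, srcs_e = earlier
    dest_l, srcs_l = later
    deps = []
    if dest_e in srcs_l:
        deps.append("RAW")  # true dependence: later reads what earlier wrote
    if dest_l in srcs_e:
        deps.append("WAR")  # antidependence: later overwrites what earlier read
    if dest_l == dest_e:
        deps.append("WAW")  # output dependence: both write the same register
    return deps

# The WAW example from the lecture: a long-latency multiply and a quick add
# both write r3, so their writes to the register file must not be reordered.
mul = ("r3", ("r1", "r2"))  # r3 <- r1 * r2  (long latency)
add = ("r3", ("r4", "r5"))  # r3 <- r4 + r5
print(classify(mul, add))   # ['WAW']
```

Note there is deliberately no "RAR" case in the classifier: two reads of the same register do not constrain each other's order, which is the answer to the question above.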
Superscalar processors will allow you to execute multiple instructions at the same time, and will move us into a new class here where clocks per instruction is potentially below one. It's at least fundamentally possible. Now, there might be other things that cause our clocks per instruction to still be above one, but we can get higher performance by executing multiple instructions in parallel. I want to introduce some nomenclature here: the reciprocal of clocks per instruction, which is instructions per clock. Sometimes people say IPC; it's the reciprocal of CPI, so CPI, clocks per instruction, equals one over instructions per clock. So, just be aware that sometimes we'll be using those terms interchangeably in this class. Okay. So, what types of superscalar processors can we talk about? There's lots of different types. There's in-order machines and out-of-order machines, roughly. And what in-order machine means is the processor is still trying to execute instructions in program order. Well, you don't have to do that. You could actually think about sort of taking apart the program and executing the instructions out of order, as long as you preserve the different data hazards. And, something like your Pentium processor. So, I'm going to pass around here a roughly Pentium [inaudible] class processor. It's actually an Intel Celeron, a Pentium [inaudible] version of the Intel Pentium Celeron, and it's an out-of-order, three-wide superscalar. So, it can execute three instructions at a time. Another example is the original Pentium; when the original Pentium came out, that was a two-wide machine. So, it could execute two instructions at one time and was in order. So, you can think about these different, different notions.
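The CPI and IPC relationship above is easy to check with numbers. This sketch uses made-up counts for a hypothetical superscalar run; only the reciprocal relationship CPI = 1/IPC comes from the lecture:

```python
# Hypothetical run: a superscalar machine retires 3M instructions in 2M cycles.
instructions = 3_000_000
cycles = 2_000_000

cpi = cycles / instructions   # clocks per instruction: 2/3, below one
ipc = instructions / cycles   # instructions per clock: 1.5

# CPI and IPC are reciprocals of each other.
print(ipc)             # 1.5
print(cpi * ipc)       # 1.0
```

A CPI below one like this is exactly what a single-issue pipeline can never achieve; it requires completing more than one instruction per cycle on average.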