Okay. So now we're going to change topics and start talking about our first technical subject of this course. As an introduction to computer architecture, we're going to be talking about what architecture is versus microarchitecture. And I want to briefly say that, as you take this class, the first three lectures or so should be review. So if you're sitting in the class saying, oh, I've seen all this before, don't get up. Wait until the fourth or fifth lecture, and then the content will become new. This is because I want to teach everything from first principles and get everyone up to speed, but it means the first three lectures are going to go very fast. So if you're lost in the first three lectures, which should be review, that's probably a bad indicator.

We'll start off by talking about architecture versus microarchitecture, and I want to say briefly what I mean by architecture. I have, on this slide, a very large A for what I'll sometimes call big A architecture. Patterson and Hennessy call this instruction set architecture, and when I contrast it with microarchitecture, Patterson and Hennessy call that organization. Big A architecture, or instruction set architecture, is an abstraction layer provided to software which is designed to not change very much. It says how a theoretical, fundamental machine executes programs. It does not say exactly the size of different structures, how fast those things will run, or the exact implementation details; that falls into organization.

One of the things I want to emphasize is that computer architecture is all about tradeoffs. When I say it's all about tradeoffs, I mean you can make different design decisions up here in the big A architecture, or instruction set architecture, and that will influence the application and the microarchitecture; but you can also make different design decisions down below, and there are a lot of different tradeoffs in how to go about implementing a particular instruction set architecture. Largely, when you look at computer architecture and its implementation, the design space is relatively flat. There's an optimum point where you want to be, but the other points around it are often not horribly bad. There are, at the extremes, probably horribly bad design decisions, but a lot of different design points are equally good, or close to optimal. The job of a computer architect is to make the very subtle design decisions around how you move around this point so that the machine is easier to program, lives on for many years, and is low power; these somewhat aesthetic characteristics get mixed together with just making your processor go fast. And I will reiterate this over and over again in this class: because there are multiple different metrics, for instance speed, energy, and cost, and they trade off against each other, many times there is no single optimal point. It depends on whether you are more cost driven, energy driven, or speed driven. Within that space there are sometimes Pareto optimal curves, where all of the points on the curve are equally good if you're trading off these different metrics under different cost models. Okay.
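To make the Pareto idea concrete, here is a minimal sketch; the design points and their latency and energy numbers are made up purely for illustration, not taken from any real machine. It keeps exactly the designs that no other design beats on every metric at once:

```python
# Hypothetical design points: (name, latency in ns, energy in nJ); lower is better for both.
designs = [
    ("deep-pipeline",    1.0, 5.0),
    ("shallow-pipeline", 2.0, 2.0),
    ("big-cache",        1.5, 3.0),
    ("tiny-core",        4.0, 1.0),
    ("bad-idea",         3.0, 6.0),  # slower AND hungrier than "big-cache"
]

def dominates(a, b):
    """True if design a is at least as good as b on both metrics and strictly better on one."""
    return a[1] <= b[1] and a[2] <= b[2] and (a[1] < b[1] or a[2] < b[2])

pareto = [d for d in designs if not any(dominates(other, d) for other in designs)]
print([d[0] for d in pareto])
# ['deep-pipeline', 'shallow-pipeline', 'big-cache', 'tiny-core']
# Each survivor is optimal for SOME cost model; only "bad-idea" is strictly worse.
```

That's the sense in which there is no single best design: every point left on the curve wins under some weighting of speed, energy, and cost.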
So, let's talk about what an instruction set architecture is, and what a microarchitecture is. An instruction set architecture, or big A architecture, is trying to provide the programmer some abstract machine model, and many times what it really boils down to is all of the programmer-visible state. So, for instance: does the machine have memory? Does it have registers? That's the programmer-visible state. It also encompasses the fundamental operations that the computer can run; these are called instructions. It defines the instructions and how they operate. So, for instance, add. Add might be a fundamental instruction, a fundamental operation, in your instruction set architecture, and the architecture gives the exact semantics of how you take one word in a register, add it to another word in a register, and where the result ends up.

Then there are more complicated execution semantics. What do we mean by execution semantics? Well, if you just say adds take two numbers, add them together, and put the result in another register, that many times does not encompass all of the instruction set architecture. You'll have other things going on, for instance I/O and interrupts, and you have to define in your instruction set architecture, your big A architecture, the exact semantics of an interrupt, or of a piece of data coming in on an I/O port. How does that interact with the rest of the processor? So, many times instruction execution semantics is only half of it; the other half we have to worry about is the rest of the machine's execution semantics. Big A architecture has to define how the inputs and outputs work. And finally, it has to define the data types and the sizes of the fundamental data words that you operate on. For instance, do you operate on a byte at a time, four bytes at a time, two bytes at a time? How big is a byte, if you actually have bytes? That gets into sizes. Data types here might also mean that you have other kinds of fundamental data. The most basic one is just some bits sitting in a register in your processor. But it could be much more complex: you can have, for instance, something like floating point numbers, where it's not just a bunch of bits, it's bits formatted in a particular way with a very specific meaning, a floating point number that can range over, let's say, much of the real numbers.
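To make this concrete, here is a minimal sketch of "programmer-visible state plus instruction semantics" for a toy machine. This is not any real instruction set; the register count, word size, and the add semantics are assumptions chosen just for illustration:

```python
# Programmer-visible state of a toy machine: registers and a flat memory.
# The ISA promises this state and these semantics; it says nothing about
# pipelines, caches, or how wide the physical adder actually is.
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1

regs = [0] * 16           # sixteen general-purpose registers
memory = bytearray(1024)  # a small byte-addressed memory

def add(rd, rs1, rs2):
    """ADD rd, rs1, rs2 -- exact semantics: 32-bit wraparound addition."""
    regs[rd] = (regs[rs1] + regs[rs2]) & MASK

regs[1], regs[2] = 7, 0xFFFFFFFF
add(0, 1, 2)
print(hex(regs[0]))  # 0x6 -- the ISA pins down that the add wraps at 32 bits
```

Any microarchitecture that produces exactly these architectural results, by whatever internal means, is a legal implementation.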
Okay. So, in today's lecture, we're going to step through all these different characteristics and requirements of building an instruction set architecture, and I will talk about how it's different from microarchitecture, or organization. Let's take up some examples of microarchitecture and organization. What microarchitecture and organization are really about is the tradeoffs you make as you implement a fixed instruction set architecture. So, for instance, something like Intel's x86 is an instruction set architecture, and there are many different microarchitecture implementations of it. There are the AMD versions of the chips, and then there are the Intel versions of the chips. Even inside of, let's say, the Intel versions, they have a high performance version for a server or a high end laptop, which looks one way, and then there's another chip for tablets. Intel is trying to build chips for tablets these days, and they have their Atom processors. Internally, these look very different, because they have very different speed, energy, and cost tradeoffs. But they'll execute the same code; they all implement the same instruction set architecture.

So, let's look at some examples of things that you might trade off in a microarchitecture. You might have different pipeline depths, or different numbers of pipelines; you might have one processor pipeline, or you might have six, like something like the Core i7s today. Cache sizes. How big the chip is, the silicon area. What your peak power is. Execution ordering: does the code run in order, or can you execute the code out of order? That's right, it is possible to take a sequential program and actually execute later portions of the program before earlier portions of the program. That's kind of mind boggling, but it's a way to go about getting parallelism, and if you keep your ordering correct, things work out. Bus widths, ALU widths: if you have, let's say, a 64-bit machine, you can actually implement that with a bunch of 1-bit adders, for instance, and people have done things like that in the microarchitecture. This allows you to build more expensive or less expensive versions of the same processor.
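Here is a minimal sketch of that out-of-order idea; it's my own toy example, not any real machine's scheduler. An instruction issues as soon as its input registers are ready, even if an older instruction hasn't run yet, and the final results match program order:

```python
# Each toy instruction: (destination register, source registers, operation).
# A real out-of-order core does this in hardware with register renaming and
# a reorder buffer; this only shows the dependence-driven issue order.
program = [
    ("r1", (),           lambda: 10),          # i0: r1 = 10
    ("r2", ("r1",),      lambda a: a * 3),     # i1: r2 = r1 * 3  (needs i0)
    ("r3", (),           lambda: 7),           # i2: r3 = 7       (independent)
    ("r4", ("r2", "r3"), lambda a, b: a + b),  # i3: r4 = r2 + r3 (needs i1, i2)
]

regs, done, issue_order = {}, set(), []
while len(done) < len(program):
    # Scan youngest-first just to show that issue order need not be program order.
    for i in reversed(range(len(program))):
        dst, srcs, op = program[i]
        if i not in done and all(s in regs for s in srcs):
            regs[dst] = op(*(regs[s] for s in srcs))
            done.add(i)
            issue_order.append(i)

print(issue_order)  # [2, 0, 1, 3] -- instruction 2 ran before 0 and 1
print(regs["r4"])   # 37 -- the same answer strict program order would give
```

As long as the dependences are respected, the machine is free to reorder everything else, and that freedom is pure microarchitecture; the ISA never sees it.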
So, let's talk about the history of why we came up with this differentiation between architecture and microarchitecture. It came about because software pushed it on us, and it ended up being a nice abstraction layer. Back in the late '40s and early '50s, people mostly programmed either in assembly language or in machine code, so you had to write ones and zeros, or you had to write assembly. Sometime in the mid '50s we started to see libraries show up: floating point operations were made easier, we had transcendentals, the sine and cosine libraries, and there were some matrix and equation solvers. You started to see some libraries that people could call, but people were not writing large bodies of code in assembly by themselves, because it was pretty painful. And then, at some point, there was the invention of higher-level languages. A good example is Fortran, which came out in 1956, and a lot of things came along with it: assemblers, loaders, linkers, compilers, even software to track how your software was being used. Because of these higher-level languages, programming started to get some portability. It was no longer the case that you wrote your program and it only ever mapped to one processor. Still, back in the '50s and even into the '60s, machines required experienced operators who could write the programs. These machines were sold with a lot of software along with them, and you basically ran the software you were given; you had to be a master programmer, or someone who worked for the company that built the machines, to even be able to program them back in the day. The idea of instruction set architectures, this breaking apart of the microarchitecture from the architecture, didn't really exist then. And back in the early '60s, IBM had four different product lines, and they were all incompatible: you couldn't run code from one on another.

To give you an example, the IBM 701 was for scientific computing, and the 1401 was mostly for business computation. I think they even had a second one for business, but for different types of business computation. People bought into a line, and then, as the line matured and developed, they had to either rewrite their code or stick with that one line. But IBM had a crazy insight here: when they went to the next generation of processor, they didn't want to propagate these four lines; they wanted to try to unify them. One of the problems was that these different lines had very different implementations and different design points. The thing you built for scientific computing wasn't necessarily the thing you wanted to build for business computing, and the one you built for business computing, let's say, didn't need very good floating point performance. So, how did they go about solving this? Their solution was something called the IBM 360, and the IBM 360 is probably the first true instruction set architecture, implemented deliberately to be an instruction set architecture. The idea was to unify all these product lines onto one platform, but then implement different versions specialized for the different markets. They could unify a lot of their software system, unify a lot of what they built, but still build different versions.

So, let's take a look at the IBM 360 instruction set architecture, and then talk about the different microarchitectures of the IBM 360 that were built. The IBM 360 is a general purpose register machine, and we'll talk more about what that means later in this lecture. But to give you an idea, this is what the programmer saw, or what the software system saw; it isn't what was actually built in the hardware, because that would be a microarchitecture concern. The processor state had sixteen general purpose 32-bit registers. It had four floating point registers. It had control flags, if you will: condition codes and control flags. And it was a 24-bit address machine, and at the time that was huge; 2^24 was a very large number. Nowadays it's not so large, and they've since expanded the addressing on the IBM 360's successors. But they thought it would be good for many, many years, and it was good for many, many years. They also defined a bunch of different data formats: 8-bit bytes, 16-bit half words, 32-bit words, and 64-bit double words. These were the fundamental data types you could work on, and you could name these different fundamental data types. It was actually the IBM 360 that came up with the idea that a byte should be 8 bits long, and that's lived on to today, because before that we had lots of different choices. There were binary coded decimal systems, where you would encode a number between zero and nine for each digit, and this is sometimes good for spreadsheet calculations, or business calculations, or if you want to be very precise in your rounding to the penny. Binary representations sometimes don't round appropriately, and you'll lose pennies off the end.
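A quick sketch of that rounding problem, using modern Python rather than anything from the era; the point is just that binary fractions can't represent most decimal cents exactly, which is what binary coded decimal was avoiding:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 dollars exactly...
total = sum(0.10 for _ in range(1000))
print(total)  # 99.9999999999986 on a typical machine -- pennies leaked away

# ...while decimal arithmetic (the same spirit as BCD) keeps cents exact.
total = sum(Decimal("0.10") for _ in range(1000))
print(total)  # 100.00
```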
So you had these binary coded decimal systems, and in the IBM 360 they unified it all and said, no, we're going to throw certain things out and make choices. Of course, because it's the IBM 360 and they did have business applications, they still supported binary coded decimal in a certain way.

Now, let's look at the microarchitecture implementations of this first instruction set architecture, in the same time frame, the same generation. There was the Model 30 and the Model 70, and these had very, very different performance characteristics. If we look at the machines, let's start with the storage. The low end model had between 8 and 64 kilobytes, and the high end model had between 256 and 512 kilobytes. Very, very different sizes. And this is what I'm trying to get across: the microarchitecture can change quite a bit. Even though the architecture supports 64-bit additions, you can implement different size data paths. In the low end machine, they had an 8-bit data path, and for a 64-bit operation it had to do eight 8-bit operations, and probably even more than that to handle all the carries correctly. The high end implementation, by contrast, had a full adder, so it could do a 64-bit add by itself without lots of micro-sequenced operations.
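Here is a minimal sketch of that narrow-data-path trick; this is my own illustration, not IBM's actual microcode. A 64-bit add is carried out as eight 8-bit adds, propagating the carry from byte to byte, the way an 8-bit data path would micro-sequence it:

```python
def add64_bytewise(a, b):
    """Add two 64-bit values using only 8-bit operations plus a carry bit."""
    result, carry = 0, 0
    for i in range(8):                      # one 8-bit slice per step
        a8 = (a >> (8 * i)) & 0xFF
        b8 = (b >> (8 * i)) & 0xFF
        s = a8 + b8 + carry                 # what an 8-bit ALU with carry-in does
        result |= (s & 0xFF) << (8 * i)
        carry = s >> 8                      # carry-out feeds the next byte
    return result                           # same answer as one wide 64-bit add

x, y = 0x00FFFFFFFFFFFFFF, 1
assert add64_bytewise(x, y) == (x + y) & 0xFFFFFFFFFFFFFFFF
print(hex(add64_bytewise(x, y)))  # 0x100000000000000
```

The architectural answer is identical either way; the cheap machine just takes eight or more internal steps to produce it.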
And, with minor modifications, it lives on today. This was designed in the '60s, and even today we still have System 360 derivative machines; a piece of code you wrote back in 1965 will still run natively on these machines today, which is pretty amazing. So, how does it survive today? Here is the IBM 360, 47 years later, as the z11 microprocessor. The IBM 360 was renamed the IBM 370, and then the IBM 370-XA in the '80s; there was never any IBM 380, strangely enough. Later on, they changed the name to the Z series, to have cooler model numbers, so we have the IBM Z series processors, and these live on today. Going from that 8-bit data path machine, which had a one microsecond control store read, which is forever, we now have the z11, which runs at 5.2 gigahertz. It has 1.4 billion transistors. They have updated the addressing, so it's no longer 24-bit addressing, but it still supports the original 360 addressing. It has four cores, out of order issue, an out of order memory system, and big caches on chip: 24 megabytes of L3 cache. You can even put multiple of these together to build a multiprocessor system out of lots and lots of multicores. What I'm trying to get across is that if you build your instruction set architecture correctly, it can live on over time; you can have many different microarchitecture implementations and still leverage the same software.

A few more examples, just to reinforce this a little bit more. Let's look at an example where you have the same architecture but different microarchitectures. Here we have the AMD Phenom X4, and here we have the Intel Atom processor, the first Intel Atom processor. What you'll notice is that they have the exact same instruction set architecture: they both run x86 code. And these two implementations, just to point out, are from the same time frame, so these are roughly modern day processors. This one has four cores at 125 watts; here we have a single core at two watts. So there are design tradeoffs: you build different processors in the same design technology, we'll say, but with very different cost, power, and performance tradeoffs. This one can decode three instructions per cycle; this one can decode two instructions, so that's a microarchitecture difference. This one has a 64 kilobyte L1 cache; this one has a 32 kilobyte L1 instruction cache. Very different cache sizes, even though they implement the same architecture, the same big A architecture. Strangely enough, they have the same L2 size; these things happen. This one is out of order versus in order, and the clock speeds are very different.

I want to contrast this with different architecture, different big A architecture, and different microarchitecture. If we think about some different examples of instruction set architectures, there's x86, there's PowerPC, there's IBM 360, there's Alpha, there's ARM. You've probably heard all these names, and these are different instruction set architectures, so you can't run the same software across two of them. Here we have an example of two different instruction set architectures with two different microarchitectures: the Phenom X4 versus the IBM POWER7. We already talked about the X4, but the POWER7 has the Power instruction set, which is different from the x86 instruction set, so you can't run code compiled for one of them on the other, and vice versa. And the microarchitectures are different: here we have eight cores, 200 watts, and it can decode six instructions per cycle. Wow, this is a pretty beefy processor. It's also out of order, and it happens to have the same clock frequency.

Something else that can happen is that you end up with a different instruction set architecture, a different big A architecture, but almost the same microarchitecture, and this does happen. You end up with, let's say, two processors that are both three-wide issue with the same cache sizes, but one of them implements PowerPC and the other implements x86. Things like that do happen. That's more of a coincidence, but I'm trying to get across the idea that many times the microarchitectures can be the same; those are tradeoff considerations, versus the instruction set architecture, which is more of a software programming design constraint.
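As a closing sketch, here's one way to capture that distinction in code; the numeric values echo the comparison above, but the structure itself, and the field names, are just my illustration, not anything from the lecture:

```python
from dataclasses import dataclass

@dataclass
class Microarch:
    name: str
    isa: str           # the big A architecture this chip implements
    cores: int
    watts: float
    decode_width: int
    out_of_order: bool

phenom_x4 = Microarch("AMD Phenom X4", "x86",   4, 125.0, 3, True)
atom      = Microarch("Intel Atom",    "x86",   1,   2.0, 2, False)
power7    = Microarch("IBM POWER7",    "Power", 8, 200.0, 6, True)

def can_run_same_binary(a, b):
    # Compatibility is an architecture question, never an organization one.
    return a.isa == b.isa

print(can_run_same_binary(phenom_x4, atom))    # True  -- same ISA, wildly different chips
print(can_run_same_binary(phenom_x4, power7))  # False -- different ISAs
```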