1
00:00:04,017 --> 00:00:06,946
Okay.
So, now we're going to move on from the

2
00:00:06,946 --> 00:00:12,014
machine model and talk about other aspects
of instruction set architectures.

3
00:00:12,014 --> 00:00:16,323
And, to talk about what else is in
instruction set architectures, well,

4
00:00:16,323 --> 00:00:21,271
there's the fundamental machine model, how
many registers you have, and what

5
00:00:21,271 --> 00:00:24,570
type of register access you have.
Do you have stack-based?

6
00:00:24,570 --> 00:00:28,300
Do you have accumulator?
Do you have a register-register, or a

7
00:00:28,300 --> 00:00:32,532
register-memory architecture?
Also, you need to talk about the

8
00:00:32,532 --> 00:00:37,062
fundamental operations that you have, the
fundamental instructions that you have.

9
00:00:37,062 --> 00:00:43,028
So, let's look at classes of instructions.
We start off with things like data

10
00:00:43,028 --> 00:00:47,090
transfer instructions.
So, loads, stores, moves to control

11
00:00:47,090 --> 00:00:49,713
registers.
So, this is what MIPS has.

12
00:00:49,713 --> 00:00:54,777
And, in this course, we're going to be
relying a lot on MIPS, the MIPS

13
00:00:54,777 --> 00:00:59,065
instruction set architecture, for
our example cases.

14
00:00:59,065 --> 00:01:04,291
But you have load, store, move to and
move from control registers, with

15
00:01:04,291 --> 00:01:09,003
different control registers.
You have arithmetic logic unit

16
00:01:09,003 --> 00:01:13,275
instructions.
So, things like adding, subtracting,

17
00:01:13,275 --> 00:01:20,034
ANDing, ORing, multiplication, division.
This is an interesting one here, Set Less

18
00:01:20,034 --> 00:01:23,097
Than, that's kind of a fun one.
It's a comparison operator.

19
00:01:23,097 --> 00:01:29,006
So, if you want to take two values and
compare and see which one's less than the

20
00:01:29,006 --> 00:01:33,089
other, you can use Set Less Than.
Load Upper Immediate, this is, moving a

21
00:01:33,089 --> 00:01:37,090
value into the upper bits of a register,
and you have shift operations.

22
00:01:38,084 --> 00:01:42,082
You can have control flow instructions.
So, you can do branches, jumps, traps.

23
00:01:42,082 --> 00:01:46,417
And, one of the points I want to get
across here is that, between

24
00:01:46,417 --> 00:01:51,112
different instruction set architectures,
people make different choices about which

25
00:01:51,112 --> 00:01:54,356
instructions to have.
Some people have very complex ones, some

26
00:01:54,356 --> 00:01:59,087
have very simple ones, or rather, some
architectures have very complex ones, and

27
00:01:59,087 --> 00:02:02,007
some of the architectures are very simple
ones.

28
00:02:02,007 --> 00:02:06,093
You have floating point instructions,
adding floating point numbers, multiplying

29
00:02:06,093 --> 00:02:09,835
floating point numbers, subtracting
floating point numbers.

30
00:02:09,835 --> 00:02:15,030
These are actually compare operations, oh,
excuse me, this is a compare operation

31
00:02:15,030 --> 00:02:18,092
on floating point numbers.
So, compare less than for doubles, or

32
00:02:18,092 --> 00:02:22,811
double precision floating point.
Here, we have conversion operations.

33
00:02:22,811 --> 00:02:27,484
So, it's conversion from a single
precision floating point number to an

34
00:02:27,484 --> 00:02:30,396
integer number, or an integer
word.

35
00:02:30,396 --> 00:02:34,192
So, this is a convert, and these are
the MIPS instructions.

36
00:02:34,192 --> 00:02:38,615
You can have multimedia instructions, or
what's called single instruction multiple

37
00:02:38,615 --> 00:02:41,464
data.
And, we'll be talking about SIMD a bunch

38
00:02:41,464 --> 00:02:46,031
in this course, later when we get to data
parallelism and vector units.

39
00:02:46,031 --> 00:02:49,000
And, this is actually an example out of
x86.

40
00:02:49,000 --> 00:02:53,810
I wanted to give an example of the stranger operations
that sometimes show up as fundamental

41
00:02:53,810 --> 00:02:58,706
operations or fundamental instructions in
instruction set architectures. This

42
00:02:58,706 --> 00:03:06,489
is an example called REP MOVSB.
That's not two instructions, that's one

43
00:03:06,489 --> 00:03:10,500
instruction with a prefix and a space in
between it.

44
00:03:10,500 --> 00:03:13,705
Yup.
This is actually valid Intel assembly

45
00:03:13,705 --> 00:03:15,698
code.
And what is REP MOVSB?

46
00:03:15,698 --> 00:03:21,695
Well, REP MOVSB is a string operation
where it will actually copy one string

47
00:03:21,695 --> 00:03:27,038
into another string.
So, if you have some text and you want to

48
00:03:27,038 --> 00:03:32,872
copy it to another piece of text, you
can do REP MOVSB and set up a count, and

49
00:03:32,872 --> 00:03:38,060
it will actually copy.
This is the moral equivalent of something

50
00:03:38,060 --> 00:03:44,092
like strncpy.
So, we can do that all in one instruction.
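As a rough sketch of what that one instruction does, here is a small Python model.
The name rep_movsb and the bytearray standing in for memory are assumptions for
illustration, and the sketch only models the forward-copy direction.
```python
def rep_movsb(memory: bytearray, dst: int, src: int, count: int) -> None:
    """Model REP MOVSB: copy `count` bytes, one byte per repetition."""
    while count != 0:
        memory[dst] = memory[src]  # MOVSB moves a single byte
        src += 1                   # source pointer advances
        dst += 1                   # destination pointer advances
        count -= 1                 # the REP prefix decrements the count each time
# Hypothetical memory: a six-byte string followed by six bytes of free space.
mem = bytearray(b"hello\x00" + bytes(6))
rep_movsb(mem, dst=6, src=0, count=6)
print(mem[6:11].decode())  # prints "hello"
```
The loop is the part that a simpler instruction set would spell out as separate
load, store, add, and branch instructions.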

51
00:03:44,092 --> 00:03:51,001
So, in addition to these complex string
operations, things like REP

52
00:03:51,001 --> 00:03:55,804
MOVSB, we can see, there were sort of old
jokes about having more and more

53
00:03:55,804 --> 00:03:59,001
instructions, and having really complex
instructions.

54
00:03:59,001 --> 00:04:03,197
So, for instance, in the VAX architecture,
they had instructions that could do very

55
00:04:03,197 --> 00:04:06,758
complex things.
I think there was one that even did a Fast

56
00:04:06,758 --> 00:04:11,871
Fourier transform in one instruction.
That's like a whole Fast Fourier transform

57
00:04:11,871 --> 00:04:14,763
across a huge data set, in one
instruction.

58
00:04:14,763 --> 00:04:18,873
So, you can see that there's a lot of
choice between your classes of

59
00:04:18,873 --> 00:04:23,183
instructions, and the ISA architect, the
instruction set architect,

60
00:04:23,183 --> 00:04:27,125
has to sit down and think about what
should be in an instruction set versus

61
00:04:27,125 --> 00:04:33,061
being left out of an instruction set.
Another characteristic of instruction set

62
00:04:33,061 --> 00:04:40,002
architectures that the architect needs to
think about is, how do you go and access

63
00:04:40,002 --> 00:04:43,042
memory?
And, what are the different addressing

64
00:04:43,042 --> 00:04:47,080
modes that can be used?
Or, how do you get operands from

65
00:04:47,080 --> 00:04:51,065
memory?
So, looking at one example here, we have a

66
00:04:51,065 --> 00:04:56,771
register-based addressing mode.
So, in a register-based addressing mode,

67
00:04:56,771 --> 00:05:01,069
we can only name registers: we take two and
put the result in another register.

68
00:05:02,004 --> 00:05:05,258
And, this is a, a three operand format
here.

69
00:05:05,258 --> 00:05:09,412
x86 will have only two.
But, you name register three, register

70
00:05:09,412 --> 00:05:14,338
two, add them together and put them into
Register four, for instance.

71
00:05:14,338 --> 00:05:20,641
And, one of the, the interesting things
here is, this may not actually access any

72
00:05:20,641 --> 00:05:24,846
memory.
We call it an addressing mode, but it may not

73
00:05:24,846 --> 00:05:30,042
actually access memory.
If you have enough register space and your

74
00:05:30,042 --> 00:05:35,218
implementation or your micro-architecture
actually implements all the registers,

75
00:05:35,218 --> 00:05:38,656
then it won't go access memory.
But, it might access memory.

76
00:05:38,656 --> 00:05:43,541
So, for instance, there are machines out
there where you have a register, register,

77
00:05:43,541 --> 00:05:47,075
register operation, or register, register,
register instruction.

78
00:05:47,075 --> 00:05:51,296
But, the processor has no register file.
Everything is out in main memory.

79
00:05:51,296 --> 00:05:56,219
So, it has to go read the data from main
memory to go actually do the operation and

80
00:05:56,219 --> 00:06:00,226
it just sort of caches, or keeps the two
operands that are needed.

81
00:06:00,226 --> 00:06:03,390
And it's all at the micro architectural
level.

82
00:06:03,390 --> 00:06:08,416
So, this is all at the big-A architectural
level, and asking, what are the fundamental

83
00:06:08,416 --> 00:06:14,010
memory operations that can be done?
So, that's a register-based addressing

84
00:06:14,010 --> 00:06:17,013
mode. We can have immediate-based addressing
modes.

85
00:06:17,013 --> 00:06:22,862
So, here we have something like a constant
of five being added to a register, putting

86
00:06:22,862 --> 00:06:27,118
it into another register.
So, here's our assembly code for that.

87
00:06:27,118 --> 00:06:29,926
You can have displacement-based
addressing.

88
00:06:29,926 --> 00:06:35,547
So, in displacement we're going to take a
register value, add it to some constant,

89
00:06:35,547 --> 00:06:40,623
and then take that and look up in main
memory, that location, and do some

90
00:06:40,623 --> 00:06:46,500
operation, let's say, with another register.
But, this is displacement based, and it's

91
00:06:46,500 --> 00:06:52,078
called displacement because you can take a
register and have some displacement off of

92
00:06:52,078 --> 00:06:56,022
it.
You can have register indirect, and this

93
00:06:56,022 --> 00:07:03,008
is, pretty common on something like MIPS,
or actually if you go look at the Itanium

94
00:07:03,008 --> 00:07:07,084
instruction set.
They don't have displacement stuff, they

95
00:07:07,084 --> 00:07:13,025
only have register indirect.
So, this is similar to the displacement,

96
00:07:13,025 --> 00:07:19,047
but you can't have a displacement.
You can only go and read from a particular

97
00:07:19,047 --> 00:07:23,011
memory address that's stored in the
register.

98
00:07:23,011 --> 00:07:28,008
You can have absolute addressing.
This is actually not very common on most

99
00:07:28,008 --> 00:07:31,465
modern-day architectures.
But, in the older, older machines, this

100
00:07:31,465 --> 00:07:36,343
was common. So you take a
constant, not something out of a register,

101
00:07:36,343 --> 00:07:41,019
and go look up in memory, and then do some
operation with that.

102
00:07:41,019 --> 00:07:46,189
You can have memory indirect.
And this is kind of an interesting way to

103
00:07:46,189 --> 00:07:49,069
denote this here.
MIPS very much does not have this.

104
00:07:49,069 --> 00:07:54,033
But, you could do a memory operation of a
memory operation of a register.

105
00:07:54,033 --> 00:07:58,007
So, what you'd have is, in a register,
you'd have an address.

106
00:07:58,007 --> 00:08:02,034
And then, you would take that address,
you'd look up in main memory, get the

107
00:08:02,034 --> 00:08:04,036
data.
And that itself is an address.

108
00:08:04,036 --> 00:08:07,036
And then, you'd look up in main memory
again with it.

109
00:08:07,036 --> 00:08:11,097
So, it's sort of a double-indexed,
register-based addressing mode.
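To make the two lookups concrete, here is a minimal Python sketch. The dictionary
used as memory, the register name r3, and all the addresses and values are
hypothetical, chosen only for illustration.
```python
memory = {0x100: 0x200, 0x200: 42}  # hypothetical contents: address -> value
registers = {"r3": 0x100}           # r3 holds the address of a pointer
pointer = memory[registers["r3"]]   # first lookup: the data is itself an address
value = memory[pointer]             # second lookup: the actual operand
print(value)  # prints 42
```
A register indirect mode would stop after the first lookup and use that data
directly as the operand.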

110
00:08:11,097 --> 00:08:15,742
And that gets pretty fancy.
So, if you look at something like VAX,

111
00:08:15,742 --> 00:08:19,098
they definitely had this.
You can have PC relative, or program

112
00:08:19,098 --> 00:08:22,827
counter relative, or instruction pointer
relative addressing.

113
00:08:22,827 --> 00:08:28,026
So, you can take the program counter, add
some displacement, and then index memory.

114
00:08:28,026 --> 00:08:32,057
This is very useful for position
independent code, or code that you don't

115
00:08:32,057 --> 00:08:36,630
know where it's going to be loaded.
And, if you want to go access some data

116
00:08:36,630 --> 00:08:40,504
close to where the code is, you don't know
exactly where the code is loaded.

117
00:08:40,504 --> 00:08:45,540
But, with the program counter, because you know
what instruction you're executing, you can

118
00:08:45,540 --> 00:08:50,426
basically index off that and find memory
around where you are, around where you're

119
00:08:50,426 --> 00:08:54,038
loaded in main memory.
So, this is for PIC code.

120
00:08:54,038 --> 00:09:00,116
You can also have scaled.
This is something that x86 has where you

121
00:09:00,116 --> 00:09:07,217
can actually take a register, and add it
to another register multiplied by

122
00:09:07,217 --> 00:09:11,514
something else.
So, in x86, this is called SIB, scale,

123
00:09:11,514 --> 00:09:16,651
index and base mode.
So, you can actually take a displacement,

124
00:09:16,651 --> 00:09:20,446
add it to some registers, and scale
one of them.

125
00:09:20,446 --> 00:09:27,172
And, this is very useful if you're trying
to index through an array of some size.

126
00:09:27,172 --> 00:09:32,742
So, if you have an array of four byte
words, you can just keep ticking up

127
00:09:32,742 --> 00:09:36,768
this counter here.
So, you start off zero, one, two, three

128
00:09:36,768 --> 00:09:42,278
and as this ticks up here, instead of
going up by a byte, you go up by four

129
00:09:42,278 --> 00:09:45,067
bytes at a time.
And if the data you're trying to

130
00:09:45,067 --> 00:09:50,044
load is four bytes long, you'll actually
be able to just pick up the exact elements

131
00:09:50,044 --> 00:09:54,071
in the array you want versus having to do
this multiplication someplace else.

132
00:09:54,071 --> 00:09:58,564
Usually, these scaled operations, or
scaled memory addressing modes have very

133
00:09:58,564 --> 00:10:03,002
limited sort of, multiplication here.
You can't multiply by, let's say, seven.

134
00:10:03,019 --> 00:10:07,062
Usually, it's multiplication by
powers of two, or a small set of powers

135
00:10:07,062 --> 00:10:11,084
of two because that's easy.
That's just a shift operation in base two.
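Here is a minimal sketch of that scaled address calculation in Python. The function
name effective_address, its parameters, and the array base address are assumptions
for illustration, and the scale is limited to the factors 1, 2, 4, and 8.
```python
def effective_address(base: int, index: int, scale_log2: int, disp: int = 0) -> int:
    """Scaled mode: base + index * 2**scale_log2 + disp, with the multiply done as a shift."""
    assert 0 <= scale_log2 <= 3  # only scale factors 1, 2, 4, 8 in this sketch
    return base + (index << scale_log2) + disp
array_base = 0x1000  # hypothetical start address of an array of four-byte words
addresses = [effective_address(array_base, i, scale_log2=2) for i in range(4)]
print([hex(a) for a in addresses])  # ['0x1000', '0x1004', '0x1008', '0x100c']
```
The index counts 0, 1, 2, 3 while the address moves four bytes at a time, so the
multiplication does not need a separate instruction.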

136
00:10:13,688 --> 00:10:18,044
And then, you can think about data types
and their sizes.

137
00:10:18,044 --> 00:10:24,017
So, what do I mean by data types?
Well, you could have binary integer.

138
00:10:24,017 --> 00:10:29,021
You can think about having different types
of integer data.

139
00:10:29,021 --> 00:10:34,042
You can think about having, unary encoded,
binary encoded.

140
00:10:34,042 --> 00:10:41,015
You could think about having, things that
sort of, roll over in different ways.

141
00:10:41,015 --> 00:10:46,094
So, for instance, as you probably learned
about in your computer organization class,

142
00:10:46,094 --> 00:10:50,803
there's ones' complement versus two's
complement arithmetic, and those are

143
00:10:50,803 --> 00:10:55,049
different data types, there.
So, you have binary integer data, and

144
00:10:55,049 --> 00:11:00,055
saying whether it's ones' complement versus
two's complement is pretty

145
00:11:00,055 --> 00:11:02,629
important.
You can have binary coded decimal.

146
00:11:02,629 --> 00:11:10,742
So, this is where each decimal digit, if
you will, is encoded with four bits.

147
00:11:10,742 --> 00:11:16,051
And the decimal point is
encoded as well.

148
00:11:16,051 --> 00:11:25,027
The point, the period, if you will,
is also encoded in there, between your

149
00:11:25,054 --> 00:11:31,172
fraction and the integer portion, or
the natural number portion.

150
00:11:31,729 --> 00:11:37,183
So, binary coded decimal can give you
very exact calculations for

151
00:11:37,183 --> 00:11:41,031
things like spreadsheets and business
calculations.
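As a small sketch of the four-bits-per-digit idea, here is a packed BCD encoder in
Python. The function name to_packed_bcd is an assumption for illustration, and the
sketch handles only non-negative integers, with no decimal point.
```python
def to_packed_bcd(number: int) -> bytes:
    """Pack a non-negative integer as BCD: one decimal digit per four bits."""
    assert number >= 0
    digits = str(number)
    if len(digits) % 2:
        digits = "0" + digits  # pad to a whole number of bytes
    return bytes((int(digits[i]) << 4) | int(digits[i + 1]) for i in range(0, len(digits), 2))
packed = to_packed_bcd(1995)
print(packed.hex())  # prints "1995": each hex nibble is one decimal digit
```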

152
00:11:41,031 --> 00:11:46,037
You can have floating point types.
And there's actually a lot of different

153
00:11:46,037 --> 00:11:52,009
floating point types here.
There's a standard now that's

154
00:11:52,009 --> 00:11:55,998
called IEEE 754, which is what's used in
most modern computers.

155
00:11:55,998 --> 00:12:01,094
And, this is different from the Cray
floating points on Cray supercomputers.

156
00:12:01,094 --> 00:12:08,035
They had a much wider floating point, and
they also had a different number of bits

157
00:12:08,035 --> 00:12:13,435
given to the mantissa versus the exponent.
And by doing this, their precision can be

158
00:12:13,435 --> 00:12:17,564
different in different ways.
So, for instance, you can have a bigger

159
00:12:17,564 --> 00:12:21,933
range of numbers but the precision's
smaller, or a smaller range of numbers

160
00:12:21,933 --> 00:12:26,081
with bigger precision.
And, there's different trade-offs there.
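To see that trade-off in numbers, here is a rough Python sketch. The function name
range_and_step is an assumption, and the second layout, with 11 exponent bits and
20 fraction bits, is a made-up 32-bit split used only for comparison.
```python
def range_and_step(exp_bits: int, frac_bits: int) -> tuple[float, float]:
    """Return (largest finite value, spacing just above 1.0) for an IEEE-style layout."""
    bias = 2 ** (exp_bits - 1) - 1
    largest = (2.0 - 2.0 ** -frac_bits) * 2.0 ** bias
    step = 2.0 ** -frac_bits
    return largest, step
single_like = range_and_step(exp_bits=8, frac_bits=23)  # the IEEE 754 single layout
wide_range = range_and_step(exp_bits=11, frac_bits=20)  # hypothetical 32-bit split
print(single_like)
print(wide_range)
```
Both layouts use 32 bits including the sign: moving three bits from the fraction to
the exponent makes the largest value far bigger and the spacing between neighboring
values coarser.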

161
00:12:26,081 --> 00:12:32,277
Also, Intel internally, at least in x87,
had this thing they called Intel Extended

162
00:12:32,277 --> 00:12:36,925
Precision which is 80 bits long.
With IEEE 754, the biggest thing defined in

163
00:12:36,925 --> 00:12:40,772
that is a 64-bit double.
But, if you want even more precision to

164
00:12:40,772 --> 00:12:43,870
your floating point numbers, you might
need 80 bits.

165
00:12:43,870 --> 00:12:48,553
You could have packed vector data.
This is like MMX data where you're trying

166
00:12:48,553 --> 00:12:52,446
to pack the data all together and operate
on it at the same time.

167
00:12:52,446 --> 00:12:57,277
So, typically, things like MMX, you need
to bring the data into a packed data type,

168
00:12:57,277 --> 00:13:02,241
and then operate on the whole data type,
which has different values in it.

169
00:13:02,241 --> 00:13:08,974
And, some architectures even have a
special data type called addresses which

170
00:13:08,974 --> 00:13:14,063
is different than a binary integer.
So, some older computers actually had

171
00:13:14,063 --> 00:13:17,090
address registers.
And the address data type was different

172
00:13:17,090 --> 00:13:19,759
than the data type, or the binary
integer type.

173
00:13:19,759 --> 00:13:24,232
And, that was different than the floating
point data type, and there were different

174
00:13:24,232 --> 00:13:26,386
registers and different register names for
that.

175
00:13:26,386 --> 00:13:30,065
And, what was nice about that is, they
knew that if you loaded something into the

176
00:13:30,065 --> 00:13:32,540
address registers, it was definitely an
address.

177
00:13:32,540 --> 00:13:36,012
So, it had type information, and that's
separate from the width.

178
00:13:36,012 --> 00:13:40,805
So, let's say, you have binary integer.
Well, people have built machines which

179
00:13:40,805 --> 00:13:43,348
have eight bits, sixteen bits, 32 bits, 64
bits.

180
00:13:43,348 --> 00:13:47,146
All these different sizes as
the default word size.

181
00:13:47,146 --> 00:13:51,669
And then, finally, one of the important
things you need to do is come up with the

182
00:13:51,669 --> 00:13:56,671
encoding of the different instructions.
And, there's been a lot of debate on this

183
00:13:56,671 --> 00:14:01,145
of whether you should have fixed-width versus
variable-width instructions.

184
00:14:01,145 --> 00:14:06,241
So, let's look at a couple of different
ISAs and see what camp

185
00:14:06,241 --> 00:14:10,466
they fall into.
So, most RISC architectures are fixed

186
00:14:10,466 --> 00:14:13,520
width.
So, you have MIPS, PowerPC, SPARC, ARM,

187
00:14:13,520 --> 00:14:18,571
falling into this category.
And, as an example, in MIPS, which we're going

188
00:14:18,571 --> 00:14:25,004
to be talking a lot about in this course,
every instruction is exactly four

189
00:14:25,007 --> 00:14:29,103
bytes long.
And, what's nice about this is it's easy

190
00:14:29,103 --> 00:14:36,363
to decode, but it may not be very compact.
On the other side of this

191
00:14:36,363 --> 00:14:42,487
question about ISA encoding, you can see
variable length instructions where the

192
00:14:42,487 --> 00:14:47,790
width of the instruction can vary widely.
So, what's nice about this is you can have

193
00:14:47,790 --> 00:14:53,648
the things that are very
common take up a very small amount of

194
00:14:53,648 --> 00:14:56,136
space.
So, if you have an instruction which is,

195
00:14:56,136 --> 00:15:00,848
like, one byte long and it's used all the
time, you could effectively do a manual

196
00:15:00,848 --> 00:15:05,994
Huffman encoding on your instruction set.
So, you take the most common things, and

197
00:15:05,994 --> 00:15:08,704
you put them in the smallest amount of
data.

198
00:15:08,704 --> 00:15:13,262
But, if you have something that's very
uncommon, you can have it take a

199
00:15:13,262 --> 00:15:15,978
lot of bytes.
So, example here, x86, you can have

200
00:15:15,978 --> 00:15:19,025
between one and seventeen bytes for an
instruction.

201
00:15:19,025 --> 00:15:21,086
I think this has actually been updated
now.

202
00:15:21,086 --> 00:15:25,070
If you look at x86-64, it can be between
one and eighteen bytes.

203
00:15:25,070 --> 00:15:28,847
So, and a couple ideas here.
It can be anything in between.

204
00:15:28,847 --> 00:15:31,789
One, two, three, four, all the way up to
eighteen.

205
00:15:31,789 --> 00:15:38,238
And, for some CISC architectures: the
IBM 360 is a good example of a

206
00:15:38,238 --> 00:15:45,394
complex instruction set architecture, as are
x86, Motorola 68k, VAX; these were all

207
00:15:45,394 --> 00:15:50,012
variable length instruction encoding
architectures.

208
00:15:50,012 --> 00:15:54,109
And now, we start to get into something
which is a little fuzzier.

209
00:15:54,109 --> 00:15:57,584
There's things that sorta start to cross
over.

210
00:15:57,584 --> 00:16:02,984
People started to look at, started to
build mostly fixed or compressed

211
00:16:02,984 --> 00:16:08,259
instruction set architectures.
So, an example of this is something like

212
00:16:08,259 --> 00:16:14,322
MIPS16, which is effectively a MIPS
instruction set where there is both 32

213
00:16:14,322 --> 00:16:18,825
bit or four byte instructions, and
sixteen bit or two byte instructions.

214
00:16:19,328 --> 00:16:26,767
And Thumb, which is the compressed or
the mostly fixed instruction set

215
00:16:26,767 --> 00:16:30,449
architecture of ARM.
Yup, gotta love the naming there.

216
00:16:30,449 --> 00:16:35,380
It also did a similar sort of thing, where
they had two bytes and four bytes as the

217
00:16:35,380 --> 00:16:38,085
different instructions.
This is a little bit different than

218
00:16:39,001 --> 00:16:41,023
compressed.
So, this is like a, a mostly fixed

219
00:16:41,023 --> 00:16:43,653
architecture with sort of two different
instruction sizes.

220
00:16:44,057 --> 00:16:48,923
If you look at something like PowerPC
and some VLIWs, they actually have a

221
00:16:48,923 --> 00:16:53,320
compressed format where
they will actually store the instructions

222
00:16:53,320 --> 00:16:56,584
compressed and decompress them when they
end up in main memory.

223
00:16:56,584 --> 00:17:00,141
Or, ends up in the caches, at least.
So, you can think of some architectures

224
00:17:00,141 --> 00:17:02,363
where the, the code in main memory is
small.

225
00:17:02,363 --> 00:17:06,551
But then when it gets to the cache,
maybe it gets expanded, or it gets expanded

226
00:17:06,551 --> 00:17:11,584
when it comes out to the main processor.
And then, there's long instruction words

227
00:17:11,584 --> 00:17:16,435
where you actually can explicitly name
multiple instructions happening at the

228
00:17:16,435 --> 00:17:19,182
same time.
Or, even very long instruction words, or

229
00:17:19,182 --> 00:17:23,333
what's called VLIWs, which we'll be
studying a bunch in this course.

230
00:17:23,333 --> 00:17:28,719
Where you can put multiple fixed-width
instructions in a, or multiple

231
00:17:28,719 --> 00:17:35,309
instructions in a fixed-width bundle.
So, some good examples here are Multiflow,

232
00:17:35,309 --> 00:17:41,866
and the LX architecture,
the LX [inaudible] architecture

233
00:17:41,866 --> 00:17:47,429
from HP and STMicro, which shows up in
printers today, mostly.

234
00:17:47,429 --> 00:17:55,682
TI DSPs are actually VLIW architectures,
and a couple of other good examples.

235
00:17:55,682 --> 00:18:02,555
So, just to show something complex here,
how you can end up with one to eighteen

236
00:18:02,555 --> 00:18:05,973
bytes, here, we have x86's instruction
set.

237
00:18:05,973 --> 00:18:10,737
And, fundamentally you need an opcode, a
byte worth of opcode.

238
00:18:10,737 --> 00:18:16,067
But some instructions might
have between one and three bytes here.

239
00:18:16,067 --> 00:18:21,752
And then, there's different addressing
modes, special information about different

240
00:18:21,752 --> 00:18:26,630
addressing modes, displacements and
immediates for the different addressing

241
00:18:26,630 --> 00:18:28,861
modes.
And those all take up more space.

242
00:18:28,861 --> 00:18:33,559
And they can also have prefixes, so the
REP in REP MOVSB is actually a

243
00:18:33,559 --> 00:18:37,266
prefix, which says, repeat this operation
multiple times.

244
00:18:37,266 --> 00:18:42,562
You can encode all these things in a
variable-width instruction format like x86.

245
00:18:42,562 --> 00:18:47,395
And, to give you an example, in something
like MIPS, every instruction is

246
00:18:47,395 --> 00:18:51,074
exactly four bytes long and they have to
fit everything into it.
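As a sketch of fitting everything into four bytes, here is the MIPS register-format
encoding in Python. The function name encode_r_type is an assumption for
illustration; the field widths are the usual 6, 5, 5, 5, 5, and 6 bits, and the
example assumes the register three, register two, register four add from earlier.
```python
def encode_r_type(rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the R-type fields into one 32-bit word; the opcode field is zero."""
    for field in (rs, rt, rd, shamt):
        assert 0 <= field < 32  # five-bit fields
    assert 0 <= funct < 64      # six-bit function code
    return (0 << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct
ADD_FUNCT = 0x20  # function code for the MIPS add instruction
word = encode_r_type(rs=3, rt=2, rd=4, shamt=0, funct=ADD_FUNCT)  # add $4, $3, $2
print(f"{word:08x}")                 # prints 00622020
print(len(word.to_bytes(4, "big")))  # prints 4: always exactly four bytes
```
Every field has a fixed position, so the hardware can pull out the register numbers
without first working out how long the instruction is.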

247
00:18:51,074 --> 00:18:56,040
So, an ISA architect, or instruction set
architect, has to decide the

248
00:18:56,040 --> 00:19:01,042
layout of the bits within the instruction
set and that's usually something that is

249
00:19:01,042 --> 00:19:04,002
defined in the instruction set
architecture.

250
00:19:04,002 --> 00:19:08,080
So, to sum up some real world instruction
sets and where they fall with different

251
00:19:08,080 --> 00:19:12,256
numbers of operands, numbers of
memory operations, data sizes and

252
00:19:12,256 --> 00:19:16,943
registers, let's walk through a couple of
different instruction set architectures.

253
00:19:16,943 --> 00:19:21,613
And, you've probably heard of these
in passing, but you may not

254
00:19:21,613 --> 00:19:26,467
have actually used any of these machines.
But, that's because some of them are

255
00:19:26,467 --> 00:19:30,897
embedded, or some of them aren't
commonly used anymore.

256
00:19:30,897 --> 00:19:35,034
But, they're good to know about.
So, let's start off with Alpha.

257
00:19:35,034 --> 00:19:40,040
Alpha was built by Digital Equipment
Corporation, and it's a register-register

258
00:19:40,040 --> 00:19:46,005
architecture with three named operands.
There's no explicit memory operands in the

259
00:19:46,005 --> 00:19:49,424
instruction set, it's got 64 bits as the
default data type.

260
00:19:49,424 --> 00:19:54,889
And when, actually, Alpha originally came
out, you could only do 64-bit operations

261
00:19:54,889 --> 00:19:58,547
with it.
That would, sort of, later change as they

262
00:19:58,547 --> 00:20:01,874
figured out that might not have been the
best idea.

263
00:20:01,874 --> 00:20:05,967
64-bit addressing, it was mostly designed
for workstations.

264
00:20:05,967 --> 00:20:11,133
So, big addresses, fast computers. Then
we can see something like ARM.

265
00:20:11,133 --> 00:20:15,577
ARM is used in my cell phone.
It's an architecture that there's a lot of

266
00:20:15,577 --> 00:20:20,818
different implementations of, and they've
licensed it to lots of different people,

267
00:20:20,818 --> 00:20:24,897
but it's also register, register,
register, three operands.

268
00:20:24,897 --> 00:20:30,710
There's a 32-bit, and now there's a 64-bit
data size that has just come out.

269
00:20:30,710 --> 00:20:36,680
It's going to be sixteen registers,
and for the addressing, as I said, a 64-bit

270
00:20:36,680 --> 00:20:43,238
version came out but it's mostly 32.
And, it shows up in cell phones and embedded

271
00:20:43,238 --> 00:20:48,081
applications.
MIPS, which is an outgrowth of the

272
00:20:48,081 --> 00:20:53,055
Stanford MIPS project and later was
commercialized.

273
00:20:53,080 --> 00:20:59,530
Register, register, register, we're going
to be focusing on this mostly in this

274
00:20:59,530 --> 00:21:03,218
class.
Sort of similar: workstation and embedded.

275
00:21:03,218 --> 00:21:08,960
SPARC is another instruction set.
This is what Sun originally used, or used

276
00:21:08,960 --> 00:21:13,875
to use.
It was an outgrowth of the RISC I and

277
00:21:13,875 --> 00:21:19,571
RISC II sort of architectures.
It has, well, this one's

278
00:21:19,571 --> 00:21:23,010
interesting.
Between 24 and 32 registers depending on

279
00:21:23,010 --> 00:21:27,021
how you look at it.
They have this interesting idea where as

280
00:21:27,021 --> 00:21:32,036
you load more data in, sort of, or as you
do function calls, data gets spilled out

281
00:21:32,036 --> 00:21:37,252
into main memory and gets pulled back in
from main memory, kinda like a stack.

282
00:21:37,252 --> 00:21:41,936
So, it's sort of, a mixture between a
stack and a, a register architecture.

283
00:21:41,936 --> 00:21:45,859
Most of these were workstations.
You can see the TI C6000, more for DSPs.

284
00:21:45,859 --> 00:21:49,634
But then, we're going to start to see some
more interesting stuff down here.

285
00:21:49,634 --> 00:21:53,115
Let's take a look at VAX.
So, VAX is a memory-memory architecture

286
00:21:53,115 --> 00:21:57,459
where it has three named operands, or
could have up to three named operands, and

287
00:21:57,459 --> 00:22:00,018
all three of those can come from main
memory.

288
00:22:01,527 --> 00:22:06,235
It has a relatively small number of
registers, and we can see something like

289
00:22:06,235 --> 00:22:11,078
the Motorola 6800.
This is not to be confused with the 68,000

290
00:22:11,078 --> 00:22:16,064
or the 68K, this is a 6800.
That is an accumulator-based machine, an

291
00:22:16,064 --> 00:22:22,696
accumulator-based architecture where you
can have one named operand that

292
00:22:22,696 --> 00:22:27,394
comes from memory.
It's an 8-bit data path, and this is

293
00:22:27,394 --> 00:22:35,007
mostly used as a microcontroller.
So, why all the diversity in these

294
00:22:35,007 --> 00:22:40,007
instruction set architectures?
Well, instruction set architecture is

295
00:22:40,007 --> 00:22:45,051
actually influenced by technology, or
influenced by transistor technology.

296
00:22:45,051 --> 00:22:49,504
So, we see that if storage is limited, we
might want tight encoding.

297
00:22:51,041 --> 00:22:57,050
And, the flip side is, if you have a very
small number of transistors, you may want

298
00:22:57,050 --> 00:23:02,410
to fit the entire processor on one chip, and this
was the, actually, the fundamental idea

299
00:23:02,410 --> 00:23:06,025
behind RISC.
If you have lots and lots of transistors,

300
00:23:06,025 --> 00:23:10,003
you might not have to worry about
having to shove everything onto a very

301
00:23:10,003 --> 00:23:13,014
small amount of area.
You could think about adding multicore and

302
00:23:13,014 --> 00:23:16,822
many cores, or putting multiple processors
on there and build an instruction set

303
00:23:16,822 --> 00:23:20,027
architecture specifically designed for
multicores and many cores.

304
00:23:20,639 --> 00:23:26,081
And then, also, instruction sets are many
times influenced by their applications.

305
00:23:26,081 --> 00:23:32,089
So, a good example of this is, if you're
building a signal processing architecture

306
00:23:32,089 --> 00:23:37,096
or a digital signal processor, a
DSP, you might want to add DSP

307
00:23:37,096 --> 00:23:40,745
instructions.
And then, finally, I want to talk about

308
00:23:40,745 --> 00:23:45,464
how technology from software has
influenced instruction set architecture

309
00:23:45,464 --> 00:23:48,699
over time.
So, if we look at something like the SPARC

310
00:23:48,699 --> 00:23:52,229
architecture, it has what's called the
register window.

311
00:23:52,229 --> 00:23:57,613
So, in the register window, what happens
is whenever you do a function call, it'll

312
00:23:57,613 --> 00:24:02,413
actually take eight registers and put them
into memory, and then you get eight new

313
00:24:02,413 --> 00:24:06,297
registers.
When you do a return, it takes eight

314
00:24:06,298 --> 00:24:11,026
registers from main memory and puts them
back into your register file, and sort of

315
00:24:11,026 --> 00:24:15,504
swaps out the ones that were there before.
And, the reason for this was, at the time that

316
00:24:15,504 --> 00:24:19,714
SPARC was made, compilers didn't know how
to do register allocation.

317
00:24:19,714 --> 00:24:24,363
That was, like, an open problem.
Since that time, register allocation,

318
00:24:24,363 --> 00:24:29,856
figuring out how to take a fixed number of
registers and move data in from a stack in

319
00:24:29,856 --> 00:24:34,388
main memory and vice versa can be
orchestrated very effectively and very

320
00:24:34,388 --> 00:24:38,447
efficiently by the compiler.
But, at the time, compilers were very

321
00:24:38,447 --> 00:24:41,557
simple.
So, people didn't know how to do that, so

322
00:24:41,557 --> 00:24:46,466
they needed hardware help to do that.
So, the instruction set architecture has

323
00:24:46,466 --> 00:24:50,137
that baked into it.
But, now that we have effective register

324
00:24:50,137 --> 00:24:54,441
allocation, we've not seen any other
register windowed architectures come along

325
00:24:54,441 --> 00:24:57,476
after that.
If you talk to anyone who's actually

326
00:24:57,476 --> 00:25:02,290
gone and implemented a SPARC instruction
set microarchitecture, they

327
00:25:02,290 --> 00:25:06,545
basically hate register windows.
It's like the bane of this architecture.

328
00:25:06,545 --> 00:25:10,015
But, at the time compiler technology was
not good enough.

329
00:25:10,015 --> 00:25:14,842
So, applications influence it, compiler
technology influences your instruction set

330
00:25:14,842 --> 00:25:18,009
architecture.
Technology influences your ISA, and ISAs

331
00:25:18,009 --> 00:25:21,052
have evolved over time.
Even though, as we said originally, you

332
00:25:21,052 --> 00:25:25,080
know, a lot of times people want to build
ISAs that don't change so you can keep

333
00:25:25,080 --> 00:25:28,051
running software and have binary
compatibility.

334
00:25:28,051 --> 00:25:31,602
But, you know, sometimes, at some
point, it might make sense to actually

335
00:25:31,602 --> 00:25:55,009
break the compatibility and re-optimize
your instruction set architecture.