1 00:00:04,071 --> 00:00:14,022 Let's start off our third topic so far in our review of computer architecture. 2 00:00:14,022 --> 00:00:18,056 So this is still review at this point. As I said at the beginning of this 3 00:00:18,056 --> 00:00:22,085 class, if you know everything up to this point in the class, that's a requisite or 4 00:00:22,085 --> 00:00:26,046 a prerequisite for this course. We're just going to blow through 5 00:00:26,046 --> 00:00:30,065 everything you should know in order to start the much more detailed content 6 00:00:30,065 --> 00:00:34,648 later in this course. We'll talk about real processors you're gonna 7 00:00:34,648 --> 00:00:37,096 be building. So things like out-of-order, multi-issue, 8 00:00:37,096 --> 00:00:42,041 multi-core microprocessors versus simple little processors you should have built in 9 00:00:42,041 --> 00:00:46,038 previous classes. So, this is ELE 475 at Princeton 10 00:00:46,038 --> 00:00:50,637 University. We're talking about computer architecture, 11 00:00:50,637 --> 00:00:55,118 and today we're going to be reviewing caches. 12 00:00:55,118 --> 00:01:02,763 And this is the last topic, cache review, as I said, before we move on to new material, 13 00:01:02,763 --> 00:01:08,241 and the new material we're gonna be moving onto very soon is superscalars, or 14 00:01:08,241 --> 00:01:13,118 processors which can execute multiple instructions per cycle. 15 00:01:13,118 --> 00:01:19,016 So let's take a look at the agenda for the cache review lecture. 16 00:01:19,016 --> 00:01:26,431 We're going to start off by talking about memory technology and motivate the 17 00:01:26,431 --> 00:01:30,875 different... Or, see, we start off talking about memory 18 00:01:30,875 --> 00:01:34,076 technology and use that as a motivation for caches. 
19 00:01:34,076 --> 00:01:39,090 So we're gonna look at different types of technology, DRAM versus SRAM, what the 20 00:01:39,090 --> 00:01:45,055 transistor technologies are behind them, or why we have these different 21 00:01:45,055 --> 00:01:48,060 memory technologies. Why don't we just have one? 22 00:01:48,060 --> 00:01:51,092 What are the different ways to store information? 23 00:01:51,092 --> 00:01:55,056 Why don't we just use register files for everything? 24 00:01:55,056 --> 00:02:01,015 Or simple flip-flops for everything? Then we're gonna motivate cache design, or 25 00:02:01,015 --> 00:02:04,059 why we have caches. So let's define what a cache is: a cache 26 00:02:04,059 --> 00:02:09,056 is a little piece of memory that you stick next to something which does not have all 27 00:02:09,056 --> 00:02:12,031 of the information that you're trying to store. 28 00:02:12,031 --> 00:02:16,504 Instead it has a subset of information, and hopefully it has a useful subset of that 29 00:02:16,504 --> 00:02:20,051 information. We're then going to talk about 30 00:02:20,051 --> 00:02:26,059 classification of these different caches. So we'll put names and labels on sort of 31 00:02:26,059 --> 00:02:32,039 different cache architectures. So we're gonna talk about associativity of 32 00:02:32,039 --> 00:02:38,018 caches, we're gonna talk about size of caches, we're gonna talk about whether the 33 00:02:38,018 --> 00:02:41,071 cache has multiple things that could fit in one set or not. 34 00:02:41,071 --> 00:02:46,033 And we'll talk a bunch more about that. And then we'll have a very, very brief 35 00:02:46,033 --> 00:02:50,365 introduction to cache performance, or some of the things we should know about why 36 00:02:50,365 --> 00:02:54,041 a cache is going to give you good performance right now. 37 00:02:54,041 --> 00:02:59,020 Later in this course, we're gonna actually have two more lectures about advanced 38 00:02:59,020 --> 00:03:01,095 caches. 
Advanced caches, we're gonna talk about 39 00:03:02,013 --> 00:03:07,034 more sophisticated topics, with how do you build and implement actual caches for very 40 00:03:07,034 --> 00:03:14,066 high performance processors. So let's start talking about memory 41 00:03:14,066 --> 00:03:15,081 technology. Okay. 42 00:03:15,081 --> 00:03:21,009 So let's look at a memory, or something that can store multiple 43 00:03:21,009 --> 00:03:26,037 values and give you back some data. So here we have a very naive, flip-flop-based 44 00:03:26,037 --> 00:03:30,085 register file. You'd probably never actually build this 45 00:03:30,085 --> 00:03:35,062 in a modern microprocessor. But let's look at that sort of 46 00:03:35,062 --> 00:03:39,050 conceptual idea here of what a memory really is. 47 00:03:39,050 --> 00:03:42,066 Here we have four entries that are flip-flops. 48 00:03:42,066 --> 00:03:46,562 Maybe it's multi-bit wide. So it's maybe, I don't know, let's say 49 00:03:46,562 --> 00:03:51,750 each of these registers is 32 bits wide and each of these buses is 32 bits 50 00:03:51,750 --> 00:03:54,714 wide. And we're gonna sort of select, based on a 51 00:03:54,714 --> 00:03:58,381 read address, some value. So we can choose, you know, this is two 52 00:03:58,383 --> 00:04:01,084 bits here, we can choose one of four places. 53 00:04:01,084 --> 00:04:06,462 We have write data that comes in and we just sort of broadcast that to all of the 54 00:04:06,462 --> 00:04:09,333 registers. And we write depending on the write 55 00:04:09,333 --> 00:04:13,583 address here and the clock. And, you know, if the decoder tells us to 56 00:04:13,583 --> 00:04:18,713 light this up and the clock clocks the actual element, it will load in the value. 57 00:04:18,713 --> 00:04:23,660 You may even have a write enable on something like this, where that would also 58 00:04:23,660 --> 00:04:28,588 feed, let's say, into the AND gates here. 
So it'll say, if a write's not occurring, 59 00:04:28,588 --> 00:04:31,755 don't actually load anything into these registers. 60 00:04:31,755 --> 00:04:37,408 Okay, so that's really, really naive. Let's take something a little bit more 61 00:04:37,408 --> 00:04:41,637 complex and see how these things sort of start to actually look. 62 00:04:41,637 --> 00:04:46,964 Okay, so now we're gonna transition from talking about this naive register file and 63 00:04:46,964 --> 00:04:51,567 see what people actually build. And the first thing you should realize 64 00:04:51,567 --> 00:04:56,680 is, if you go to build something like this naive register file, you have lots and lots 65 00:04:56,680 --> 00:04:59,252 of bits as you add more and more bits here. 66 00:04:59,252 --> 00:05:03,518 Here we only have four, but if we go to a thousand-bit register file, this 67 00:05:03,518 --> 00:05:10,538 multiplexer grows really big, and if you stack these all up a thousand long, your 68 00:05:10,538 --> 00:05:18,302 aspect ratio, your x versus y, or the size when you go to lay it down on a piece of 69 00:05:18,302 --> 00:05:24,896 silicon, is very long and very narrow. So you will end up, let's say, with 1000 70 00:05:24,896 --> 00:05:28,699 bits, one bit tall. And that does not lay out very well, and 71 00:05:28,699 --> 00:05:33,322 it's not necessarily good for speed or from an area perspective. 72 00:05:33,322 --> 00:05:38,497 So why is it not good for speed? Well, it's not good for speed because 73 00:05:38,497 --> 00:05:41,855 wires have cost and take time to propagate. 74 00:05:41,855 --> 00:05:48,158 So you have to propagate from 1000 bits away, or 1000 bit cells away, all the way 75 00:05:48,158 --> 00:05:53,262 over to the multiplexer. Your cycle time is not gonna be very fast, 76 00:05:53,262 --> 00:05:59,395 versus if you would somehow make it more square or more rectangular, the distances 77 00:05:59,395 --> 00:06:04,991 actually are minimized. 
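As a rough illustration of the aspect-ratio argument above, the sketch below compares the worst-case distance a signal has to travel in a long 1 x N strip against a roughly square array of the same capacity. The unit ("bit-cell pitches"), the Manhattan-distance model with the output in one corner, and the function names are all assumptions made for illustration; real wire delay also depends on resistance, capacitance, and repeaters.

```python
import math


def worst_case_distance(rows: int, cols: int) -> int:
    """Worst-case Manhattan distance, in bit-cell pitches, from the farthest
    cell to an output placed in one corner (a simplifying assumption)."""
    return (rows - 1) + (cols - 1)


def squarest_layout(n_cells: int) -> tuple[int, int]:
    """Pick rows x cols with rows * cols >= n_cells and rows close to cols."""
    rows = math.isqrt(n_cells)
    if rows * rows < n_cells:
        rows += 1
    cols = math.ceil(n_cells / rows)
    return rows, cols


n = 1024  # hypothetical number of bit cells
strip = worst_case_distance(1, n)
rows, cols = squarest_layout(n)
square = worst_case_distance(rows, cols)

print(f"1 x {n} strip: {strip} pitches")
print(f"{rows} x {cols} array: {square} pitches")
```

Under this simple model the strip's worst-case distance grows linearly with the number of cells, while the square array's grows with its square root, which is the intuition behind laying memories out as arrays.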
So this is how we end up with arrays, or 78 00:06:04,991 --> 00:06:09,524 memory arrays. And we're going to look at a memory array 79 00:06:09,524 --> 00:06:14,497 for a register file first. Let's look at a register file array, and 80 00:06:14,497 --> 00:06:20,938 how you could possibly actually go build this, because it's a little bit different. 81 00:06:20,938 --> 00:06:25,789 We're not going to use fully complementary logic here for everything. 82 00:06:25,789 --> 00:06:31,009 Instead we are going to start to transition to something where we will have 83 00:06:31,009 --> 00:06:35,694 more analog circuits connecting our bits to their respective buses. 84 00:06:35,694 --> 00:06:40,126 And we're also going to change how we do the decode. 85 00:06:40,126 --> 00:06:46,453 We're not gonna have a full multiplexer here, or a full decoder here. 86 00:06:46,453 --> 00:06:50,671 We're going to split that into different portions. 87 00:06:50,671 --> 00:06:57,018 So here we have an example array. It's a square, and each of these little 88 00:06:57,018 --> 00:07:01,035 boxes is a single storage element, a bit cell. 89 00:07:01,035 --> 00:07:08,603 And the address comes in here and that goes up into a row decoder, which will 90 00:07:08,603 --> 00:07:15,547 turn on one of these wires at a time, and these are called word lines. 91 00:07:15,547 --> 00:07:20,292 And we'll have both read word lines and write word lines. 92 00:07:20,292 --> 00:07:29,099 This is still just for a register file. But we're actually gonna split out some of 93 00:07:29,099 --> 00:07:35,095 the bits here of our address and send them over to here, which is the column decoder. 94 00:07:35,095 --> 00:07:41,629 And this column decoder is just a multiplexer, very 95 00:07:41,629 --> 00:07:46,764 similar to this multiplexer. But it's gonna only have a subset of the 96 00:07:46,764 --> 00:07:50,778 bits. 
Instead of having all of the bits 97 00:07:50,778 --> 00:07:56,881 together here going through this multiplexer and having an N-bit address 98 00:07:56,881 --> 00:08:00,412 coming into it, it instead has a subset of that. 99 00:08:00,412 --> 00:08:07,151 So, in this diagram here, we have a four-bit-wide readout, or write, and we're 100 00:08:07,151 --> 00:08:12,447 going to have one, two, three, four, five, six, seven, eight bits across 101 00:08:12,447 --> 00:08:15,538 here. So we're gonna need a two-to-one column 102 00:08:15,538 --> 00:08:17,984 decode. So this is a one-bit decode here. 103 00:08:17,984 --> 00:08:23,094 But this is just a toy example. As we go to build sort of 1000-bit arrays 104 00:08:23,094 --> 00:08:28,563 or megabyte arrays or gigabyte arrays or gigabit arrays, we're gonna have lots of 105 00:08:28,563 --> 00:08:33,464 bits coming here and lots of bits there. But the idea I want to get across is 106 00:08:33,464 --> 00:08:38,607 you split off a portion of the address to the word-line creation and a portion of 107 00:08:38,607 --> 00:08:44,327 the address to your column decode here. And this register file, I wanted to just 108 00:08:44,327 --> 00:08:49,113 sort of walk through what this diagram over here, the little circuit 109 00:08:49,113 --> 00:08:53,375 diagram, is. Here we have two cross-coupled inverters. 110 00:08:53,375 --> 00:08:58,148 And as you'll see, we don't have fully complementary logic connecting these two 111 00:08:58,148 --> 00:09:01,056 things, but this is a stable storage cell. 112 00:09:01,056 --> 00:09:07,593 If you give this power it'll store data. If we wanna do a read, we need to connect 113 00:09:07,593 --> 00:09:13,809 the output of this to the read bit lines. And these read bit lines are these 114 00:09:13,809 --> 00:09:19,289 vertical wires running here. And we're gonna do this 115 00:09:19,289 --> 00:09:25,294 with effectively a pass gate here. 
When we energize the read word line, it 116 00:09:25,294 --> 00:09:32,397 connects the output of this gate, or the output of this inverter here, to 117 00:09:32,397 --> 00:09:37,759 the read bit line. And if we wanna do a write, we're going to 118 00:09:37,759 --> 00:09:45,942 turn on, not the read word line, RWL here, but we're turning on the write word line, WWL, 119 00:09:45,942 --> 00:09:53,320 and what that's gonna do is it's going to connect both the Q and Q bar to the write 120 00:09:53,320 --> 00:09:56,717 bit lines. And now we get into this sort of 121 00:09:56,717 --> 00:10:00,901 analog magic here. But in order to have this work, what 122 00:10:00,901 --> 00:10:07,292 we're gonna do is we're gonna energize, or ground, the write 123 00:10:07,292 --> 00:10:14,017 bit line so that it's stronger than what's going on in these two, or stronger than 124 00:10:14,017 --> 00:10:19,063 what either of these inverters can drive. So we overpower the inverter. 125 00:10:22,001 --> 00:10:24,058 And we can flip it, let's say from a zero to a one, or a one to a zero. 126 00:10:24,059 --> 00:10:29,036 So this is not traditional sort of complementary logic, but this is how we 127 00:10:29,036 --> 00:10:33,440 typically build register files. Small little arrays that are close in, let's 128 00:10:33,440 --> 00:10:39,010 say, to a processor. Now, let's move to something a little bit 129 00:10:39,010 --> 00:10:41,089 larger. We are gonna look at memory arrays. 130 00:10:41,089 --> 00:10:46,073 So, these are things like SRAMs. So we go from register files, something 131 00:10:46,073 --> 00:10:50,088 that we'd hold our general-purpose registers for a processor in. 132 00:10:50,088 --> 00:10:56,047 And we are gonna move out a little bit and talk about small arrays, something like 133 00:10:56,047 --> 00:11:01,058 caches, maybe a kilobyte of data, and these are typically built out of SRAMs. 
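The row/column split of the address described for the register file array can be sketched in a few lines. This models the toy array from the lecture (eight bit cells across, a four-bit readout, so a single column-select bit). Which address bits go to the column decoder and how words are placed within a row are assumptions here, as are all of the names.

```python
WORD_BITS = 4        # readout width in the toy example
CELLS_PER_ROW = 8    # bit cells across one physical row
COL_SEL_BITS = 1     # log2(8 / 4): a two-to-one column mux


def split_address(addr: int) -> tuple[int, int]:
    """Split an address into (row, column select).
    Assumption: the low bits feed the column decoder, the rest the row decoder."""
    col = addr & ((1 << COL_SEL_BITS) - 1)
    row = addr >> COL_SEL_BITS
    return row, col


def read_word(array: list[list[int]], addr: int) -> list[int]:
    """Read one word: the row decoder turns on one word line,
    then the column mux picks WORD_BITS of the cells on that row."""
    row, col = split_address(addr)
    word_lines = [int(r == row) for r in range(len(array))]  # one-hot decode
    selected_row = array[word_lines.index(1)]
    start = col * WORD_BITS  # assumption: words sit side by side in a row
    return selected_row[start:start + WORD_BITS]


def to_bits(value: int, width: int) -> list[int]:
    """Most-significant-bit-first list of bits, used only to fill the demo."""
    return [(value >> i) & 1 for i in reversed(range(width))]


# Demo array: 4 rows x 8 cells, where the word at address a holds the value a.
array = [to_bits(2 * r, WORD_BITS) + to_bits(2 * r + 1, WORD_BITS) for r in range(4)]
print(split_address(5), read_word(array, 5))
```

Real arrays often interleave the bits of different words across a row rather than storing them side by side; the contiguous layout here just keeps the sketch short.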
134 00:11:01,058 --> 00:11:07,054 Same structure, we're gonna change the cell a little bit here. 135 00:11:07,054 --> 00:11:14,090 The main difference is, in the cell, we're gonna have two sets of bit lines. 136 00:11:14,090 --> 00:11:20,019 We're gonna have bit and bit bar. And then we're gonna have two pass gates 137 00:11:20,019 --> 00:11:23,899 here, and these are cross-coupled inverters in the middle. 138 00:11:23,899 --> 00:11:29,912 And by doing this, we also have this word line running the other direction, 139 00:11:29,912 --> 00:11:34,199 which connects these cross-coupled inverters to the bit lines. 140 00:11:34,199 --> 00:11:39,878 With these SRAM arrays, typically what happens is, because you 141 00:11:39,878 --> 00:11:45,221 have bit and bit bar, you can have these be even weaker than normal, and when you 142 00:11:45,221 --> 00:11:50,150 go to build something like an SRAM, down here, in addition to the column decode, 143 00:11:50,150 --> 00:11:53,072 you're also gonna have something called sense amps. 144 00:11:53,072 --> 00:11:57,423 These are basically operational amplifiers, where you hook the bit and 145 00:11:57,423 --> 00:12:02,576 the bit bar to them, and they can sense a very small difference between the bit and 146 00:12:02,576 --> 00:12:05,859 bit bar. And by doing this, you can effectively 147 00:12:05,859 --> 00:12:11,122 have a very low difference between bit and bit bar, and be able to sense it at the 148 00:12:11,122 --> 00:12:14,898 other side. And one of the differentiations I wanted 149 00:12:14,898 --> 00:12:22,436 to make between a register file and this SRAM here is that register files many 150 00:12:22,436 --> 00:12:26,133 times are designed to be sort of multi-ported. 151 00:12:26,133 --> 00:12:29,918 But in this diagram here, these bit lines get reused. 152 00:12:29,918 --> 00:12:35,513 They are both read and write bit lines. 
So, we have effectively built a 153 00:12:35,513 --> 00:12:39,397 single-ported SRAM. Having said that, people do build 154 00:12:39,397 --> 00:12:44,748 multi-ported SRAMs and single-ported register files, but conventionally you 155 00:12:44,748 --> 00:12:50,358 build a register file when you need speed and you need lots of ports, and you build 156 00:12:50,358 --> 00:12:53,808 an SRAM when you want to be more dense in your storage. 157 00:12:53,808 --> 00:12:58,307 Okay, so now let's move to a piece of technology which is in all of your 158 00:12:58,307 --> 00:13:02,982 computers. Or, register files and SRAMs are all in your computers, but this is 159 00:13:02,982 --> 00:13:08,094 something that you can actually, like, see. 'Cause usually the SRAMs and the register 160 00:13:08,094 --> 00:13:13,144 files are all integrated onto your central microprocessor, and you don't 161 00:13:13,144 --> 00:13:17,290 actually get to go see it. But here we have a stick of DRAM. 162 00:13:17,290 --> 00:13:22,591 So let's see what this is. This is PC100 DRAM, and this is old stuff. 163 00:13:22,591 --> 00:13:28,231 This is 128 megabytes of RAM. Nowadays your computer has, at least this 164 00:13:28,231 --> 00:13:34,466 laptop here has, four gigabytes of DRAM. And different people have different 165 00:13:34,466 --> 00:13:42,441 amounts, but with DRAM, in contrast with SRAM, you still go and build an array out 166 00:13:42,441 --> 00:13:46,162 of it. But the actual storage looks very 167 00:13:46,162 --> 00:13:50,717 different. The bit cell storage, instead of being 168 00:13:50,717 --> 00:13:58,099 some form of cross-coupled inverters, you're gonna have one transistor 169 00:13:58,099 --> 00:14:02,455 which hooks a capacitor into your bit line. 170 00:14:02,455 --> 00:14:07,242 Now you may ask yourself, how do you build a capacitor? 
171 00:14:07,242 --> 00:14:13,173 For a capacitor, typically you'd need, you know, two plates or two pieces of metal 172 00:14:13,173 --> 00:14:18,008 with some dielectric in the middle, which can store charge. 173 00:14:18,008 --> 00:14:22,220 Well, what's inside of that RAM looks very odd. 174 00:14:22,220 --> 00:14:26,538 It is a capacitor, but it's a very oddly shaped capacitor. 175 00:14:26,538 --> 00:14:32,392 Typically what'll happen is you build these very, very deep trenches, very long 176 00:14:32,392 --> 00:14:36,543 and skinny trenches. You want them skinny 'cause you wanna put 177 00:14:36,543 --> 00:14:41,354 them very close together, 'cause if you want gigabytes and gigabytes of RAM, the 178 00:14:41,354 --> 00:14:45,918 smaller you make it, the more you can shove on a single piece of silicon. 179 00:14:45,918 --> 00:14:52,109 But you have these really long and narrow trenches here, and you have two plates of 180 00:14:52,109 --> 00:14:58,035 metal and then a dielectric in the middle. So in this case here, we have, I think, 181 00:14:58,035 --> 00:15:03,030 there's two metal sort of plates here and then there's some dielectric sort of 182 00:15:03,030 --> 00:15:07,070 shoved in between there, but you can't really see it in this picture. 183 00:15:08,032 --> 00:15:14,031 And then all the actual logic is up here. So, in this diagram here, this is the 184 00:15:14,031 --> 00:15:17,089 transistor. And [inaudible], and here's the 185 00:15:18,013 --> 00:15:23,083 depletion region, here's our word line. But basically, that's this 186 00:15:23,083 --> 00:15:27,387 transistor here. So, we're going to be connecting the 187 00:15:27,387 --> 00:15:34,225 capacitor, which has a very funny aspect ratio, very tall, to this bit line. 188 00:15:34,225 --> 00:15:40,697 And to give you some idea, this is a slice through the silicon wafer. 189 00:15:40,697 --> 00:15:46,567 This is not a plan view or top view. 
Top view, you would just see what looks 190 00:15:46,567 --> 00:15:51,633 like a transistor and then some poly plug here running vertically; it would look 191 00:15:51,633 --> 00:15:53,617 really small. But this is a slice. 192 00:15:53,617 --> 00:15:59,274 And the reason we do this is 'cause we wanna see that the actual capacitor is 193 00:15:59,274 --> 00:16:04,077 very long, very deep into this. And what they want to point out is this is 194 00:16:04,077 --> 00:16:08,369 typically really hard to go build on your standard CMOS process. 195 00:16:08,369 --> 00:16:11,803 So this is hard to go build on something like a logic process. 196 00:16:11,803 --> 00:16:17,018 You have to go build this, let's say, in a special DRAM-only manufacturing process. 197 00:16:17,018 --> 00:16:20,352 So it's sometimes hard to mix that. 198 00:16:20,352 --> 00:16:25,440 There are some technologies which allow you to mix them, but when you do that, 199 00:16:25,440 --> 00:16:28,885 people haven't really designed ways to make them as small. 200 00:16:28,885 --> 00:16:33,566 But DRAM cells are small. Okay, so what are the advantages of 201 00:16:33,566 --> 00:16:36,397 DRAM? Why are we even talking about DRAM? 202 00:16:36,397 --> 00:16:40,888 Well, it's a lot easier in DRAM to have large amounts of storage. 203 00:16:40,888 --> 00:16:44,663 You can have big amounts of storage in the same area. 204 00:16:44,663 --> 00:16:50,553 Because instead of having, I don't know, in SRAM, six or eight transistors for each 205 00:16:50,553 --> 00:16:53,931 cell, we now have one transistor and 206 00:16:53,931 --> 00:16:56,683 one capacitor. So it's actually less. 207 00:16:56,683 --> 00:17:01,505 Now, how does this circuit work? Well, logically, what we're gonna do is 208 00:17:01,505 --> 00:17:04,865 we're gonna store data, or store charge, in the capacitor. 209 00:17:04,865 --> 00:17:07,828 So we're gonna connect the capacitor to the bit line. 
210 00:17:07,828 --> 00:17:10,817 We're gonna either put a one or a zero on the bit line. 211 00:17:10,817 --> 00:17:14,924 And then we're gonna disconnect it and it'll hopefully store the charge. 212 00:17:14,924 --> 00:17:18,859 And at some point in the future, we're gonna connect it, and we're gonna 213 00:17:18,859 --> 00:17:23,218 discharge the capacitor into the bit line and read out the value. 214 00:17:23,218 --> 00:17:28,509 And we still need something at the bottom here which is very sensitive to read out 215 00:17:28,509 --> 00:17:31,020 this bit. And the reason is 'cause the charge 216 00:17:31,020 --> 00:17:34,488 that you can store in one of these little capacitors is a very small amount, so 217 00:17:34,488 --> 00:17:37,647 there's not a whole lot of charge in the circuit. 218 00:17:37,647 --> 00:17:38,386 Hm. Okay. 219 00:17:38,386 --> 00:17:42,403 So what are the problems with this? 220 00:17:42,403 --> 00:17:48,094 Well, first, one of the major problems with this is you're gonna end up having 221 00:17:48,094 --> 00:17:55,591 this capacitor discharged and charged, and capacitors, as you may recall, don't always 222 00:17:55,591 --> 00:18:01,016 store their charge that well. They might slowly lose that charge over 223 00:18:01,016 --> 00:18:04,044 time. And what this turns out to mean is you're 224 00:18:04,044 --> 00:18:07,056 actually going to have to refresh the DRAM. 225 00:18:07,056 --> 00:18:13,005 So, you might have heard of DRAM refresh. But typically, you know, in a 226 00:18:13,005 --> 00:18:18,076 modern-day computer, your DRAM will only hold the data for maybe a few 227 00:18:18,076 --> 00:18:22,038 seconds. It used to be that it only held it for 228 00:18:22,038 --> 00:18:25,096 a few milliseconds. Most RAM's actually decent now, and you've 229 00:18:25,096 --> 00:18:28,862 probably seen some attacks that people have built around this. 
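The store, leak, and refresh behaviour just described can be sketched as a toy model of a single one-transistor, one-capacitor cell. Everything numeric here is an assumption for illustration (normalized charge, an exponential leak with a made-up time constant, a 0.5 sensing threshold), and the class and method names are hypothetical. The write-back after a read reflects the fact that reading dumps the cell's charge onto the bit line.

```python
import math


class DramCell:
    """Toy DRAM bit cell. All constants are illustrative, not real device data."""

    def __init__(self, retention_tau_s: float = 1.0):
        self.tau = retention_tau_s  # hypothetical leakage time constant, seconds
        self.charge = 0.0           # normalized charge: 1.0 means fully charged

    def write(self, bit: int) -> None:
        """Connect the capacitor to the bit line and charge or discharge it."""
        self.charge = 1.0 if bit else 0.0

    def leak(self, seconds: float) -> None:
        """Let the stored charge decay while the cell sits disconnected."""
        self.charge *= math.exp(-seconds / self.tau)

    def read(self) -> int:
        """Sense the charge, then write the sensed value back into the cell."""
        bit = 1 if self.charge > 0.5 else 0  # assumed sense threshold
        self.write(bit)
        return bit

    def refresh(self) -> None:
        """A refresh is just a read followed by the write-back."""
        self.read()


# Without refresh, a stored one eventually decays below the threshold.
unrefreshed = DramCell()
unrefreshed.write(1)
unrefreshed.leak(1.0)
print("no refresh:", unrefreshed.read())

# With a refresh halfway through the same interval, the one survives.
refreshed = DramCell()
refreshed.write(1)
refreshed.leak(0.5)
refreshed.refresh()
refreshed.leak(0.5)
print("with refresh:", refreshed.read())
```

A real memory controller refreshes whole rows on a fixed schedule rather than single cells on demand; the single-cell view is only meant to show why a periodic read and write-back keeps the data alive.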
230 00:18:28,862 --> 00:18:33,112 Like, there's some encryption attacks where people will effectively turn off a 231 00:18:33,112 --> 00:18:35,905 computer and then pull the RAM out and stick it in a different computer, pull the 232 00:18:35,905 --> 00:18:37,299 DRAM out and stick it in a different computer. 233 00:18:37,299 --> 00:18:40,430 And the DRAM will still hold the charge, still hold the information, even if you 234 00:18:40,430 --> 00:18:44,720 remove the electricity or remove the power from it, because it has a bunch of 235 00:18:44,720 --> 00:18:47,436 little capacitors that will store that charge. 236 00:18:47,436 --> 00:18:50,451 That's a funny little case with this DRAM. 237 00:18:50,451 --> 00:18:55,282 It ends up being a negative. But we're really doing this to have more 238 00:18:55,282 --> 00:18:59,726 space to store data, because each of these bit cells is a lot smaller. 239 00:18:59,726 --> 00:19:06,019 Okay, so I like this diagram; this is one of the more key diagrams in today's 240 00:19:06,019 --> 00:19:12,549 lecture. Here it's showing the relative sizes of SRAM versus DRAM, and different 241 00:19:12,549 --> 00:19:18,787 heights of SRAM versus DRAM. So let's start off here with 242 00:19:18,787 --> 00:19:27,570 an SRAM cell built on a logic process, a logic CMOS technology. 243 00:19:27,570 --> 00:19:37,217 So that's this one here. This looks to be six transistors, so sort 244 00:19:37,217 --> 00:19:44,098 of optimized SRAM on a logic process, and this is pretty big. 245 00:19:44,098 --> 00:19:50,097 Let's contrast that, though, with this one over here: we have DRAM on a memory 246 00:19:53,029 --> 00:19:55,798 specific process and it's tiny, so it's not only one transistor versus six 247 00:19:55,801 --> 00:20:02,308 transistors. It's actually more than six times smaller, because they can optimize 248 00:20:02,308 --> 00:20:06,218 this. 
They can go into the Z dimension here, go 249 00:20:06,218 --> 00:20:10,964 in and out of the board. Because the trench capacitor 250 00:20:10,964 --> 00:20:17,026 goes down into the substrate. And some of the other interesting things 251 00:20:17,026 --> 00:20:24,326 on this diagram here: we have a DRAM on an ASIC process, or DRAM on a traditional 252 00:20:24,326 --> 00:20:30,636 ASIC process, which is here. Here we have a six-transistor cell with 253 00:20:30,636 --> 00:20:34,401 local interconnect. It's a little bit smaller, so what that 254 00:20:34,401 --> 00:20:40,076 means by local interconnect is you can use the poly layer of your process, the polysilicon 255 00:20:40,076 --> 00:20:44,404 layer, to do interconnections. So you don't have to use wires for 256 00:20:44,404 --> 00:20:47,794 everything. So it gets a little bit denser, the layout 257 00:20:47,794 --> 00:20:52,052 gets a little bit denser. And I really like this bottom one here. 258 00:20:52,052 --> 00:20:58,006 Yep, the bottom one labeled A is not... It's not four different things. 259 00:20:58,006 --> 00:21:04,057 Instead, what that is, this is a fully complementary logic cell, a storage cell 260 00:21:04,057 --> 00:21:09,067 built out of gates. So, it's some number of gates put together 261 00:21:09,067 --> 00:21:13,017 here. So what this is trying to get 262 00:21:13,017 --> 00:21:19,035 across here is that the addition of custom logic cells, here, this one, this 263 00:21:19,035 --> 00:21:23,069 one, storage cells in your library, is really important. 264 00:21:23,069 --> 00:21:30,059 Because otherwise your RAM's going to be a lot, lot larger, or you won't be able to 265 00:21:30,059 --> 00:21:36,026 fit as much memory on your machine. Okay, so to wrap this memory technology 266 00:21:36,026 --> 00:21:39,094 section up, I wanna talk about some of the tradeoffs. 267 00:21:39,094 --> 00:21:43,041 So computer architecture is all about the tradeoffs. 
268 00:21:43,041 --> 00:21:48,015 And why would we use one type of technology versus another type of 269 00:21:48,015 --> 00:21:51,013 technology? So, what are our tradeoffs here? 270 00:21:51,034 --> 00:21:54,670 Or we can go from fast, close, small things. 271 00:21:54,670 --> 00:22:00,054 So things like latches and registers. We can sort of put them together into 272 00:22:00,054 --> 00:22:04,125 bigger things, like a register file and 273 00:22:04,125 --> 00:22:07,192 then SRAM, and have different technologies even. 274 00:22:07,192 --> 00:22:12,484 And as we sort of get into bigger and bigger memories, we get a lot more 275 00:22:12,484 --> 00:22:18,090 capacity, but it takes longer to access them, kind of by definition. And typically we 276 00:22:18,090 --> 00:22:21,888 have less bandwidth. But if we have small things, we have low 277 00:22:21,888 --> 00:22:25,016 capacity, low latency, and very high bandwidth. 278 00:22:25,016 --> 00:22:30,004 So it's sort of a tradeoff of capacity versus the other positive aspects, and 279 00:22:30,004 --> 00:22:34,091 depending on where you put it in your memory system, 280 00:22:34,091 --> 00:22:36,039 you might want to trade these off