1 00:00:03,580 --> 00:00:09,770 Well, Computer architects came up with something 2 00:00:09,770 --> 00:00:12,768 fun here. Came up with this notion of paging. 3 00:00:12,768 --> 00:00:18,420 So instead of having. Variable size programs. 4 00:00:19,140 --> 00:00:23,257 And at variable size segments or data segments. 5 00:00:23,257 --> 00:00:28,426 Instead, what you can do is you can take all of your memory. 6 00:00:28,426 --> 00:00:33,507 And you can chunk it up into different fixed size regions. 7 00:00:33,507 --> 00:00:40,604 So, for lack of anything better, let's say you have all of memory, and you chunk it 8 00:00:40,604 --> 00:00:45,160 up into four kilobyte regions. And we call it a page. 9 00:00:46,436 --> 00:00:51,700 So, what is, what is this really, turned it to? 10 00:00:51,700 --> 00:00:58,360 Well. Let's say we have the address space of a 11 00:00:58,360 --> 00:01:04,085 user here. And it thinks memory goes zero, one, two, 12 00:01:04,085 --> 00:01:08,260 three. Over here we have physical memory. 13 00:01:09,480 --> 00:01:17,518 And what you can end up with is a table which will map these different addresses, 14 00:01:17,518 --> 00:01:23,361 into different locations, possibly out of order, over there and it's a chunk at a 15 00:01:23,361 --> 00:01:26,136 time. So, if you go to access, let's say, 16 00:01:26,136 --> 00:01:30,810 address zero to four kilobytes minus one. You are accessing here. 17 00:01:30,810 --> 00:01:35,995 If you go to access four kilobytes to eight kilobytes minus one, you're 18 00:01:35,995 --> 00:01:39,720 accessing here. But this is a virtual address space. 19 00:01:40,380 --> 00:01:49,460 Once you go through translation, if we go to access, let's say, address 4,182 20 00:01:49,900 --> 00:01:53,740 rushing at mapped, up over to here into physical memory. 21 00:01:55,640 --> 00:01:59,054 And just to introduce some nomenclature here. 22 00:01:59,054 --> 00:02:03,536 If you take a processor generated address or a virtual address. 23 00:02:03,536 --> 00:02:07,164 If you go to look at what is actually inside there, 24 00:02:07,164 --> 00:02:10,578 There's a page number which is this number here. 25 00:02:10,578 --> 00:02:15,843 Which is which page to go look at. And then there's an offset, which is the 26 00:02:15,843 --> 00:02:19,400 low order bits of the address. And the offset says. 27 00:02:19,940 --> 00:02:26,060 We're inside of this physical page to go find the exact byte we're looking for. 28 00:02:26,360 --> 00:02:32,023 And it's pretty common, for these pages sizes, to be in this sort of few kilobytes 29 00:02:32,023 --> 00:02:34,960 sort of range. If you go look at something. 30 00:02:34,960 --> 00:02:39,302 I don't know, we'll, we'll, we'll take X86 for example. 31 00:02:39,302 --> 00:02:45,935 X86 default page size is four kilobytes but it also as a, two megabyte option and 32 00:02:45,935 --> 00:02:51,620 like a one gigabyte option. Some architectures like MIPS actually has 33 00:02:51,857 --> 00:02:57,463 quite a few page size options. So I think you can go anywhere from four 34 00:02:57,463 --> 00:03:03,780 kilobytes by maybe powers of two all the way up to I think maybe four megabytes. 35 00:03:03,780 --> 00:03:09,585 So you can have, different page sizes, but for right now, let's just, abstract that 36 00:03:09,585 --> 00:03:14,175 and just say, there's a fixed size of some small number of kilobytes. 37 00:03:14,175 --> 00:03:19,035 Actually, the first original machines actually had these even be smaller. 38 00:03:19,035 --> 00:03:23,220 They had'em, maybe about 512 bytes. But, As memory got bigger. 39 00:03:23,220 --> 00:03:27,419 Page sizes can get bigger. And we'll, we'll talk about why you might 40 00:03:27,419 --> 00:03:31,682 want to have bigger page sizes. But really what happens here, is page 41 00:03:31,682 --> 00:03:36,571 table allows us to take a contiguous memory, a set of memory addresses and map 42 00:03:36,571 --> 00:03:42,524 them into a non contiguous space. Now, this can really get away from having 43 00:03:42,524 --> 00:03:47,000 fragmentation. Because all of a sudden. 44 00:03:47,520 --> 00:03:53,374 The worst we can be, the, the most amount of memory we can be wasting is a size of a 45 00:03:53,374 --> 00:03:56,689 page. So, let's say that you have one byte on 46 00:03:56,689 --> 00:04:02,120 this page here and we call that as used. Well, you're, you're wost thing you're 47 00:04:02,120 --> 00:04:07,833 using is four, four kilobytes here if it's a four kilobyte page and this page and 48 00:04:07,833 --> 00:04:13,052 this page some other process can use because we have a flexible remapping 49 00:04:13,052 --> 00:04:16,437 system. It's a, an a, it's a complete mathematical 50 00:04:16,437 --> 00:04:22,094 map. So that's looking at it if you have one 51 00:04:22,094 --> 00:04:28,145 program. We can extend this pretty easily to having 52 00:04:28,145 --> 00:04:33,656 multiple programs at the same time where each program has a different page table. 53 00:04:33,656 --> 00:04:38,963 So if you have a processor, which is actually multiple programs, let's say it's 54 00:04:38,963 --> 00:04:43,657 time-multiplexing these programs, in physical memory you can actually 55 00:04:43,657 --> 00:04:49,100 intermingle all the different pages of all the different programs and do memory 56 00:04:49,100 --> 00:04:54,475 allocation where you try to pack the memory as tight as possible or re-use the 57 00:04:54,475 --> 00:04:59,647 memory as much as possible. And these really flexible page tables here 58 00:04:59,647 --> 00:05:04,983 will allow the remapping to be arbitrary. So you don't have these frag, at least you 59 00:05:04,983 --> 00:05:07,775 don't have an external fragmentation problem. 60 00:05:07,775 --> 00:05:12,986 You may still have what's called internal fragmentation, where you're just not using 61 00:05:12,986 --> 00:05:18,340 the page tables very effectively. So this is interesting up here. 62 00:05:18,720 --> 00:05:26,377 Let's figure here, we have OS pages. So, what, what's, what are OS pages? 63 00:05:26,377 --> 00:05:30,640 Well, the OS itself needs its own code and its own data. 64 00:05:31,060 --> 00:05:36,920 And many times in a paging system, DOS itself may not use a page table to go 65 00:05:36,920 --> 00:05:42,546 access memory. Because it needs to go, let's say, start 66 00:05:42,546 --> 00:05:47,584 up another program, so it needs to be able to touch these bytes here, and go, you 67 00:05:47,584 --> 00:05:51,920 know, kill a program or load some data from a particular application. 68 00:05:51,920 --> 00:05:56,448 So many times, the OS is, to large extent, held to a higher standard here. 69 00:05:56,448 --> 00:06:01,549 And the OS can actually go and touch all the physical memory and not have to go 70 00:06:01,549 --> 00:06:08,590 through a mapping layer. Okay, so here we have private address 71 00:06:08,590 --> 00:06:13,094 spaces per user. So, something we have not addressed yet 72 00:06:13,094 --> 00:06:16,880 is. Where do we store this stuff? 73 00:06:17,740 --> 00:06:26,680 And, how big are these page tables? Well, they can get pretty big. 74 00:06:28,020 --> 00:06:34,140 If you want to basically have a program that can address four gigabytes of RAM. 75 00:06:35,000 --> 00:06:39,911 You're going to need. A page for each four kilobytes of those 76 00:06:39,911 --> 00:06:45,840 four gigabytes, that's a lot of pages. So you need to store this somewhere. 77 00:06:46,820 --> 00:06:52,038 So the naive approach is you just have specialized register files in your 78 00:06:52,038 --> 00:06:59,418 processor that hold the entire page table. Unfortunately, that ends up being well, 79 00:06:59,418 --> 00:07:06,525 for a 4-gigabyte space with 4-kilobyte pages, ends up being something like, with 80 00:07:06,525 --> 00:07:13,813 I think with a 32-bit page table entry, we end up with something like 4-megabytes. 81 00:07:13,813 --> 00:07:20,650 And if we have 100 processes running, that's 400 megabytes in which to store 82 00:07:20,650 --> 00:07:22,000 somewhere. Well, 83 00:07:23,040 --> 00:07:28,162 And these we probably don't want to keep it in registers, or register files in our 84 00:07:28,162 --> 00:07:35,913 main processor. Now, conveniently, as we have more and 85 00:07:35,913 --> 00:07:42,013 more RAM. This allows us to have more and more 86 00:07:42,013 --> 00:07:46,000 processes to run. It allows us to use more memory, cuz we 87 00:07:46,000 --> 00:07:50,770 have more RAM in our machine. But also, we have more places to store 88 00:07:50,770 --> 00:07:53,760 page tables, if we want to put them in RAM. 89 00:07:56,680 --> 00:08:02,498 Okay, so. Maybe we'll put our page tables in RAM, 90 00:08:02,498 --> 00:08:08,143 because as RAM goes bigger, we might need more page tables, but we have some place 91 00:08:08,143 --> 00:08:11,140 to go put it. And it's always the case that. 92 00:08:14,480 --> 00:08:19,335 Well, if, if we design this correctly. If our page tables entries are designed 93 00:08:19,335 --> 00:08:22,721 correctly. That the size of a page table to represent 94 00:08:22,721 --> 00:08:27,895 a certain amount of ram is gonna be less amount of ram that you have to store in. 95 00:08:27,895 --> 00:08:31,856 So some fixed overhead. So let's take for example, a page table 96 00:08:31,856 --> 00:08:36,073 entry that takes four bytes to represent four kilobytes of memory. 97 00:08:36,073 --> 00:08:40,864 That's a 1,000 to one compression. So as you get more ram, you're guaranteed 98 00:08:40,864 --> 00:08:45,655 to always have some place to put it because your ram size will grow 1,000x 99 00:08:45,655 --> 00:08:49,749 faster than your page, page size. Okay. 100 00:08:49,749 --> 00:08:57,680 Challenges of storing this in RAM. If we want to go access a page. 101 00:08:58,100 --> 00:09:01,940 When we go to access a page here, we have to index this table. 102 00:09:02,480 --> 00:09:06,178 So every load we do, before was just a single load. 103 00:09:06,178 --> 00:09:12,096 But now we're gonna have to do a load to go read the page table, and a load from 104 00:09:12,096 --> 00:09:16,428 the actual physical memory location. Huh. 105 00:09:16,428 --> 00:09:21,511 Well we just doubled our number of memory references that we have. 106 00:09:21,511 --> 00:09:27,211 And that's, that's kind of a bummer. So, what do we, what do we do about this? 107 00:09:27,211 --> 00:09:33,528 Well, well before we talk about that, lets talk about where to store the page table, 108 00:09:33,528 --> 00:09:37,841 and the size. And then we talk about techniques to speed 109 00:09:37,841 --> 00:09:41,080 up the access. . 110 00:09:41,600 --> 00:09:47,140 So here let's take a look at where, where to store, a page table. 111 00:09:47,580 --> 00:09:53,114 If we have our linear page tables, what we've been talking about up to this point. 112 00:09:53,114 --> 00:09:58,853 You can actually just go store them in RAM and they'll map down to different points 113 00:09:58,853 --> 00:10:01,860 down here, and this, this works decently well. 114 00:10:03,020 --> 00:10:07,400 Unfortunately these still have to be contiguous themselves. 115 00:10:08,080 --> 00:10:12,727 So we might get fragmentation problems in the page tables themselves. 116 00:10:12,727 --> 00:10:16,835 Or different page tables fighting with each other, for memory. 117 00:10:16,835 --> 00:10:21,011 They're hard to move. So we like to think about some way to go 118 00:10:21,011 --> 00:10:24,311 solve this. We'll hold off for a second on how to 119 00:10:24,311 --> 00:10:27,543 solve that. But this is something to think about. 120 00:10:27,543 --> 00:10:30,440 It's kind of unpleasant here, to think that. 121 00:10:30,880 --> 00:10:34,660 Your page tables themselves, cause fragmentation. 122 00:10:38,640 --> 00:10:45,029 So, I just want to sort of recap and show the mechanics of this, of what's, what's 123 00:10:45,029 --> 00:10:49,862 in our page table. So in, in our page table what we're going 124 00:10:49,862 --> 00:10:52,281 to store. Is, we're gonna have. 125 00:10:52,281 --> 00:10:54,703 Well, first of all, we're gonna have, a, a. 126 00:10:54,703 --> 00:10:59,173 Instead of having a base and bound register, like we had in the base and 127 00:10:59,173 --> 00:11:03,085 bound architectures. We're gonna have a base that points to the 128 00:11:03,085 --> 00:11:07,680 base of the virtual that points to the base of the page table. 129 00:11:08,860 --> 00:11:12,375 And then, we're going to have some structure in round. 130 00:11:12,375 --> 00:11:17,284 Or structure in the page table here. And what does the structure look like. 131 00:11:17,284 --> 00:11:27,438 Well, there's probably a valid bit. There's a, a physical page number which 132 00:11:27,438 --> 00:11:35,980 points to some place over here. And maybe there's some status bits also. 133 00:11:36,300 --> 00:11:41,282 We'll talk about those more in a minute. But, you probably want some status 134 00:11:41,282 --> 00:11:45,718 registers there. Contracts some information like how often 135 00:11:45,718 --> 00:11:50,261 a page is being used, which pages are used, maybe some statistics. 136 00:11:50,261 --> 00:11:55,300 Maybe you want to know whether you can read the page or write the page. 137 00:11:59,080 --> 00:12:04,458 Now, This last node here is, is sort of the key 138 00:12:04,458 --> 00:12:09,240 to time multiplexing of processor between different processes. 139 00:12:11,140 --> 00:12:16,022 As different programs execute and your time slice are preemptively multi task 140 00:12:16,022 --> 00:12:21,155 between different operating systems or just give an applications of one operating 141 00:12:21,155 --> 00:12:23,909 system. What you're going to do you're actually 142 00:12:23,909 --> 00:12:29,440 going to change this base register. To something else. 143 00:12:29,880 --> 00:12:33,799 And effectively what this does, is it completely changes your entire map of 144 00:12:33,799 --> 00:12:38,033 memory as you're executing, and this is, pretty, pretty fancy if you think about 145 00:12:38,033 --> 00:12:39,967 it. You can just, one, one right to one 146 00:12:39,967 --> 00:12:43,260 register and all of a sudden all of the memory looks different. 147 00:12:44,140 --> 00:12:47,692 And the protection is totally different for the, the application. 148 00:12:47,692 --> 00:12:52,132 Now, you might be thinking to yourself, if I'm running the operating system, and I 149 00:12:52,132 --> 00:12:55,628 change this register. All of a sudden, it changes the entire map 150 00:12:55,628 --> 00:12:58,403 of memory. How do I not crash the operating system 151 00:12:58,403 --> 00:13:01,955 instantaneously?'Cause Cuz all of a sudden, the operating systems code just 152 00:13:01,955 --> 00:13:06,586 got to some other location. Well, there's some, there's some magic 153 00:13:06,586 --> 00:13:11,120 behind, hind the scenes here. This is what I was saying, is that there 154 00:13:11,120 --> 00:13:16,320 are a couple approaches to this that. The basic approach is that the operating 155 00:13:16,320 --> 00:13:20,120 system, will not use paged memory for its own purposes. 156 00:13:20,120 --> 00:13:25,102 It just uses physical memory for everything and there's some special bit 157 00:13:25,102 --> 00:13:30,494 that says that you're in operating system mode and what you're doing basically 158 00:13:30,494 --> 00:13:35,818 doesn't abide by the paging system. That's one approach. Another approach is 159 00:13:35,818 --> 00:13:42,786 you'll have entry's in the page table here that are all the same in every process And 160 00:13:42,786 --> 00:13:45,628 it always maps the operating system in, at a certain location. 161 00:13:45,628 --> 00:13:49,403 So we can change this willy-nilly, but all the things you go to change it to, will 162 00:13:49,403 --> 00:13:52,199 have the operating system mapped to the exact same location. 163 00:13:52,199 --> 00:13:56,020 That's another approach that people use. And this whole thing's kind of in between 164 00:13:56,020 --> 00:14:00,713 those. Just to flip back here, one thing I do 165 00:14:00,713 --> 00:14:06,569 want to point out is it's important that an application is not able to modify its 166 00:14:06,569 --> 00:14:10,640 own page table or modify other applications page tables. 167 00:14:11,080 --> 00:14:17,358 And hence, you know, this application here tries to go access something. 168 00:14:17,358 --> 00:14:20,834 It looks up at the page table, and it says, oh, that's down there. 169 00:14:20,834 --> 00:14:25,138 It's important that this does not, like, overlap with someone else's page table. 170 00:14:25,138 --> 00:14:29,055 All of a sudden, you'd allow an application to go, be able to change its 171 00:14:29,055 --> 00:14:31,980 own memory mapping. And that would be quite dangerous. 172 00:14:32,840 --> 00:14:40,170 Okay so far we've talked about oh, actually before we move off this Let's 173 00:14:40,170 --> 00:14:45,960 hope that, exact mechanics of this. So you have a virtual address. 174 00:14:46,440 --> 00:14:50,520 Has this virtual page number in the offset. 175 00:14:51,120 --> 00:14:56,371 This is actually indexed into this table. Just a basic index. 176 00:14:56,371 --> 00:15:01,054 It's, it's one huge table. I just indexes in it here and got some 177 00:15:01,054 --> 00:15:06,235 bits and those bits get basically replaced over the to, to, to take the virtual 178 00:15:06,235 --> 00:15:11,608 addressing of physical address, over take the virtual addressing we take the virtual 179 00:15:11,608 --> 00:15:16,597 page number and we are going to remove that or do that indexned here and what 180 00:15:16,597 --> 00:15:21,650 comes out of here is going to be used to be the new high order bits of this 181 00:15:21,650 --> 00:15:25,360 addressing, we are going to call that the physical address. 182 00:15:30,080 --> 00:15:35,604 One last note about this, we talk about here a disc page number, we're going to 183 00:15:35,604 --> 00:15:39,360 talk about this more at the end of today's lecture. 184 00:15:39,740 --> 00:15:45,260 But it's possible that you might want to use this location here to say the bits for 185 00:15:45,260 --> 00:15:48,678 that page are not here right now, they maybe on disc. 186 00:15:48,678 --> 00:15:54,002 We'll talk about, wanna talk about more advances is virtual memory sub system but 187 00:15:54,002 --> 00:15:59,259 lot of times a convenient place to locate that information is in the page table 188 00:15:59,259 --> 00:16:03,032 itself. Okay, 189 00:16:03,032 --> 00:16:07,020 So let's, let's crank through some math very quickly here. 190 00:16:07,840 --> 00:16:14,522 This page table is relatively large so let's take a example here, a 32 bit 191 00:16:14,522 --> 00:16:19,488 address space. It has four kilobyte pages and each page 192 00:16:19,488 --> 00:16:25,719 table entry, PTE, is four bytes. So that's two to the twenty page table 193 00:16:25,719 --> 00:16:33,214 entries, each of those is four bytes big, this is a megabyte times four, we end up 194 00:16:33,214 --> 00:16:39,328 with four megabytes. Okay, so that's four megabytes for this 195 00:16:39,328 --> 00:16:45,517 table per user process. And this assumes that we only have, let's 196 00:16:45,517 --> 00:16:50,276 say four gigabytes worth of address spaces address space. 197 00:16:50,276 --> 00:16:53,676 Four gigabytes, four gigabytes address spaces. 198 00:16:53,902 --> 00:16:59,040 That's not too horrible if you have a thousand processes running, 199 00:17:00,100 --> 00:17:03,685 That is pretty bad. And if you go type top on your system and 200 00:17:03,685 --> 00:17:06,487 you go look to see how many processes are running. 201 00:17:06,487 --> 00:17:10,913 You probably have, a few 100 running. So if we use this linear page table notion 202 00:17:10,913 --> 00:17:15,620 and you got a 1,000 processes running, all of a sudden you'd be using four gigabytes 203 00:17:15,620 --> 00:17:19,206 of memory which is your entire dress space on a 3.2 bit machine. 204 00:17:19,206 --> 00:17:22,680 Just for the page tables and have no place to put actual data. 205 00:17:23,920 --> 00:17:28,407 So I'll give you guys a hint here that people typically don't usually use linear, 206 00:17:28,407 --> 00:17:33,340 linear page tables. Because it's not very storage effective. 207 00:17:33,640 --> 00:17:37,160 Now, what are some approaches to making more storage affective? 208 00:17:37,880 --> 00:17:45,771 You can go to larger pages. So instead of having four kilobyte pages, 209 00:17:45,771 --> 00:17:49,754 we have one megabyte pages. So is, that would, that would take the 210 00:17:49,754 --> 00:17:53,240 number of page table entries and dramatically shrink it. 211 00:17:53,580 --> 00:17:57,189 And it would also directly shrink the linear page table size. 212 00:17:57,189 --> 00:18:01,212 Some downside is you start to have some more internal fragmentation. 213 00:18:01,212 --> 00:18:05,886 So if you use one byte onto a new page, and the whole rest of the page is empty, 214 00:18:05,886 --> 00:18:10,087 you'll be wasting a megabyte minus one versus four kilobytes minus one. 215 00:18:10,087 --> 00:18:13,460 So you'll have a higher amount of internal fragmentation. 216 00:18:14,340 --> 00:18:19,823 Another challenge is that, on a page fault, you're going to have to pull in a 217 00:18:19,823 --> 00:18:24,180 lot more data. Instead of pulling off of disk we'll say 218 00:18:24,180 --> 00:18:27,940 four kilobytes of memory. You might pull a megabyte. 219 00:18:28,118 --> 00:18:32,560 It's just bigger, bigger numbers to deal with, and, and this could be painful. 220 00:18:34,180 --> 00:18:40,297 So let's, let's think about actually the. Let's, let's move from a, old computer to 221 00:18:40,297 --> 00:18:44,571 a more modern computer. So a more modern computer, we have much 222 00:18:44,571 --> 00:18:48,432 larger, address spaces. We want to access larger, amounts of 223 00:18:48,432 --> 00:18:51,672 memory. So, you know, my laptop has six gigabytes 224 00:18:51,672 --> 00:18:54,912 of RAM. Well, six gigabytes of RAM, that's bigger 225 00:18:54,912 --> 00:18:58,566 than two to the 32. We can't address that, in a 32 bit 226 00:18:58,566 --> 00:19:01,668 machine. And in fact my, my laptop is a, 64bit 227 00:19:01,668 --> 00:19:04,560 machine. So. 228 00:19:04,920 --> 00:19:10,654 Let's, let's crank through the math now. For, instead of having a 32-bit machine, 229 00:19:10,654 --> 00:19:16,124 having a 64bit machine. Well, if we have relatively large pages 230 00:19:16,124 --> 00:19:21,268 let's go for a megabyte page. And, well, a megabyte page is going to 231 00:19:21,268 --> 00:19:27,882 give us two to the 44 page table entries. And these page table entries themselves 232 00:19:27,882 --> 00:19:34,577 are probably bigger because the output of them has to be lets say, 64 bits instead 233 00:19:34,577 --> 00:19:38,660 of 32 bits because the page address is larger now. 234 00:19:38,940 --> 00:19:45,800 Well, all of a sudden, one process, page tables 35 terabytes. 235 00:19:46,920 --> 00:19:53,812 Well, my laptop is a 64-bit machine, but I don't have 35 terabytes of ram in my 236 00:19:53,812 --> 00:19:58,141 computer. Doesn't, just won't, won't physically fit 237 00:19:58,141 --> 00:20:02,736 in my laptop. Can't, can't lift up 35 terabytes in 238 00:20:02,736 --> 00:20:07,242 today's, today's technologies. That's a bummer. 239 00:20:07,507 --> 00:20:14,841 So can we just not go to page systems with 64-bit address spaces or larger address 240 00:20:14,841 --> 00:20:17,990 spaces? Well, no. 241 00:20:17,990 --> 00:20:21,743 The world's, the world's not that, unforgiving. 242 00:20:21,743 --> 00:20:28,231 We do have some, some saving grace here. We can think about going to different 243 00:20:28,231 --> 00:20:34,329 arrangements of our page tables. And the other nice, thing is if we go to, 244 00:20:34,329 --> 00:20:41,521 to more space dense arrangement or page tables or we can go to the notion that our 245 00:20:41,521 --> 00:20:46,236 page tables. They'll always have every single page 246 00:20:46,236 --> 00:20:51,058 being used. So for instance, in this linear page table 247 00:20:51,058 --> 00:20:56,497 here, I have some, slash locations. And what this means is this is some 248 00:20:56,497 --> 00:20:59,760 location in memory, which doesn't have anything mapped there. 249 00:21:00,960 --> 00:21:07,299 So if this is not mapped. Why do we want to store or use the storage 250 00:21:07,299 --> 00:21:10,532 for that space? So, we can actually go with a sparse 251 00:21:10,532 --> 00:21:15,605 representation of our page tables that only have the portions of the page table 252 00:21:15,605 --> 00:21:20,530 that we actually need. And it's possible that if you bet, if 253 00:21:20,530 --> 00:21:24,582 every single page in your page table is filled out, you may not be able to get 254 00:21:24,582 --> 00:21:30,450 away with a sparse representation, and you might need all 35 terabytes to map one 255 00:21:30,450 --> 00:21:32,616 process, This is one process. 256 00:21:32,616 --> 00:21:36,871 If you have a 1,000 processes, multiply this by a 1,000. 257 00:21:37,103 --> 00:21:43,136 Well the, the, the saving grace there is, by definition, if you're gonna have one 258 00:21:43,136 --> 00:21:46,540 process which is going to map two to the 64. 259 00:21:47,060 --> 00:21:50,448 Locations of memory. That means you have 264 locations of 260 00:21:50,448 --> 00:21:55,084 memory probably in your address space somewhere, and that's much larger number, 261 00:21:55,084 --> 00:21:59,840 in your physical machine, rather, that's a much larger number than 35 terra bytes. 262 00:22:00,220 --> 00:22:06,324 So this compression of the page table entry always being smaller than the page 263 00:22:06,324 --> 00:22:12,507 itself, allows you to always store the page table entry into the memory that you 264 00:22:12,507 --> 00:22:18,871 have because it always grows to be able to fit inside of the amount of physical 265 00:22:18,871 --> 00:22:26,219 memory you have. So how, how do we go about doing this? 266 00:22:26,219 --> 00:22:31,488 Well, One representation that we can think about 267 00:22:31,488 --> 00:22:36,794 is having a heirarchical page table. So instead of having one big table that is 268 00:22:36,794 --> 00:22:42,168 linear, we have a table we index in two, which tells us where to go look for more 269 00:22:42,168 --> 00:22:45,392 data. And then we look in that table to look for 270 00:22:45,392 --> 00:22:50,762 even more data. So here we actually show a two level page 271 00:22:50,762 --> 00:22:58,543 table. And what's interesting is if we have an 272 00:22:58,543 --> 00:23:04,060 address here, we index into here. We can actually rule out huge portions of 273 00:23:04,060 --> 00:23:09,726 the address space very quickly by just leaving this particular entry in the 274 00:23:09,726 --> 00:23:17,550 higher levels of the page table empty. And then, as we go drill down a little bit 275 00:23:17,550 --> 00:23:23,558 farther, we can also have empty entries. So what this, if you're going to look at 276 00:23:23,558 --> 00:23:28,566 our, our virtual address here, we're actually going to use different portions 277 00:23:28,566 --> 00:23:32,888 of the virtual address to index into other, other structures. 278 00:23:32,888 --> 00:23:36,661 We can build, we can take this sparsity into account. 279 00:23:36,661 --> 00:23:42,081 Another cool trick we can do is we can actually reuse, if you have hierarchical 280 00:23:42,081 --> 00:23:47,525 page table like this, you can actually reuse sort of middle level page table 281 00:23:47,525 --> 00:23:52,130 entries between different processes, if they're mapping the same space. 282 00:23:52,130 --> 00:23:57,328 So, for instance, if they're mapping the same instructions, you can actually just 283 00:23:57,328 --> 00:24:02,657 reuse all these page table entries between two processes and have the in, have 284 00:24:02,657 --> 00:24:10,974 multiple pointers coming into here. One thing that this does not change is you 285 00:24:10,974 --> 00:24:18,380 still have one page table base pointer per process or per user. 286 00:24:19,560 --> 00:24:24,872 You don't need to have bases over here because the bases of the second level come 287 00:24:24,872 --> 00:24:28,889 out of the first level. So it's a tree and here, this is just a 288 00:24:28,889 --> 00:24:33,554 two level but there are plenty of architectures that have more than two 289 00:24:33,554 --> 00:24:37,636 levels of page tables. So if you go, try to build a two level 290 00:24:37,636 --> 00:24:42,884 page table, which is relatively unsparse. On something like a 64 bit machine, it 291 00:24:42,884 --> 00:24:46,901 still becomes very large. So as you have more and more address 292 00:24:46,901 --> 00:24:49,842 space. Your address space by definition has to be 293 00:24:49,842 --> 00:24:52,989 less sparse. If you have a sort of fixed amount of 294 00:24:52,989 --> 00:24:55,780 memory in it. So, what people will do is, they'll 295 00:24:55,780 --> 00:24:59,580 actually go to maybe three levels or four levels of page tables. 296 00:25:00,680 --> 00:25:05,467 Now, some down sides of this is we just took something that was, we took a load, 297 00:25:05,467 --> 00:25:09,518 which took one memory reference. And we had to add an extra memory 298 00:25:09,518 --> 00:25:14,673 reference to go look up in our page table, which was in memory to go find the actual 299 00:25:14,673 --> 00:25:18,969 address of, the physical address. And all of a sudden, we turned it into 300 00:25:18,969 --> 00:25:22,038 three memory references. One to do the actual load. 301 00:25:22,038 --> 00:25:24,861 One to look in here, and one to look into here. 302 00:25:24,861 --> 00:25:29,526 So every level of page table that we add could potentially add another load. 303 00:25:29,710 --> 00:25:33,718 That's not, that's not very pleasant. . 304 00:25:33,718 --> 00:25:39,376 But before we get to that, and ways to fix that, let's look at how this gets laid out 305 00:25:39,376 --> 00:25:44,611 in the memory. So before, we had all of the page tables 306 00:25:44,611 --> 00:25:49,232 sort of up here, and then had to deal with, with fragmentation between the 307 00:25:49,232 --> 00:25:53,283 different page tables. But now, because we have a multilevel page 308 00:25:53,283 --> 00:25:58,093 table, we can allocate it out of our physical memory addresses, sort of in an 309 00:25:58,093 --> 00:26:02,678 ad hoc manner. And one of the, the cute little tricks 310 00:26:02,678 --> 00:26:08,823 that most operating systems do, and most systems do, is they make the size of one 311 00:26:08,823 --> 00:26:15,326 of these levels of tables, to be the exact size to fit in a page cuz then you can 312 00:26:15,326 --> 00:26:20,790 locate it, easily inside of a page. And you can allocate it like a normal user 313 00:26:20,790 --> 00:26:26,680 land page, and just sort of mix and match it somewhere into the different locations 314 00:26:26,680 --> 00:26:31,065 and, and intermix it. So it's pretty convenient. 315 00:26:31,065 --> 00:26:36,880 This solves our external fragmentation problem for our page tables themselves. 316 00:26:39,100 --> 00:26:44,613 And we can use sparsity in our address space, to use less space for our actual 317 00:26:44,613 --> 00:26:47,336 page table. So, page table itself smaller, 318 00:26:47,336 --> 00:26:52,986 Down, down side is we might have to do multiple memory references to be able to 319 00:26:52,986 --> 00:26:54,960 resolve a page table address.