Page table
From Wikipedia, the free encyclopedia
A page table is the data structure used by a virtual memory system in a computer operating system to store the mapping between virtual addresses and physical addresses. Virtual addresses are those unique to the accessing process. Physical addresses are those unique to the CPU, i.e., RAM.
Contents |
[edit] Role of memory in computer systems
A running program, be it a web browser on a PC or a Turing machine in an academic paper, is a process for transferring data between the physical world and a computer's memory, and then transforming the data in that memory. (In computer science, a program is a set of instructions, and a process is an instance of a program that is being executed.) The memory can be physically organized in many ways, e.g. a long paper tape, magnetic domains on a disk platter, or arrays of tiny capacitors on a microchip. During the 1950s, as computers became more complex, they were connected to many kinds of memory. Managing which memories were storing which bits of data became complicated, especially when multiple processes were running on the same machine at the same time.
One useful simplification was the development of virtual memory. The Kernel, a part of the operating system, keeps track of the physical location of each piece of data, and moves data between physical locations to improve performance or ensure reliability. For each "user-level" program, the operating system fabricates a single, simplified virtual memory space. Processes running inside virtual memory do not have to move data between physical devices, and do not have to allocate and reallocate portions of the fixed amount of primary memory between them. They are free to use as much of their virtual memory as necessary without reference to other processes.
[edit] Primary memory usage
A program consists of two things (simplistically):
- "text", that is, the instructions a program uses to run
- data, for "hard-wired" information that a program needs, such as string constants, for example, the strings that are in a menu in a GUI program would be stored as string constants, and for other information that can be created and destroyed as a program runs.
[edit] Stack
When a program runs, the operating system maps the program's text and data into the virtual address space, and then executes the program's instruction in memory (see von Neumann architecture). However, when the program runs, it may need to store temporary data, or more importantly, when it calls another function (a basic block of functionality), it will need to save the state of the current function. For many purposes, this data is stored on a stack, since when a function finishes executing, the temporary data stored in that function is no longer needed - so naturally, the best form of data structure to use is a stack, since we merely "pop" off this temporary data. However, this means that the stack grows dynamically over a program's lifetime. The operating system often splits up the text, data, and stack sections into separate regions, called segments, for security and reliability (so if the stack grows too large, attempting to overwrite the text or data segments, this can be more reliably detected).
[edit] Heap
Not all information is stored on the stack. Say for example we have a word processor program. At the beginning, the block of memory that may be allocated to contain the text of the document may be small, but as the user enters more text, this block needs to grow. Using the stack for this purpose doesn't work, since the stack can only accommodate fixed-size blocks of memory. What is done is that the data segment of the program is made larger (see for example, the sbrk system call), so that the block of memory for your document can fit, and that as the need for more memory arises, the segment and the block grows (see for example, realloc). So the data segment grows and shrinks dynamically also.
[edit] Size limitation
Primary memory, however, is finite. This means that there are only a finite number of memory locations we can use to store data. In many modern computer architectures, these locations are indexed in a linear fashion, that is, there is some sense of position 0 at the beginning of the block of memory, with each following position increasing linearly to the end of memory. We call a "position" an address. So, if we have 16M (megabytes) of primary memory, we are able to access addresses 0 through to 16777215. In talking in a technical fashion about memory, we often use hexadecimal, that is, base-16 numbers to describe the address, so (using the C-style notation of prefixing with 0x to denote a hexadecimal address) in the previous example, we would have addresses 0x0 through to 0xFFFFFF.
Computer address ranges are often described as "32-bit", "64-bit", and so on. This actually means that there are either 32 or 64 address lines (e.g., A0, A1 .. to A31) from the processor. Hence the maximum memory a 32-bit address system can address/access is 2 ^ 32 = 4G.
The amount of physical memory a computer can work with is limited by its address range. So that means in a 32-bit address computer the range of addressable memory locations is 0 to 4294967295 (232 − 1), giving us 4G (gigabytes) of memory to work with. This fact will become important later.
[edit] Why virtual addressing
When a program is running, it needs to make use of some form of memory in order to store information. A program may store information such as a numerical variable used in a computation, or data such as customer information in an accounting application. Many computer architectures structure memory in the following manner:
- primary memory, which is volatile (meaning that the information is lost when the computer is turned off), but relatively fast, such as RAM
- secondary memory, which is non-volatile, but relatively slow; such as a hard disk.
Virtual addressing provides a separation between the physical memory and the addresses that a program accesses to load or store data. One of the common uses of this separation is to facilitate paging to disk, which allows an operating system to make efficient use of primary memory, by moving unused instructions and data into secondary memory and then re-allocating the physical memory to another process. When the original process references the virtual address where it expects its instructions or data to be, the operating system re-allocates physical primary memory (possibly having to move some additional primary memory to secondary memory), then moves these instructions and data back to this primary memory. If the operating system gets into a situation where it is constantly moving primary memory to secondary memory (and back) in order to support the set of applications, the system will perform poorly. This is known as thrashing.
Virtual addressing provides many other benefits. One of the most important benefits of virtual addressing is that each process can write only to the physical memory for which the operating system has created page table entries. This prevents one process from overwriting another process's data. Other important benefits include providing the operating system the ability to detect references to invalid addresses (e.g. addresses with no underlying physical memory, writes on addresses that are mapped to read-only memory, etc.).
The combination of virtual addressing and two forms of backing storage (as described above) is often referred to as virtual memory, although this is somewhat of a misnomer.
[edit] Virtual addressing
Say we have a computer architecture with a 32 bit address bus. This means we are able to form addresses from 0x00000000 to 0xffffffff - spanning 4 Gi individual addresses, or 4 GiB when a single address contains 1 Byte. These addresses form what is referred to as the virtual address space. These addresses have no physical meaning - if we only have 16MB of memory, all addresses above 0x01000000 would be invalid. However, as mentioned, almost all programs do not use all 4GB of memory when a program runs, but only parts of it at a time. For example, the text, data, and stack segments may only be used and together only take 1 megabyte in total over the time where it runs.
The chunks as mentioned above are called special names. This 4GB virtual address space is split up into chunks, commonly 4KB in size, called pages. The physical memory is also split up into chunks, of the same size as pages, called frames. A program's text segment might start at the virtual address 0x00000004 - page number 0x0, and offset 0x4, but in reality, this may correspond to the physical address 0xff0e0004 - frame number 0xff0e0, and offset 0x4. What the virtual memory system does is convert virtual addresses into physical addresses, essentially, mappings between pages and frames. The page table is used for this purpose.
Many architectures also have direct hardware support for virtual memory, providing what is known as a translation lookaside buffer (TLB), which is filled with page-frame mappings initially, and instead of having the virtual memory system entirely in software, when the hardware looks up a memory address and does the page-frame translation, virtual address to page-frame mapping is cached in the TLB, which gains us a performance increase.
However, the TLB can only hold a fixed number of page-frame mappings. It is the job of the virtual memory system to extend this into software, and to hold extra page-frame mappings. The virtual memory system does so by means of a page table.
- Usage note: Some texts do not make a distinction between a "chunk" that is in physical or in virtual memory, using the term page for both, or the terms are page frame for a physical frame or a page for a virtual page. Other texts may use different terms.
[edit] Role of the page table
Say a program is running and it tries to access memory in the virtual address 0xd09fbabe. The virtual address is broken up into two: 0xd09fb is the page number and 0xabe is the offset, within the page 0xd09fb.
With hardware support for virtual memory, the address is looked up within the TLB. The TLB is specifically designed to perform this lookup in parallel, so this process is extremely fast. If there is a match for page 0xd09fb within the TLB (a TLB hit), the physical frame number is retrieved, the offset replaced, and the memory access can continue. However, if there is no match (called a TLB miss), the second port-of-call is the page table.
When the hardware is unable to find a physical frame for a virtual page, it will generate a processor interrupt called a page fault. Hardware architectures offer the chance for an interrupt handler to be installed by the operating system to deal with such page faults. The handler can look up the address mapping in the page table, and can see whether a mapping exists in the page table. If one exists, it is written back to the TLB, as the hardware accesses memory through the TLB in a virtual memory system, and the faulting instruction is restarted, with the consequence that the hardware will look in the TLB again, find the mapping, and the translation will succeed.
However, the page table lookup may not be successful for two reasons:
- there is no translation available for that address - the memory access to that virtual address is thus bad or invalid, or
- the page is not resident in physical memory (it is evicted to secondary memory).
In the first case, the memory access is invalid, and the operating system must take some action to deal with the problem. On modern operating systems, it will send a segmentation fault to the offending program. In the second case, the page is normally stored elsewhere, such as on a disk. To handle this case, the page needs to be taken from disk and put into physical memory. When physical memory is not full, this is quite simple, one simply needs to write the page into physical memory, modify the entry in the page table to say that it is present in physical memory (see the next section), write the mapping into the TLB and restart the instruction.
However, when physical memory is full, and there are no free frames available, pages in physical memory may need to be swapped with the page that needs to be written to physical memory. The page table needs to be updated to mark that the pages that were previously in physical memory are no longer so, and to mark that the page that was on disk is no longer so also (and to of course write the mapping into the TLB and restart the instruction). This process of swapping pages between physical memory and disk is known sometimes as, obviously, swapping (though the term is sometimes used to describe swapping entire processes). This process however is extremely slow in comparison to memory access via the TLB or even the page table, which lies in physical memory. Which page to swap is the subject of page replacement algorithms.
[edit] Page table data
The simplest page table systems often maintain a frame table and a page table.
The frame table, in the most basic system, holds information about which frames are mapped. In more advanced systems, the frame table can also hold information to which address space a page belongs, or statistics information, or other background information.
The page table holds the mapping between a virtual address of a page and the address of a physical frame. There is also auxiliary information about the page such as a present bit, a dirty or modified bit, address space or process ID information, amongst others. This might not be accurate information since anyone can make changes to it. Secondary storage, such as a hard disk, can be used to augment physical memory. Pages can be swapped in and out of physical memory and the disk. The present bit can indicate what pages are currently present in physical memory or are on disk, and can indicate how to treat these different pages, ie., whether to load a page from disk and swap another page in physical memory out, etc.
The dirty bit allows us a performance optimization. Say we have a page on disk that we swap in to physical memory. We can either write to this page, or we can just read from it. If we just read from it, and we need to replace this page with another, we don't need to write this page back to disk since the page hasn't changed (if we want to reload the page, we can just do so from disk again). However, if we did write to the page, we would raise the dirty flag, and this would mean we would need to write the page back so if we reload the page, we get the correct information back.
Address space or process ID information is necessary so the virtual memory management system knows what pages to associate to what process. Since the virtual memory map is the same for each process, between two processes, two identical virtual addresses could be used for different purposes, so the addresses must be somehow distinguished by identifying it with the process in the page table. This can be done by using a unique address space identifier, or by using process IDs.
As an alternative, there may be one table pages for process, and so the table pages would be a part of the process context. In such an implementation, the table page of a process can be swapped out while the process is no longer resident into memory.
[edit] Page table types
There are several different types of page tables, that are best suited for different requirements. Essentially, a bare-bones page table must store the virtual address, the physical address that is "under" this virtual address, and possibly some address space information.
[edit] Inverted page table
The inverted page table (IPT) combines a page table and a frame table into one data structure. At its core is a fixed-size table with the number of rows associating to each frame in memory. If there were 4000 frames, the inverted page table has 4000 rows. For each row there is an entry for the virtual page number (VPN), the physical page number (not the physical address), some other data and a means for creating a collision chain, as we will see later.
To search through all entries of the core IPT structure is tedious, so we use a hash table mapping virtual addresses (and address space/PID information if need be) to an index in the IPT - this is where the collision chain is used. This hash table is known as a hash anchor table. The hashing function is not generally optimized for coverage - raw speed is more desirable. Of course, hash tables experience collisions. Due to this chosen hashing function, we may experience a lot of collisions in usage, so for each entry in the table the VPN is provided to check if it is the searched entry or a collision.
In searching for a mapping, the hash anchor table is used, and if no entry exists, a page fault occurs, otherwise, the entry is found and, depending on the architecture, is placed in the TLB again and the memory reference is restarted, or the collision chain is followed until it has been exhausted and a page fault occurs.
A virtual address in this schema could be split into two, the first half being a virtual page number and the second half being the offset in that page.
[edit] Multilevel page table
The inverted page table keeps a listing of mappings installed for all frames in physical memory. However, this could be quite wasteful. Instead of doing so, we could create a page table structure that contains mappings for virtual pages. It is done by keeping several page tables that cover a certain block of virtual memory. For example, we can create smaller 1024-entry 4K pages that cover 4M of virtual memory.
This is useful since often the top-most parts and bottom-most parts of virtual memory are used in running a process - the top is often used for text and data segments whilst the bottom for stack, with free memory in between . The multilevel page table may keep a few of the smaller page tables to cover just the top and bottom parts of memory and create new ones only when strictly necessary.
Now, each of these smaller page tables are linked together by a master page table, effectively creating a tree data structure. There need not only be two levels, but possibly multiple ones.
A virtual address in this schema could be split into three, the index in the root page table, the index in the sub-page table, and the offset in that page.
[edit] Virtualized page table
It was mentioned that creating a page table structure that contained mappings for every virtual page in the virtual address space could end up being wasteful. But, we can get around the excessive space concerns by putting the page table in virtual memory, and letting the virtual memory system manage the memory for the page table.
However, part of this linear page table structure must always stay resident in physical memory, in order to prevent against circular page faults, that look for a key part of the page table that is not present in the page table, which is not present in the page table, etc.
[edit] References
- Andrew S. Tanenbaum, Modern Operating Systems, ISBN 0-13-031358-0
- A. Silberschatz, P. B. Galvin, G. Gagne, Operating System Concepts, ISBN 0-471-69466-5
- CNE Virtual Memory Tutorial, Center for the New Engineer George Mason University, Page Tables, http://cs.gmu.edu/cne/modules/vm/purple/ptable.html
- Art of Assembler, 6.6 Virtual Memory, Protection, and Paging, http://webster.cs.ucr.edu/AoA/Windows/HTML/MemoryArchitecturea3.html
- Intel 64 and IA-32 Architectures Software Developer's Manuals, http://www.intel.com/products/processor/manuals/index.htm
- AMD64 Architecture Software Developer's Manual, http://www.amd.com/us-en/Processors/DevelopWithAMD/0,,30_2252_875_7044,00.html
[edit] See also
- Virtual memory
- Paging
- Page fault
- Page replacement algorithm
- Memory management
- Pointer (computing)
- Memory management unit
- PaX
- W^X