Fragmentation (computer)
Fragmentation is a phenomenon that leads to inefficiency in many forms of computer storage. There are three different but related uses of the term: external fragmentation, internal fragmentation, and data fragmentation. Various storage allocation schemes exhibit one or more of these weaknesses, which have the effect of reducing storage capacity.
Internal fragmentation
Internal fragmentation occurs when more storage is allocated than is actually needed, so the excess is never used. This space is wasted. While this may seem wasteful by design, it is often accepted in return for increased efficiency or simplicity. The term "internal" refers to the fact that the unusable storage is inside the allocated regions.
For example, in many file systems, each file always starts at the beginning of a sector, because this simplifies organization and makes it easier to grow files. Any space left over between the last byte of the file and the first byte of the next sector is internal fragmentation. Similarly, a program that requests a single byte of data is often given many additional bytes for metadata and alignment. This extra space is also internal fragmentation.
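As a rough illustration, the following C sketch computes the space lost when a file is stored in whole fixed-size blocks; the 4096-byte block size and 10,000-byte file size are assumptions chosen for the example, not properties of any particular file system.

```c
#include <stdio.h>

/* Minimal sketch: how much space is lost to internal fragmentation when a
 * file of a given size must occupy a whole number of fixed-size blocks.
 * Block and file sizes are assumed values for illustration only. */
int main(void)
{
    const unsigned long block_size = 4096;   /* assumed block size (bytes) */
    const unsigned long file_size  = 10000;  /* assumed file size (bytes)  */

    /* Round the file size up to a whole number of blocks. */
    unsigned long blocks    = (file_size + block_size - 1) / block_size;
    unsigned long allocated = blocks * block_size;
    unsigned long wasted    = allocated - file_size;

    printf("file: %lu bytes, allocated: %lu bytes, internal fragmentation: %lu bytes\n",
           file_size, allocated, wasted);
    return 0;
}
```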
Another common example: English text is often stored with one character in each 8-bit byte even though in standard ASCII encoding the most significant bit of each byte is always zero. The "wasted" bits are internal fragmentation.
Similar problems with leaving reserved resources unused appear in many other areas. For example, IP addresses can only be reserved in blocks of certain sizes, so many addresses are reserved but never actively used, which contributes to the IPv4 address shortage.
Unlike other types of fragmentation, internal fragmentation is difficult to reclaim; usually the best way to remove it is with a design change. For example, in dynamic memory allocation, memory pools drastically cut internal fragmentation by spreading the space overhead over a larger number of objects.
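The following C sketch shows the idea of a fixed-size memory pool in simplified form: one statically sized arena is divided into equal slots and freed slots are recycled through a free list, so the per-allocation metadata and alignment overhead of a general-purpose allocator is paid once for the whole pool rather than once per object. The slot size and count are arbitrary assumptions for illustration.

```c
#include <stdio.h>

/* A minimal sketch of a fixed-size memory pool (assumed slot size and count,
 * chosen only for illustration). All objects come from one arena and freed
 * slots are recycled through a free list. */
#define SLOT_SIZE  64
#define SLOT_COUNT 128

union slot {
    union slot *next;               /* used while the slot is on the free list */
    unsigned char data[SLOT_SIZE];  /* used while the slot is allocated        */
};

static union slot pool[SLOT_COUNT];
static union slot *free_list = NULL;
static int next_unused = 0;

static void *pool_alloc(void)
{
    if (free_list) {                /* reuse a previously freed slot */
        union slot *s = free_list;
        free_list = s->next;
        return s;
    }
    if (next_unused < SLOT_COUNT)   /* hand out a fresh slot */
        return &pool[next_unused++];
    return NULL;                    /* pool exhausted */
}

static void pool_free(void *p)
{
    union slot *s = p;
    s->next = free_list;            /* push the slot back onto the free list */
    free_list = s;
}

int main(void)
{
    void *a = pool_alloc();
    void *b = pool_alloc();
    pool_free(a);
    void *c = pool_alloc();         /* c reuses the slot that held a */
    printf("a=%p b=%p c=%p (c == a: %d)\n", a, b, c, c == a);
    return 0;
}
```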
External fragmentation
External fragmentation is the phenomenon in which free storage becomes divided into many small pieces over time. It is a weakness of certain storage allocation algorithms, occurring when an application allocates and deallocates ("frees") regions of storage of varying sizes, and the allocation algorithm responds by leaving the allocated and deallocated regions interspersed. The result is that, although free storage is available, it is effectively unusable because it is divided into pieces that are too small to satisfy the demands of the application. The term "external" refers to the fact that the unusable storage is outside the allocated regions.
For example, in dynamic memory allocation, a block of 1000 bytes might be requested, but the largest contiguous block of free space, or memory hole, has only 300 bytes. Even if there are ten blocks of 300 bytes of free space, separated by allocated regions, one still cannot allocate the requested block of 1000 bytes, and the allocation request will fail.
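The following C sketch restates this example: the ten 300-byte holes and the 1000-byte request are taken from the text, and a simple first-fit scan (an illustrative policy, not any particular allocator's algorithm) shows that the request fails even though 3000 bytes of memory are free in total.

```c
#include <stdio.h>

/* Minimal sketch of the example above: free space is split into several
 * 300-byte holes, so a 1000-byte request fails even though the total amount
 * of free memory is far larger. */
int main(void)
{
    const int holes[] = {300, 300, 300, 300, 300,
                         300, 300, 300, 300, 300};  /* free holes, in bytes */
    const int n = sizeof(holes) / sizeof(holes[0]);
    const int request = 1000;                       /* bytes to allocate    */

    int total_free = 0, found = -1;
    for (int i = 0; i < n; i++) {
        total_free += holes[i];
        if (found < 0 && holes[i] >= request)       /* first hole that fits */
            found = i;
    }

    printf("total free: %d bytes, request: %d bytes -> %s\n",
           total_free, request,
           found >= 0 ? "allocation succeeds" : "allocation fails");
    return 0;
}
```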
External fragmentation also occurs in file systems as many files of different sizes are created, change size, and are deleted. The effect is even worse if a file that is divided into many small pieces is deleted, because this leaves similarly small regions of free space.
Data fragmentation
Data fragmentation occurs when a collection of data in memory is broken up into many pieces that are not stored close together. It is typically the result of attempting to insert a large object into storage that has already suffered external fragmentation.
For example, files in a file system are often broken up into pieces called blocks. When a file system is newly created, there is space to store the blocks of a file all together in one place. This allows for rapid sequential file reads and writes. However, as files are added, removed, and changed in size, the free space becomes externally fragmented, leaving only small holes in which to place new data. When a new file is written, or when an existing file is extended, the new blocks are scattered across the disk, slowing access due to the seek time of the read/write head and the rotational latency of the disk. This is called file system fragmentation.
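A simplified illustration in C: a new 4-block file is placed into whatever free blocks remain on a partially full disk, described here by an assumed block bitmap, and its blocks end up scattered rather than contiguous.

```c
#include <stdio.h>

/* Minimal sketch: place the blocks of a new 4-block file into the remaining
 * free blocks of a small simulated disk. The bitmap (1 = in use) is an
 * assumed example layout, not real file system data. */
#define DISK_BLOCKS 16

int main(void)
{
    int used[DISK_BLOCKS] = {1,1,0,1, 1,0,0,1, 1,1,0,1, 1,1,0,1};
    int file_blocks = 4;                    /* blocks needed by the new file */

    printf("new file placed in blocks:");
    for (int i = 0; i < DISK_BLOCKS && file_blocks > 0; i++) {
        if (!used[i]) {                     /* take the next free block */
            used[i] = 1;
            file_blocks--;
            printf(" %d", i);
        }
    }
    printf("\n");                           /* here: blocks 2, 5, 6, 10 - not contiguous */
    return 0;
}
```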
As another example, if the nodes of a linked list are allocated consecutively in memory, this improves locality of reference and enhances data cache performance during traversal of the list. If the memory pool's free space has become fragmented, however, the linked list nodes will be spread throughout memory, increasing the number of cache misses.
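The following C sketch shows one way to obtain the contiguous layout described above: the list nodes are carved out of a single arena allocation, so consecutive nodes sit next to each other in memory during traversal. Allocating each node with a separate malloc() from a fragmented heap gives no such guarantee. The arena size is an assumption for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Minimal sketch: build a linked list whose nodes all come from one
 * contiguous arena, which keeps neighbouring nodes adjacent in memory and
 * helps the data cache during traversal. */
struct node {
    int value;
    struct node *next;
};

#define ARENA_NODES 1024    /* assumed list length */

int main(void)
{
    /* One allocation holds every node, so node i and node i+1 are adjacent. */
    struct node *arena = malloc(ARENA_NODES * sizeof *arena);
    if (!arena)
        return 1;

    struct node *head = NULL;
    for (int i = ARENA_NODES - 1; i >= 0; i--) {   /* link nodes in order */
        arena[i].value = i;
        arena[i].next = head;
        head = &arena[i];
    }

    long sum = 0;
    for (struct node *p = head; p; p = p->next)    /* cache-friendly traversal */
        sum += p->value;

    printf("sum = %ld\n", sum);
    free(arena);
    return 0;
}
```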
Just as compaction can eliminate external fragmentation, data fragmentation can be eliminated by rearranging pieces of data so that related pieces are close together. For example, the primary job of a defragmentation tool is to rearrange blocks on disk so that the blocks of each file are contiguous and in order. Some moving garbage collectors will also move related objects close together to improve cache performance (this is called compacting).
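As a simplified illustration of compaction, the C sketch below slides every live block in a small simulated memory to the front, leaving one contiguous region of free space. The layout is invented for the example; a real defragmenter or moving collector must also update every reference to the data it moves.

```c
#include <stdio.h>
#include <string.h>

/* Minimal sketch of compaction: copy every live block to the front of a
 * small simulated memory (a character buffer in which '.' marks free space)
 * so that all free space becomes one contiguous region. */
#define MEM_SIZE 16

int main(void)
{
    char mem[MEM_SIZE + 1] = "A..BB...CCC..DD.";   /* fragmented layout */
    char out[MEM_SIZE + 1];

    int w = 0;
    for (int i = 0; i < MEM_SIZE; i++)             /* copy live data forward */
        if (mem[i] != '.')
            out[w++] = mem[i];
    memset(out + w, '.', MEM_SIZE - w);            /* one contiguous free run */
    out[MEM_SIZE] = '\0';

    printf("before: %s\nafter:  %s\n", mem, out);
    return 0;
}
```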