Fragmentation (computer)
From Wikipedia, the free encyclopedia
In computer storage, fragmentation is a phenomenon in which storage space is used inefficiently, reducing storage capacity and often performance. The term is also used to denote the wasted space itself.
There are three different but related forms of fragmentation: external fragmentation, internal fragmentation, and data fragmentation. Various storage allocation schemes exhibit one or more of these weaknesses. Fragmentation is often accepted in return for an increase in speed or simplicity.
Internal fragmentation
Internal fragmentation occurs when more storage is allocated than is actually needed, leaving the excess unused. While this may seem wasteful, it is often accepted in return for increased efficiency or simplicity. The term "internal" refers to the fact that the unusable storage is inside the allocated region.
For example, in many file systems, each file always starts at the beginning of a cluster, because this simplifies organization and makes it easier to grow files. Any space left over between the last byte of the file and the first byte of the next cluster is a form of internal fragmentation called file slack or slack space[1]. Similarly, a program that requests a single byte of data is often allocated many additional bytes for metadata and alignment. This extra space is also internal fragmentation.
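The slack for a given file follows directly from the file size and the cluster size. The following C sketch illustrates the calculation; the 4096-byte cluster size and 10000-byte file size are arbitrary example values, not taken from any particular file system.

```c
#include <stdio.h>

/* Illustrative only: compute the slack (internal fragmentation) left in the
 * last cluster of a file, assuming a hypothetical 4096-byte cluster size. */
int main(void) {
    const unsigned long cluster_size = 4096;   /* assumed cluster size in bytes */
    const unsigned long file_size    = 10000;  /* example file size in bytes */

    unsigned long remainder = file_size % cluster_size;
    unsigned long slack = (remainder == 0) ? 0 : cluster_size - remainder;
    unsigned long clusters = (file_size + cluster_size - 1) / cluster_size;

    printf("file of %lu bytes occupies %lu clusters; slack = %lu bytes\n",
           file_size, clusters, slack);
    return 0;
}
```

With these example values the file occupies three clusters (12288 bytes) and leaves 2288 bytes of slack.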
Another common example: English text is often stored with one character in each 8-bit byte even though in standard ASCII encoding the most significant bit of each byte is always zero. The unused bits are a form of internal fragmentation.
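As a rough illustration of this bit-level waste, the following C sketch counts the stored versus strictly needed bits for a short ASCII string; the example string is arbitrary.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative only: standard ASCII needs 7 bits per character, so storing
 * text as one character per 8-bit byte wastes 1 bit per character. */
int main(void) {
    const char *text = "fragmentation";   /* arbitrary example string */
    size_t n = strlen(text);

    printf("%zu characters: %zu bits stored, %zu bits needed, %zu bits unused\n",
           n, n * 8, n * 7, n * 1);
    return 0;
}
```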
Similar problems with leaving reserved resources unused appear in many other areas. For example, IP addresses can only be reserved in blocks of certain sizes, resulting in many addresses that are reserved but not actively used. This contributes to the IPv4 address shortage.
Unlike other types of fragmentation, internal fragmentation is difficult to reclaim; usually the best way to remove it is with a design change. For example, in dynamic memory allocation, memory pools drastically cut internal fragmentation by spreading the space overhead over a larger number of objects.
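A minimal sketch of one such design is shown below: a fixed-size object pool in C that pays the allocation bookkeeping once for a whole slab of objects instead of once per object. The structure and function names are illustrative and do not correspond to any particular library.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Minimal sketch of a fixed-size memory pool: one malloc() provides space for
 * many same-sized objects, so per-object header and alignment overhead is paid
 * once for the whole pool rather than once per object. */
struct pool {
    unsigned char *storage;   /* one contiguous slab */
    void          *free_list; /* singly linked list threaded through free slots */
    size_t         slot_size;
};

static int pool_init(struct pool *p, size_t slot_size, size_t nslots) {
    if (slot_size < sizeof(void *))
        slot_size = sizeof(void *);          /* a slot must hold a free-list link */
    p->storage = malloc(slot_size * nslots);
    if (!p->storage)
        return -1;
    p->slot_size = slot_size;
    p->free_list = NULL;
    for (size_t i = 0; i < nslots; i++) {    /* thread every slot onto the free list */
        void *slot = p->storage + i * slot_size;
        *(void **)slot = p->free_list;
        p->free_list = slot;
    }
    return 0;
}

static void *pool_alloc(struct pool *p) {
    void *slot = p->free_list;               /* pop the first free slot, if any */
    if (slot)
        p->free_list = *(void **)slot;
    return slot;
}

static void pool_free(struct pool *p, void *slot) {
    *(void **)slot = p->free_list;           /* push the slot back onto the list */
    p->free_list = slot;
}

int main(void) {
    struct pool p;
    if (pool_init(&p, sizeof(double), 1024) != 0)
        return 1;
    double *x = pool_alloc(&p);
    *x = 3.14;
    printf("allocated slot holds %f\n", *x);
    pool_free(&p, x);
    free(p.storage);
    return 0;
}
```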
External fragmentation
External fragmentation is the phenomenon in which free storage becomes divided into many small pieces over time. It is a weakness of certain storage allocation algorithms, occurring when an application allocates and deallocates ("frees") regions of storage of varying sizes, and the allocation algorithm responds by leaving the allocated and deallocated regions interspersed. The result is that, although free storage is available, it is effectively unusable because it is divided into pieces that are too small to satisfy the demands of the application. The term "external" refers to the fact that the unusable storage is outside the allocated regions.
For example, in dynamic memory allocation, a block of 1000 bytes might be requested, but the largest contiguous block of free space has only 300 bytes. Even if there are ten blocks of 300 bytes of free space, separated by allocated regions, one still cannot allocate the requested block of 1000 bytes, and the allocation request will fail.
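The following C sketch models exactly this situation: ten free regions of 300 bytes each give 3000 bytes of total free space, yet a single 1000-byte request cannot be satisfied because no hole is large enough. The numbers mirror the example above and are otherwise arbitrary.

```c
#include <stdio.h>

/* Illustrative sketch: scattered free regions versus one large request. */
int main(void) {
    const size_t free_regions[] = {300, 300, 300, 300, 300,
                                   300, 300, 300, 300, 300};
    const size_t nregions = sizeof free_regions / sizeof free_regions[0];
    const size_t request = 1000;

    size_t total_free = 0, largest = 0;
    for (size_t i = 0; i < nregions; i++) {
        total_free += free_regions[i];
        if (free_regions[i] > largest)
            largest = free_regions[i];       /* track the largest contiguous hole */
    }

    printf("total free: %zu bytes, largest contiguous hole: %zu bytes\n",
           total_free, largest);
    printf("request for %zu bytes %s\n", request,
           request <= largest ? "succeeds"
                              : "fails despite enough total free space");
    return 0;
}
```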
External fragmentation also occurs in file systems as many files of different sizes are created, change size, and are deleted. The effect is even worse if a file that is divided into many small pieces is deleted, because this leaves similarly small regions of free space.
Data fragmentation
Data fragmentation occurs when a piece of data in memory is broken up into many pieces that are not close together. It is typically the result of attempting to insert a large object into storage that has already suffered external fragmentation.
For example, files in a file system are usually managed in units called blocks or clusters. When a file system is created, there is free space to store file blocks together contiguously, allowing for rapid sequential reads and writes. However, as files are added, removed, and changed in size, the free space becomes externally fragmented, leaving only small holes in which to place new data. When a new file is written, or when an existing file is extended, the new data blocks are necessarily scattered, slowing access due to seek time and rotational latency and incurring overhead to manage the extra locations. This is called file system fragmentation.
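A toy sketch of this effect in C is shown below. The block layout is invented purely for illustration: blocks marked as used belong to existing files, and a new four-block file must take whatever free blocks remain, so its blocks end up scattered.

```c
#include <stdio.h>

/* Toy sketch of block allocation on an already-fragmented volume. */
int main(void) {
    char used[16] = {1,1,0,1, 1,0,1,1, 0,1,1,1, 0,1,1,1};  /* 1 = block in use */
    const int blocks_needed = 4;
    int placed = 0;

    printf("new file placed in blocks:");
    for (int i = 0; i < 16 && placed < blocks_needed; i++) {
        if (!used[i]) {          /* grab the next free block, wherever it is */
            used[i] = 1;
            printf(" %d", i);
            placed++;
        }
    }
    printf("\n");                /* prints blocks 2, 5, 8 and 12: reading the
                                    file back requires seeking between them */
    return 0;
}
```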
As another example, if the nodes of a linked list are allocated consecutively in memory, this improves locality of reference and enhances data cache performance during traversal of the list. If the memory pool's free space is fragmented, new nodes will be spread throughout memory, increasing the number of cache misses.
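The contrast can be sketched in C as follows: nodes carved out of one contiguous array are guaranteed to sit next to each other in memory, while nodes obtained from separate malloc() calls carry no such guarantee, especially on a fragmented heap. The program is illustrative only; actual addresses depend on the allocator.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

int main(void) {
    enum { N = 4 };

    /* contiguous: all nodes come from a single array */
    struct node arena[N];
    for (int i = 0; i < N; i++) {
        arena[i].value = i;
        arena[i].next = (i + 1 < N) ? &arena[i + 1] : NULL;
    }

    /* scattered: each node is a separate heap allocation */
    struct node *head = NULL;
    for (int i = N - 1; i >= 0; i--) {
        struct node *n = malloc(sizeof *n);
        if (!n)
            return 1;
        n->value = i;
        n->next = head;
        head = n;
    }

    printf("array-backed node addresses:");
    for (struct node *p = arena; p; p = p->next)
        printf(" %p", (void *)p);
    printf("\nheap-backed node addresses: ");
    for (struct node *p = head; p; p = p->next)
        printf(" %p", (void *)p);
    printf("\n");

    while (head) {               /* release the heap-allocated nodes */
        struct node *next = head->next;
        free(head);
        head = next;
    }
    return 0;
}
```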
Just as compaction can eliminate external fragmentation, data fragmentation can be eliminated by rearranging data storage so that related pieces are close together. For example, the primary job of a defragmentation tool is to rearrange blocks on disk so that the blocks of each file are contiguous. Most defragmenting utilities also attempt to reduce or eliminate free space fragmentation. Some moving garbage collectors will also move related objects close together (this is called compacting) to improve cache performance.
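The core of compaction can be sketched in a few lines of C. In this toy model, live bytes (letters) scattered across a small "heap" are slid toward the front so that all free bytes end up as one contiguous region; a real compactor must also update every pointer to the objects it moves, which the sketch omits.

```c
#include <stdio.h>
#include <string.h>

/* Toy illustration of compaction on a 16-byte "heap". */
int main(void) {
    char heap[] = "AA...BBB..CC....";   /* invented layout: 3 live objects, '.' = free */
    size_t len = strlen(heap);

    size_t dst = 0;
    for (size_t src = 0; src < len; src++) {
        if (heap[src] != '.')
            heap[dst++] = heap[src];    /* slide each live byte toward the front */
    }
    memset(heap + dst, '.', len - dst); /* coalesce free space at the end */

    printf("after compaction: %s\n", heap);  /* live objects first, then one free region */
    return 0;
}
```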