Nested RAID levels
To gain performance and/or additional redundancy, the standard RAID levels can be combined to create hybrid or nested RAID levels.
Nesting
When nesting RAID levels, a RAID type that provides redundancy is typically combined with RAID 0 to boost performance. With these configurations it is preferable to have RAID 0 on top and the redundant array at the bottom, because fewer disks then need to be regenerated when a disk fails. (Thus, RAID 10 is preferable to RAID 0+1, although the administrative advantages of "splitting the mirror" of RAID 1 would be lost.)
RAID 0+1
A RAID 0+1 (also called RAID 01, not to be confused with RAID 1) is a RAID level used for both replicating and sharing data among disks. The difference between RAID 0+1 and RAID 1+0 is the location of each RAID system: RAID 0+1 is a mirror of stripes. Consider an example of RAID 0+1: six 120 GB drives need to be set up as a RAID 0+1. Below is an example where two 360 GB RAID 0 arrays are mirrored, creating 360 GB of total storage space:
                     RAID 1
       .---------------------------------.
       |                                 |
     RAID 0                            RAID 0
  .-----------------.             .-----------------.
  |        |        |             |        |        |
 120 GB  120 GB  120 GB          120 GB  120 GB  120 GB
   A1      A2      A3              A1      A2      A3
   A4      A5      A6              A4      A5      A6
   A7      A8      A9              A7      A8      A9
   A10     A11     A12             A10     A11     A12

Note: A1, A2, et cetera each represent one data block; each column represents one disk.
The maximum usable storage space here is 360 GB, spread across two arrays. The advantage is that when a hard drive fails in one of the RAID 0 arrays, the missing data can be transferred from the other array. However, adding an extra hard drive to one stripe requires adding an additional hard drive to the other stripe to balance out storage among the arrays.
It is not as robust as RAID 10 and cannot tolerate two simultaneous disk failures, unless the second failed disk is from the same stripe as the first. That is, once a single disk fails, each of the drives in the other stripe becomes a single point of failure. Also, once the failed drive is replaced, all the disks in the array must participate in the rebuild in order to restore its data.
The exception to this is if all the disks are connected to the same RAID controller, in which case the controller can perform the same error recovery as RAID 10, since it can still access the functional disks in each RAID 0 set. Comparing the diagrams for RAID 0+1 and RAID 10 while ignoring the lines above the disks shows that the only difference is that the disks are swapped around; if the controller has a direct link to each disk, it can recover in the same way. In this one case there is no difference between RAID 0+1 and RAID 10.
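The difference in failure tolerance can be made concrete with a small enumeration. The sketch below is illustrative only: it assumes the six-drive layouts shown in these diagrams, a hypothetical disk numbering of 0 to 5, and a controller that treats RAID 0+1 and RAID 10 as distinct arrays (i.e. not the smart-controller case just described).

 from itertools import combinations

 DISKS = range(6)  # six drives, numbered 0-5 for illustration

 # RAID 0+1: two RAID 0 stripes (disks 0-2 and 3-5) mirrored against each other.
 # Data survives only if at least one whole stripe is still intact.
 def raid01_survives(failed):
     stripes = [{0, 1, 2}, {3, 4, 5}]
     return any(stripe.isdisjoint(failed) for stripe in stripes)

 # RAID 10: three RAID 1 mirror pairs (0|3, 1|4, 2|5) striped together.
 # Data survives as long as no mirror pair loses both of its disks.
 def raid10_survives(failed):
     pairs = [{0, 3}, {1, 4}, {2, 5}]
     return all(not pair.issubset(failed) for pair in pairs)

 two_disk_failures = [set(p) for p in combinations(DISKS, 2)]
 print(sum(raid01_survives(f) for f in two_disk_failures), "of", len(two_disk_failures))  # 6 of 15
 print(sum(raid10_survives(f) for f in two_disk_failures), "of", len(two_disk_failures))  # 12 of 15

Of the 15 possible two-disk failures, RAID 0+1 survives only the 6 in which both failed disks fall in the same stripe, whereas RAID 10 survives the 12 in which the failed disks are not mirror partners.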
With disk drives of ever-increasing capacity (driven by serial ATA drives), the risk of drive failure is increasing. Additionally, bit error correction technologies have not kept up with rapidly rising drive capacities, resulting in a higher risk of encountering media errors. If a failed drive is not replaced in a RAID 0+1 configuration, a single uncorrectable media error occurring on the mirrored hard drive would result in data loss.
Given these increasing risks with RAID 0+1, many business and mission critical enterprise environments are beginning to evaluate more fault tolerant RAID setups that add underlying disk parity. Among the most promising are hybrid approaches such as RAID 51 (mirroring above single parity) or RAID 61 (mirroring above dual parity).
RAID 10
A RAID 10, sometimes called RAID 1+0 or RAID 1&0, is similar to a RAID 0+1 with the exception that the RAID levels used are reversed: RAID 10 is a stripe of mirrors. Below is an example where three collections of 120 GB RAID 1 arrays are striped together to make 360 GB of total storage space:
                            RAID 0
       .----------------------------------------------.
       |                      |                       |
     RAID 1                 RAID 1                  RAID 1
  .--------.             .--------.              .--------.
  |        |             |        |              |        |
 120 GB  120 GB         120 GB  120 GB          120 GB  120 GB
   A1      A1             A2      A2              A3      A3
   A4      A4             A5      A5              A6      A6
   A7      A7             A8      A8              A9      A9
   A10     A10            A11     A11             A12     A12
Note: A1, A2, et cetera each represent one data block; each column represents one disk.
All but one drive from each RAID 1 set could fail without damaging the data. However, if the failed drive is not replaced, the single working hard drive in the set then becomes a single point of failure for the entire array. If that single hard drive then fails, all data stored in the entire array is lost. As is the case with RAID 0+1, if a failed drive is not replaced in a RAID 10 configuration then a single uncorrectable media error occurring on the mirrored hard drive would result in data loss. Some RAID 10 vendors address this problem by supporting a "hot spare" drive, which automatically replaces and rebuilds a failed drive in the array.
Given these increasing risks with RAID 10, many business and mission critical enterprise environments are beginning to evaluate more fault tolerant RAID setups that add underlying disk parity. Among the most promising are hybrid approaches such as RAID 51 (mirroring above single parity) or RAID 61 (mirroring above dual parity).
RAID 10 is often the primary choice for high-load databases, because the lack of parity to calculate gives it faster write speeds.
RAID 10 Capacity: (Size of Smallest Drive) * (Number of Drives) / 2
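As a quick sanity check of that formula, a minimal sketch in Python (the function name is hypothetical):

 def raid10_capacity_gb(drive_sizes_gb):
     """Usable RAID 10 capacity: (size of smallest drive) * (number of drives) / 2."""
     return min(drive_sizes_gb) * len(drive_sizes_gb) / 2

 print(raid10_capacity_gb([120] * 6))  # six 120 GB drives, as in the example above: 360.0 GB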
The Linux kernel RAID 10 implementation (from version 2.6.9 onwards) is not nested: the mirroring and striping are done in one layer. Only certain layouts are standard RAID 10, with the rest being non-standard. See the Linux MD RAID 10 section in the Non-standard RAID article for details.
RAID 0+3 and 3+0
RAID 0+3
RAID level 0+3, or RAID level 03, is a dedicated-parity array across striped disks: each block of data at the RAID 3 level is broken up among RAID 0 arrays, where the smaller pieces are striped across the disks.
RAID 30
RAID level 30 is also known as striping of dedicated parity arrays. It is a combination of RAID level 3 and RAID level 0. RAID 30 provides high data transfer rates combined with high data reliability. RAID 30 is best implemented on two RAID 3 disk arrays with data striped across both disk arrays. RAID 30 breaks up data into smaller blocks and then stripes the blocks of data across the RAID 3 sets. RAID 3, in turn, breaks up data into smaller blocks, calculates parity by performing an exclusive OR (XOR) on the blocks, and then writes the blocks to all but one drive in the array. The parity block created using the exclusive OR is then written to the last drive in each RAID 3 array. The size of each block is determined by the stripe size parameter, which is set when the RAID is created.
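The parity calculation can be sketched in a few lines of Python. This is an illustration only, with a hypothetical helper and tiny 4-byte blocks standing in for full stripes: it XORs the data blocks of one stripe to produce the parity block for the dedicated parity drive, and then recovers a missing block by XORing the parity with the surviving blocks.

 from functools import reduce

 def xor_blocks(blocks):
     """XOR equal-sized byte blocks together, as a RAID 3 parity calculation does."""
     return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

 data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]  # three data drives
 parity = xor_blocks(data)  # written to the dedicated parity drive

 # If one data block is lost, XOR of the parity and the remaining blocks recovers it.
 recovered = xor_blocks([parity, data[0], data[2]])
 assert recovered == data[1]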
One drive from each of the underlying RAID 3 sets can fail. Until the failed drives are replaced, the other drives in the sets that suffered a failure are a single point of failure for the entire RAID 30 array: if one of those drives fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and rebuilding onto the newly inserted drive) represents a period of vulnerability for the RAID set.
      /------/------/------/------  RAID CONTROLLER  ------\------\------\------\
      |      |      |      |                               |      |      |      |
    disk1  disk2  disk3  disk4                           disk5  disk6  disk7  disk8
      A1     A2     A3     P                               A4     A5     A6     P1
      A7     A8     A9     P                               A10    A11    A12    P1
      A13    A14    A15    P                               A16    A17    A18    P1
    -----------RAID 3----------                          -----------RAID 3----------
    ----------------------------------- RAID 0 ------------------------------------
RAID 100 (RAID 10+0)
A RAID 100, sometimes also called RAID 10+0, is a stripe of RAID 10s. RAID 100 is an example of plaid RAID, a RAID in which striped RAIDs are themselves striped together. Below is an example in which four 120 GB RAID 1 arrays are grouped into two RAID 0 sets, which are then striped together to make 480 GB of total storage space:
                                     RAID 0
            .---------------------------------------------------.
            |                                                   |
          RAID 0                                              RAID 0
    .-----------------.                                 .-----------------.
    |                 |                                 |                 |
  RAID 1            RAID 1                            RAID 1            RAID 1
 .--------.        .--------.                        .--------.        .--------.
 |        |        |        |                        |        |        |        |
120 GB  120 GB   120 GB  120 GB                     120 GB  120 GB   120 GB  120 GB
  A1      A1       A2      A2                         A3      A3       A4      A4
  A5      A5       A6      A6                         A7      A7       A8      A8
  B1      B1       B2      B2                         B3      B3       B4      B4
  B5      B5       B6      B6                         B7      B7       B8      B8
Note: A1, B1, et cetera each represent one data sector; each column represents one disk.
All but one drive from each RAID 1 set could fail without loss of data. However, the remaining disk of a degraded RAID 1 set becomes a single point of failure for the already degraded array. Often the top-level stripe is done in software. Some vendors call the top-level stripe a MetaLUN (from logical unit number, LUN) or a soft stripe.
The major benefits of RAID 100 (and plaid RAID in general) over single-level RAID are better random read performance and the mitigation of hotspot risk in the array. For these reasons, RAID 100 is often the best choice for very large databases, where the underlying array software limits the number of physical disks allowed in each standard array. Implementing nested RAID levels allows virtually limitless spindle counts in a single logical volume.
RAID 50 (RAID 5+0)
A RAID 50 combines the straight block-level striping of RAID 0 with the distributed parity of RAID 5. This is a RAID 0 array striped across RAID 5 elements.
Below is an example where three collections of 240 GB RAID 5s are striped together to make 720 GB of total storage space:
                                       RAID 0
      .------------------------------------------------------------------.
      |                                |                                 |
    RAID 5                           RAID 5                            RAID 5
 .-----------------.            .-----------------.             .-----------------.
 |        |        |            |        |        |             |        |        |
120 GB  120 GB  120 GB         120 GB  120 GB  120 GB          120 GB  120 GB  120 GB
  A1      A2      Ap             A3      A4      Ap               A5      A6      Ap
  B1      Bp      B2             B3      Bp      B4               B5      Bp      B6
  Cp      C1      C2             Cp      C3      C4               Cp      C5      C6
  D1      D2      Dp             D3      D4      Dp               D5      D6      Dp
Note: A1, B1, et cetera each represent one data block; each column represents one disk; Ap, Bp, et cetera each represent parity information for each distinct RAID 5 and may represent different values across the RAID 5 (that is, Ap for A1 and A2 can differ from Ap for A3 and A4).
One drive from each of the RAID 5 sets could fail without loss of data. However, if the failed drive is not replaced, the remaining drives in that set then become a single point of failure for the entire array. If one of those drives fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and the rebuild process to the newly inserted drive) represents a period of vulnerability to the RAID set.
In the example below, datasets may be striped across both RAID sets. A dataset with five blocks would have the first three blocks written to RAID set 1 and the next two blocks written to RAID set 2.
           RAID Set 1                       RAID Set 2
   A1     A2     A3     Ap          A4     A5     A6     Ap
   B1     B2     Bp     B3          B4     B5     Bp     B6
   C1     Cp     C2     C3          C4     Cp     C5     C6
   Dp     D1     D2     D3          Dp     D4     D5     D6
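The block placement described above can be sketched as follows (a hypothetical helper; it assumes the two four-drive RAID 5 sets shown, so the top-level stripe places three data blocks on one set before moving to the next):

 def raid50_set_for_block(block_index, data_blocks_per_set=3, num_sets=2):
     """Return which underlying RAID 5 set (1-based) a logical data block lands on."""
     return (block_index // data_blocks_per_set) % num_sets + 1

 # A five-block dataset: blocks 0-2 go to RAID set 1, blocks 3-4 to RAID set 2.
 print([raid50_set_for_block(i) for i in range(5)])  # [1, 1, 1, 2, 2]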
The configuration of the RAID sets will impact the overall fault tolerance. A construction of three seven-drive RAID 5 sets has higher capacity and storage efficiency, but can only tolerate a maximum of three drive failures (at most one per set). Because the reliability of the system depends on quick replacement of the bad drive so the array can rebuild, it is common to construct three six-drive RAID 5 sets, each with a hot spare that can immediately start rebuilding the array on failure. This does not address the issue that the array is put under maximum strain, reading every bit to rebuild the array, precisely at the time when it is most vulnerable. A construction of seven three-drive RAID 5 sets can handle as many as seven drive failures (one per set) but has lower capacity and storage efficiency.
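The trade-off can be quantified with a small calculation (a hypothetical helper; it assumes equal-sized 120 GB drives, one drive's worth of parity per RAID 5 set, and at most one tolerable failure per set):

 def raid50_summary(num_sets, drives_per_set, drive_gb=120):
     usable_gb = num_sets * (drives_per_set - 1) * drive_gb   # one parity drive per set
     efficiency = usable_gb / (num_sets * drives_per_set * drive_gb)
     max_failures_tolerated = num_sets                        # at most one per RAID 5 set
     return usable_gb, round(efficiency, 3), max_failures_tolerated

 # Twenty-one 120 GB drives arranged two ways:
 print(raid50_summary(3, 7))   # (2160, 0.857, 3) - higher capacity, fewer tolerable failures
 print(raid50_summary(7, 3))   # (1680, 0.667, 7) - lower capacity, more tolerable failures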
RAID 50 improves upon the performance of RAID 5 particularly during writes, and provides better fault tolerance than a single RAID level does. This level is recommended for applications that require high fault tolerance, capacity and random positioning performance.
As the number of drives in a RAID set increases, and as drive capacities grow, fault-recovery time increases correspondingly, because rebuilding the RAID set takes longer.
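As a rough illustration of how the rebuild window scales (the 50 MB/s rebuild rate is an arbitrary assumption, not a figure from any vendor):

 def rebuild_hours(drive_capacity_gb, rebuild_mb_per_s=50):
     """Rough lower bound on rebuild time: the whole replacement drive must be rewritten."""
     return drive_capacity_gb * 1000 / rebuild_mb_per_s / 3600

 print(round(rebuild_hours(120), 1))    # 120 GB drive:  ~0.7 hours
 print(round(rebuild_hours(2000), 1))   # 2 TB drive:   ~11.1 hours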
RAID 60 (RAID 6+0)
A RAID 60 combines the straight block-level striping of RAID 0 with the distributed double parity of RAID 6. That is, it is a RAID 0 array striped across RAID 6 elements. It requires at least eight disks.
Below is an example where two collections of 240 GB RAID 6s are striped together to make 480 GB of total storage space:
                                  RAID 0
          .---------------------------------------------------.
          |                                                   |
        RAID 6                                              RAID 6
 .--------------------------.                    .--------------------------.
 |        |        |        |                    |        |        |        |
120 GB  120 GB  120 GB  120 GB                  120 GB  120 GB  120 GB  120 GB
  A1      A2      Aq      Ap                      A3      A4      Aq      Ap
  B1      Bq      Bp      B2                      B3      Bq      Bp      B4
  Cq      Cp      C1      C2                      Cq      Cp      C3      C4
  Dp      D1      D2      Dq                      Dp      D3      D4      Dq
As it is based on RAID 6, two disks from each of the RAID 6 sets can fail without loss of data. In addition, a further disk failure while a single disk is rebuilding in one RAID 6 set will not lead to data loss. RAID 60 therefore has improved fault tolerance: in the example above, data is lost only if at least three drives fail within the same RAID 6 set.
Striping helps to increase capacity and performance without adding disks to each RAID 6 set (which would decrease data availability and could impact performance). RAID 60 improves upon the performance of RAID 6. Its write performance is slightly worse than RAID 50 due to the added overhead of more parity calculations, but it may be slightly faster in random reads due to the spreading of data over at least one more disk per RAID 6 set. When data security is the main concern, this performance drop is negligible.
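For comparison, the usable capacities of the RAID 50 and RAID 60 examples above can be checked the same way (a hypothetical helper; it assumes equal-sized drives, with one parity drive per RAID 5 set and two per RAID 6 set):

 def striped_parity_capacity_gb(num_sets, drives_per_set, parity_drives_per_set, drive_gb=120):
     """Usable space of a RAID 0 stripe of parity sets (RAID 50 or RAID 60)."""
     return num_sets * (drives_per_set - parity_drives_per_set) * drive_gb

 print(striped_parity_capacity_gb(2, 4, 2))  # RAID 60 example above: 480 GB
 print(striped_parity_capacity_gb(3, 3, 1))  # RAID 50 example above: 720 GB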