Nested RAID levels

Main article: RAID

To gain performance and/or additional redundancy the Standard RAID levels can be combined to create hybrid or Nested RAID levels.

Nesting

When nesting RAID levels, a RAID type that provides redundancy is typically combined with RAID 0 to boost performance. In these configurations it is preferable to have RAID 0 on top and the redundant array at the bottom, because fewer disks then need to be regenerated when a disk fails. (Thus, RAID 10 is preferable to RAID 0+1, although the administrative advantage of being able to "split the mirror" in RAID 1 is lost.)

RAID 0+1

Typical RAID 0+1 setup.

A RAID 0+1 (also called RAID 01, not to be confused with RAID 1) is a RAID level used for both replicating and sharing data among disks. The difference between RAID 0+1 and RAID 1+0 is the order in which the two levels are nested: RAID 0+1 is a mirror of stripes. The size of a RAID 0+1 array can be calculated as follows, where n is the number of drives (which must be even) and c is the capacity of the smallest drive in the array:

Size = (n × c) / 2
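
As a minimal illustration of the formula (Python is used here purely as a sketch; the function name and drive sizes are hypothetical):

def raid01_capacity(drive_sizes_gb):
    """Usable capacity of a RAID 0+1 array built from the given drives.

    Assumes the drives are split into two equal RAID 0 stripes that are
    then mirrored, so only half of the raw space is usable and each drive
    contributes no more than the smallest drive's capacity.
    """
    if len(drive_sizes_gb) % 2 != 0:
        raise ValueError("RAID 0+1 needs an even number of drives")
    n = len(drive_sizes_gb)
    c = min(drive_sizes_gb)
    return (n * c) / 2

print(raid01_capacity([120] * 6))  # 360.0, matching the six-drive example below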

Consider an example of RAID 0+1: six 120 GB drives need to be set up as a RAID 0+1. Below, two 360 GB RAID 0 arrays are mirrored, creating 360 GB of total storage space:

                       RAID 1
            .--------------------------.
            |                          |
          RAID 0                     RAID 0
   .-----------------.        .-----------------.
   |        |        |        |        |        |
120 GB   120 GB   120 GB   120 GB   120 GB   120 GB
  A1       A2       A3       A1       A2       A3
  A4       A5       A6       A4       A5       A6
  A7       A8       A9       A7       A8       A9
  A10      A11      A12      A10      A11      A12

Note: A1, A2, et cetera each represent one data block; each column represents one disk.

The maximum usable storage space here is 360 GB, spread across two arrays. The advantage is that when a hard drive fails in one of the RAID 0 arrays, the missing data can be transferred from the other array. However, adding an extra hard drive to one stripe requires adding a hard drive to the other stripe as well, to keep the storage of the two arrays balanced.

It is not as robust as RAID 10: it cannot tolerate two simultaneous disk failures unless the second failed disk is from the same stripe as the first. That is, once a single disk fails, each of the drives in the other stripe becomes a single point of failure. Also, once the failed disk is replaced, all the disks in the array must participate in the rebuild in order to restore its data.
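
The difference can be checked with a small sketch of the naive failure model described above, before the controller-level exception discussed in the next paragraph (Python; the six-disk numbering is hypothetical):

from itertools import combinations

# Six disks, numbered 0-5.  In RAID 0+1 they form two three-disk stripes
# that are mirrored; in RAID 10 they form three mirrored pairs that are striped.
RAID01_STRIPES = [{0, 1, 2}, {3, 4, 5}]
RAID10_MIRRORS = [{0, 1}, {2, 3}, {4, 5}]

def raid01_survives(failed):
    # Naive model: a stripe is unusable as soon as any of its disks fails,
    # so the array survives only while at least one whole stripe is intact.
    return any(stripe.isdisjoint(failed) for stripe in RAID01_STRIPES)

def raid10_survives(failed):
    # The array survives while every mirrored pair keeps at least one disk.
    return all(not pair <= failed for pair in RAID10_MIRRORS)

two_disk_failures = [set(p) for p in combinations(range(6), 2)]
print(sum(map(raid01_survives, two_disk_failures)))  # 6 of 15 survived
print(sum(map(raid10_survives, two_disk_failures)))  # 12 of 15 survived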

The exception is when all the disks are attached to the same RAID controller, in which case the controller can perform the same error recovery as RAID 10, because it can still access the functional disks in each RAID 0 set. Comparing the RAID 0+1 and RAID 10 diagrams while ignoring the lines above the disks shows that the only difference is which disks are grouped together; a controller with a direct link to each disk can recover in the same way. In this one case there is no difference between RAID 0+1 and RAID 10.

Additionally, bit error correction technologies have not kept up with rapidly rising drive capacities, resulting in higher risks of encountering media errors. In the case where a failed drive is not replaced in a RAID 0+1 configuration, a single uncorrectable media error occurring on the mirrored hard drive would result in data loss.

Given these increasing risks with RAID 0+1, many business- and mission-critical enterprise environments are beginning to evaluate more fault-tolerant RAID setups that add an underlying layer of disk parity. Among the most promising are hybrid approaches such as RAID 51 (mirroring above single parity) and RAID 61 (mirroring above dual parity).

RAID 1+0

Typical RAID 10 setup.

A RAID 1+0, sometimes called RAID 1&0 or RAID 10, is similar to RAID 0+1 except that the order of the two levels is reversed: RAID 10 is a stripe of mirrors. Below is an example where three 120 GB RAID 1 arrays are striped together to make 360 GB of total storage space:

                       RAID 0
       .-----------------------------------.
       |                 |                 |
     RAID 1            RAID 1            RAID 1
   .--------.        .--------.        .--------.
   |        |        |        |        |        |
120 GB   120 GB   120 GB   120 GB   120 GB   120 GB
  A1       A1       A2       A2       A3       A3
  A4       A4       A5       A5       A6       A6
  A7       A7       A8       A8       A9       A9
  A10      A10      A11      A11      A12      A12
Note: A1, A2, et cetera each represent one data block; each column represents one disk.
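
The block-to-disk mapping in the diagram above can be reproduced with a short sketch (Python; the helper name and layout are purely illustrative):

def raid10_layout(num_pairs, rows):
    """Return the block label stored on each disk, row by row.

    Logical blocks A1, A2, ... are striped across the mirrored pairs, and
    each pair stores two copies of its block, one on each of its disks.
    """
    layout = []
    block = 1
    for _ in range(rows):
        row = []
        for _ in range(num_pairs):
            row += [f"A{block}", f"A{block}"]  # both disks of the mirrored pair
            block += 1
        layout.append(row)
    return layout

for row in raid10_layout(num_pairs=3, rows=4):
    print("  ".join(row))
# A1  A1  A2  A2  A3  A3
# A4  A4  A5  A5  A6  A6  ... and so on, matching the diagram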

All but one drive from each RAID 1 set could fail without damaging the data. However, if the failed drive is not replaced, the single working hard drive in the set then becomes a single point of failure for the entire array. If that single hard drive then fails, all data stored in the entire array is lost. As is the case with RAID 0+1, if a failed drive is not replaced in a RAID 10 configuration then a single uncorrectable media error occurring on the mirrored hard drive would result in data loss. Some RAID 10 vendors address this problem by supporting a "hot spare" drive, which automatically replaces and rebuilds a failed drive in the array.

Given these increasing risks with RAID 10, many business- and mission-critical enterprise environments are beginning to evaluate more fault-tolerant RAID setups that add an underlying layer of disk parity. Among the most promising are hybrid approaches such as RAID 51 (mirroring above single parity) and RAID 61 (mirroring above dual parity).

RAID 10 is often the primary choice for high-load databases, because the lack of parity to calculate gives it faster write speeds.

RAID 10 Capacity: (Size of Smallest Drive) * (Number of Drives) / 2

The Linux kernel RAID10 implementation (from version 2.6.9 onwards) is not nested: the mirroring and striping are done in a single layer. Only certain layouts correspond to standard RAID 10; the rest are non-standard. See the Linux MD RAID 10 section in the Non-standard RAID article for details.

RAID 0+3 and RAID 30

RAID 0+3

Diagram of a 0+3 array

RAID level 0+3, also called RAID level 03, is a dedicated parity array across striped disks. Each block of data at the RAID 3 level is broken up and striped across the disks of the underlying RAID 0 arrays.


RAID 30

RAID level 30 is also known as striping of dedicated parity arrays. It is a combination of RAID level 3 and RAID level 0, providing high data transfer rates combined with high data reliability. RAID 30 is best implemented on two RAID 3 disk arrays with data striped across both arrays. RAID 30 breaks data into smaller blocks and stripes those blocks across the RAID 3 sets. Each RAID 3 set in turn breaks the data into still smaller blocks, calculates parity by performing an exclusive OR on the blocks, and writes the blocks to all but one drive in the set; the resulting parity block is written to the last drive of that RAID 3 set. The size of each block is determined by the stripe size parameter, which is set when the RAID is created.
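
The exclusive OR parity can be illustrated with a minimal sketch (Python; the block contents are hypothetical): the parity block is the byte-wise XOR of the data blocks, and any single missing block can be rebuilt by XOR-ing the parity with the surviving blocks.

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks, as used for RAID 3 parity."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

data = [b"A1A1", b"A2A2", b"A3A3"]   # three data blocks of one stripe
parity = xor_blocks(data)            # written to the dedicated parity drive

# If the drive holding the second block fails, its data is recovered
# from the remaining blocks and the parity:
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]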

One drive from each of the underlying RAID 3 sets can fail without loss of data. However, until the failed drives are replaced, the remaining drives in the affected sets become a single point of failure for the entire RAID 30 array: if one of those drives fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and rebuilding onto the newly inserted drive) represents a period of vulnerability for the RAID set.

                                RAID 0
                .------------------------------------.
                |                                    |
              RAID 3                               RAID 3
   .--------------------------.        .--------------------------.
   |        |        |        |        |        |        |        |
 disk 1   disk 2   disk 3   disk 4   disk 5   disk 6   disk 7   disk 8
   A1       A2       A3       P        A4       A5       A6       P1
   A7       A8       A9       P        A10      A11      A12      P1
   A13      A14      A15      P        A16      A17      A18      P1
Note: A1, A2, et cetera each represent one data block; each column represents one disk;
      P and P1 represent the parity blocks of the first and second RAID 3 set respectively,
      and differ from row to row.

RAID 100 (RAID 10+0)

A RAID 100, sometimes also called RAID 10+0, is a stripe of RAID 10s. This is logically equivalent to a wider RAID 10 array, but is generally implemented using software RAID 0 over hardware RAID 10. Being "striped two ways", RAID 100 is described as a "plaid RAID".[1] Below is an example in which four 120 GB RAID 1 arrays are grouped into two RAID 0 stripes, which are then striped again to make 480 GB of total storage space:

                                RAID 0
               .-------------------------------------.
               |                                     |
             RAID 0                                RAID 0
       .-----------------.                  .-----------------.
       |                 |                  |                 |
     RAID 1            RAID 1             RAID 1            RAID 1
   .--------.        .--------.         .--------.        .--------.
   |        |        |        |         |        |        |        |
120 GB   120 GB   120 GB   120 GB    120 GB   120 GB   120 GB   120 GB
  A1       A1       A2       A2        A3       A3       A4       A4
  A5       A5       A6       A6        A7       A7       A8       A8
  B1       B1       B2       B2        B3       B3       B4       B4
  B5       B5       B6       B6        B7       B7       B8       B8
Note: A1, B1, et cetera each represent one data sector; each column represents one disk.

The failure characteristics are identical to RAID 10: all but one drive from each RAID 1 set could fail without loss of data. However, the remaining disk in any degraded RAID 1 pair becomes a single point of failure for the already degraded array. Often the top-level stripe is done in software. Some vendors call the top-level stripe a MetaLun (logical unit number, LUN) or a soft stripe.

The major benefit of RAID 100 (and plaid RAID in general) over single-level RAID is that the load is spread across multiple RAID controllers, giving better random read performance and mitigating hotspot risk on the array. For these reasons, RAID 100 is often the best choice for very large databases, where the hardware RAID controllers limit the number of physical disks allowed in each standard array. Implementing nested RAID levels allows virtually limitless spindle counts in a single logical volume.

RAID 50 (RAID 5+0)

A RAID 50 combines the straight block-level striping of RAID 0 with the distributed parity of RAID 5. This is a RAID 0 array striped across RAID 5 elements.

Below is an example where three 240 GB RAID 5 arrays are striped together to make 720 GB of total storage space:

                                     RAID 0
            .-----------------------------------------------------.
            |                          |                          |
          RAID 5                     RAID 5                     RAID 5
   .-----------------.        .-----------------.        .-----------------.
   |        |        |        |        |        |        |        |        |
120 GB   120 GB   120 GB   120 GB   120 GB   120 GB   120 GB   120 GB   120 GB
  A1       A2       Ap       A3       A4       Ap       A5       A6       Ap
  B1       Bp       B2       B3       Bp       B4       B5       Bp       B6
  Cp       C1       C2       Cp       C3       C4       Cp       C5       C6
  D1       D2       Dp       D3       D4       Dp       D5       D6       Dp
Note: A1, B1, et cetera each represent one data block; each column represents one disk; Ap, Bp,
      et cetera each represent parity information for each distinct RAID 5 and may represent different
      values across the RAID 5 (that is, Ap for A1 and A2 can differ from Ap for A3 and A4).

One drive from each of the RAID 5 sets could fail without loss of data. However, if the failed drive is not replaced, the remaining drives in that set then become a single point of failure for the entire array. If one of those drives fails, all data stored in the entire array is lost. The time spent in recovery (detecting and responding to a drive failure, and the rebuild process to the newly inserted drive) represents a period of vulnerability to the RAID set.

In the example below, datasets may be striped across both RAID sets. A dataset with 5 blocks would have 3 blocks written to the first RAID set and the next 2 blocks written to the second, as sketched after the diagram below.

  RAID Set 1             RAID Set 2
A1  A2  A3  Ap         A4  A5  A6  Ap
B1  B2  Bp  B3         B4  B5  Bp  B6
C1  Cp  C2  C3         C4  Cp  C5  C6
Dp  D1  D2  D3         Dp  D4  D5  D6
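
A minimal sketch of this round-robin distribution (Python; names are illustrative only):

def distribute_blocks(num_blocks, num_sets):
    """Assign logical blocks to the underlying RAID sets in round-robin
    order, the way the top-level RAID 0 stripes data across its members."""
    sets = [[] for _ in range(num_sets)]
    for block in range(1, num_blocks + 1):
        sets[(block - 1) % num_sets].append(f"block{block}")
    return sets

print(distribute_blocks(5, 2))
# [['block1', 'block3', 'block5'], ['block2', 'block4']]
# i.e. three blocks land on RAID set 1 and two on RAID set 2.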

The configuration of the underlying RAID 5 sets affects the overall fault tolerance. A construction of three seven-drive RAID 5 sets has higher capacity and storage efficiency, but can tolerate at most three drive failures (one per set). Because the reliability of the system depends on quick replacement of the failed drive so the array can rebuild, it is common to construct three six-drive RAID 5 sets, each with a hot spare that can immediately start rebuilding the array on failure. This does not address the issue that the array is put under maximum strain, reading every bit to rebuild the array, precisely at the time when it is most vulnerable. A construction of seven three-drive RAID 5 sets can handle as many as seven drive failures (again, one per set) but has lower capacity and storage efficiency.
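
To make the trade-off concrete, here is a rough sketch under the assumptions above (21 drives in total; a 120 GB drive size is assumed only to give concrete numbers; data is lost as soon as any single RAID 5 set loses a second drive):

def raid50_summary(num_sets, drives_per_set, drive_gb):
    """Usable capacity and best-case fault tolerance of a RAID 50 built
    from num_sets RAID 5 sets of drives_per_set drives each."""
    usable_gb = num_sets * (drives_per_set - 1) * drive_gb
    # At best, one failure per RAID 5 set is survivable; a second failure
    # in an already degraded set loses the whole array.
    max_survivable_failures = num_sets
    return usable_gb, max_survivable_failures

# The same 21 drives arranged two different ways:
print(raid50_summary(3, 7, 120))  # (2160, 3): more capacity, fewer tolerable failures
print(raid50_summary(7, 3, 120))  # (1680, 7): less capacity, more tolerable failures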

RAID 50 improves upon the performance of RAID 5 particularly during writes, and provides better fault tolerance than a single RAID level does. This level is recommended for applications that require high fault tolerance, capacity and random positioning performance.

As the number of drives in a RAID set and the capacity of each drive increase, the time needed to rebuild the RAID set, and therefore the fault-recovery window, increases correspondingly.


RAID 60 (RAID 6+0)

A RAID 60 combines the straight block-level striping of RAID 0 with the distributed double parity of RAID 6. That is, a RAID 0 array striped across RAID 6 elements. It requires at least 8 disks.

Below is an example where two 240 GB RAID 6 arrays are striped together to make 480 GB of total storage space:

                                RAID 0
                .------------------------------------.
                |                                    |
              RAID 6                               RAID 6
   .--------------------------.        .--------------------------.
   |        |        |        |        |        |        |        |
120 GB   120 GB   120 GB   120 GB   120 GB   120 GB   120 GB   120 GB
  A1       A2       Aq       Ap       A3       A4       Aq       Ap
  B1       Bq       Bp       B2       B3       Bq       Bp       B4
  Cq       Cp       C1       C2       Cq       Cp       C3       C4
  Dp       D1       D2       Dq       Dp       D3       D4       Dq

As it is based on RAID 6, two disks from each of the RAID 6 sets can fail without loss of data, and a further failure while a single disk in one RAID 6 set is rebuilding does not lead to data loss. RAID 60 therefore has improved fault tolerance: in the eight-disk example above, up to four disks (two per RAID 6 set) can fail without data loss, and data is lost only when a third disk fails in a RAID 6 set that has already lost two.
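
This can be checked with a small sketch that mirrors the eight-disk example above (Python; the disk numbering is hypothetical):

# Disks 0-3 form the first RAID 6 set, disks 4-7 the second.
RAID6_SETS = [{0, 1, 2, 3}, {4, 5, 6, 7}]

def raid60_survives(failed):
    """True while every RAID 6 set has lost at most two of its disks."""
    return all(len(raid6 & set(failed)) <= 2 for raid6 in RAID6_SETS)

print(raid60_survives({0, 1, 4, 5}))  # True: two failures in each set
print(raid60_survives({0, 1, 2}))     # False: three failures in one set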

Striping helps to increase capacity and performance without adding disks to each RAID 6 set (which would decrease data availability and could impact performance). RAID 60 improves upon the performance of RAID 6, although it is slightly slower than RAID 50 for writes because of the added overhead of the extra parity calculations. When data security is a concern, this performance drop is negligible.

See also

References