remain expensive. This evolution is
reflected in the subtle but important
mutation of the meaning of the I in
RAID from inexpensive to independent
that took place in the mid-1990s (indeed,
it was those same SLED manufacturers that instigated this shift to
apply the new research to their existing products).
In 1993, Gibson, Katz, and Patterson, along with Peter Chen and Edward
Lee, completed a taxonomy of RAID
levels that remains unamended to date.3
Of the seven RAID levels described,
only four are commonly used:
RAID-0. Data is striped across devices for maximal write performance.
It is an outlier among the other RAID
levels as it provides no actual data protection.
RAID-1. Disks are organized into
mirrored pairs and data is duplicated
on both halves of the mirror. This is
typically the highest-performing RAID
level, but at the expense of lower usable capacity. (The term RAID-10 or
RAID-1+0 is used to refer to a RAID
configuration in which mirrored pairs
are striped, and RAID-01 or RAID-0+1
refer to striped configurations that
are then mirrored. The terms are of
decreasing relevance since striping
over RAID groups is now more or less
assumed.)
Figure 1. Comparison of RAID-5 and RAID-6 reliability.1 (Chart: data loss probability, 6x8-drive RAID-5 vs. 3x16-drive RAID-6, for 147GB and 300GB 15K RPM FC drives and 250GB and 500GB 7200 RPM SATA drives. Annotations: RAID-5, ~1 in 4 chance of data loss (24.04%); RAID-6, ~3800× better than RAID-5.)
Figure 2. Historical capacity/throughput of 7200 RPM SATA HDDs. (Chart: capacity (GB) and throughput (MB/s) by year, 1996 to 2010.)
Figure 3. Historical capacity/throughput of 10K RPM FC HDDs. (Chart: capacity (GB) and throughput (MB/s) by year, 1998 to 2008.)
RAID-5. A group of N+1 disks is
maintained such that the loss of any
one disk would not result in data loss.
This is achieved by writing a parity
block, P, for each logical row of N disk
blocks. The location of this parity is
distributed, rotating between disks
so that all disks contribute equally to
the delivered system performance.
Typically P is computed simply as the
bitwise XOR of the other blocks in the
row.
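The XOR parity scheme just described can be sketched in a few lines of Python. This is an illustrative sketch only, not any vendor's implementation: the function names (`xor_blocks`, `parity_disk`, `reconstruct`) are hypothetical, and the rotation rule in `parity_disk` is just one plausible way to distribute P across the disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bitwise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("blocks must have equal length")
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def parity_disk(row, n_disks):
    """Disk index that holds P for a given row.

    Assumption: a simple rotation; real layouts vary by implementation.
    """
    return (n_disks - 1 - row) % n_disks


def reconstruct(surviving_blocks):
    """Rebuild the one missing block of a row from all survivors (data and P)."""
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # N = 3 data blocks in one row
    p = xor_blocks(data)                             # parity block P
    rebuilt = reconstruct([data[0], data[2], p])     # pretend disk 1 was lost
    print(rebuilt == data[1])
    print([parity_disk(row, 4) for row in range(5)])
```

Because XOR is its own inverse, the same routine that computes P also recovers any single lost block: XORing the survivors of a row, P included, cancels every block except the missing one.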
RAID-6. This is like RAID-5, but
employs two parity blocks, P and Q, for
each logical row of N+2 disk blocks.
There are several RAID-6 implementations, such as IBM’s EVENODD,2
NetApp’s Row-Diagonal Parity,4 or more
generic Reed-Solomon encodings.10
(Chen et al. refer to RAID-6 as P+Q redundancy, which some have taken to
imply P data disks with an arbitrary
number of parity disks, Q. In fact,
RAID-6 refers exclusively to double-parity RAID; P and Q are the two parity
blocks.)
For completeness, it’s worth
noting the other, less prevalent RAID
levels:
RAID-2. Data is protected by memory-style ECC (error-correcting codes).
The number of parity disks required is
proportional to the log of the number
of data disks; this makes RAID-2 relatively inflexible and less efficient than
RAID-5 or RAID-6 while also delivering
lower performance and reliability.
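To make the logarithmic growth concrete, the following sketch counts check disks under the assumption that the memory-style ECC is a single-error-correcting Hamming code, for which the number of check disks c must satisfy 2^c >= d + c + 1 for d data disks. The function name `hamming_check_disks` is hypothetical and used only for illustration.

```python
def hamming_check_disks(data_disks):
    """Smallest c such that 2**c >= data_disks + c + 1.

    Assumption: a single-error-correcting Hamming code; an actual RAID-2
    array may use a different code with a different overhead.
    """
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    c = 1
    while 2 ** c < data_disks + c + 1:
        c += 1
    return c


if __name__ == "__main__":
    for d in (4, 10, 25):
        c = hamming_check_disks(d)
        # RAID-5 would need one parity disk for the same group, RAID-6 two.
        print(f"{d} data disks -> {c} check disks")
```

Under this assumption, small groups pay a heavy overhead (three check disks for four data disks), and the overhead only becomes modest for large groups, which is one way to read the inflexibility noted above.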
RAID-3. As with RAID-5, protection is provided against the failure of
any disk in a group of N+1, but blocks
are carved up and spread across the
disks, using bitwise parity as opposed to
the block parity of RAID-5. Further,
parity resides on a single disk rather
than being distributed between all
disks. RAID-3 systems are significantly less efficient than RAID-5 for
small read requests: to read a block,
all disks must be accessed, so the
capacity for read operations is more
readily exhausted.
RAID-4. This is merely RAID-5,
but with a dedicated parity disk rather