RAID Remains Relevant, Really!

Posted on October 28, 2013 By Greg Schulz


RAID 2

RAID 2 uses Hamming code for error correction and, until recently, has been the least known and least adopted RAID level given its complexity and compute cost. This approach and its variations -- including erasure codes and forward error correction -- use more advanced algorithms to create multiple parities that can reduce space overhead, but they require more compute power.

Variations of RAID 2 and extended parity protection are finding new opportunities as compute processing capability improves and as larger numbers of drives can be leveraged with, for example, erasure codes, forward error correction and other algorithmic protection.
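To make the Hamming idea concrete, here is a minimal sketch of the classic Hamming(7,4) code, which protects four data bits with three parity bits and can locate and fix a single flipped bit. The function names are illustrative assumptions, and real RAID 2 implementations worked across drives rather than on a Python list.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7).

    Parity bits sit at positions 1, 2 and 4; data bits at 3, 5, 6 and 7.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (data_bits, error_position); position 0 means no error found."""
    c = [0] + list(codeword)  # pad so indexes match 1-based positions
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    error_position = s1 + 2 * s2 + 4 * s4  # syndrome names the bad position
    if error_position:
        c[error_position] ^= 1  # flip the damaged bit back
    return [c[3], c[5], c[6], c[7]], error_position


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[4] ^= 1  # simulate a single-bit error at position 5
    print(hamming74_correct(word))
```

Note the trade-off this illustrates: three extra bits per four data bits, plus syndrome computation on every read, which is the kind of space and compute cost the text attributes to this family of codes.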

RAID 3

Stripe with dedicated parity found success in the mid-1990s with solutions such as those from Baydell, which provided good sequential read and write performance well suited for video and similar applications. Over the past decade and a half, however, use of RAID 3 has decreased dramatically with the continued maturing of RAID 4, RAID 5, RAID 6 and other I/O optimization techniques.

RAID 4

RAID 4 is a stripe with dedicated parity and independent reads and writes; in concept, it is similar to RAID 3. The difference is that RAID 4 supports multiple concurrent I/O operations.

With RAID 3, all disks work in parallel to handle each I/O operation, whereas with RAID 4 (and higher) multiple I/O operations can occur concurrently. RAID 4 (and higher) supports full stripe reads and writes, but multiple smaller reads (or writes) can also occur. This is also where some confusion and myths come into play, based on how different vendors implement their RAID software or hardware solutions.

For example, implementations that lack good write buffering, read-ahead and cache management may incur additional write overhead. On the other hand, those that manage cache well and attempt full stripe writes (or reads) when possible while maintaining data consistency can get better performance.
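The difference between full and partial stripe writes can be sketched with simple XOR parity. The code below is a simplified, uncached model: the function names and the I/O counting rule are assumptions for illustration, and real controllers may choose other strategies (such as reading the untouched chunks instead of the old ones).

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(chunks):
    """Parity for a full stripe write: XOR of all data chunks, no reads needed."""
    return reduce(xor_bytes, chunks)


def small_write_parity(old_data, new_data, old_parity):
    """Parity for a partial write: remove the old chunk, fold in the new one."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


def write_ios(data_drives, chunks_written):
    """Disk I/Os for one stripe update under a textbook, cache-less model."""
    if chunks_written == data_drives:
        # Full stripe: write every data chunk plus the parity chunk.
        return {"reads": 0, "writes": data_drives + 1}
    # Read-modify-write: read old chunks and old parity, then write new ones.
    return {"reads": chunks_written + 1, "writes": chunks_written + 1}


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
    parity = full_stripe_parity(stripe)
    new_chunk = b"\xaa\x55"
    updated = small_write_parity(stripe[1], new_chunk, parity)
    # Both routes must agree on the resulting parity.
    assert updated == full_stripe_parity([stripe[0], new_chunk, stripe[2]])
    print(write_ios(3, 1), write_ios(3, 3))
```

In this model a single-chunk update costs four I/Os to change one chunk of data, while a full stripe write on 3+1 costs four I/Os to change three. That gap is what write gathering and good cache management try to close.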

Thus, look at different solutions and ask vendors how they implement their RAID 4 (or RAID 5 or RAID 6). Do they do write gathering? How are full and partial stripe writes (and reads) handled? What about options for setting chunk (shard) size?

Simply going by the RAID definition may be safe for I/O planning of worst-case scenarios or the lowest common denominator. But it can also lead to false assertions, so do your homework.

RAID 5

Striping data with rotating parity builds on RAID 4 but eliminates the dedicated parity disk. Instead, all members of a RAID set take turns storing the parity in a rotating manner so that no one disk becomes a bottleneck.
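A rotating parity layout can be sketched in a few lines. The rotation direction used here (parity starts on the last drive and moves left by one drive per stripe) is only one possible convention, chosen as an assumption for illustration; vendors and software RAID implementations differ.

```python
def parity_drive(stripe_index, num_drives):
    """Drive index holding parity for a stripe (assumed left rotation)."""
    return (num_drives - 1 - stripe_index) % num_drives


def layout(num_drives, num_stripes):
    """Return rows of 'D' (data) and 'P' (parity), one row per stripe."""
    rows = []
    for stripe_index in range(num_stripes):
        p = parity_drive(stripe_index, num_drives)
        rows.append(["P" if d == p else "D" for d in range(num_drives)])
    return rows


if __name__ == "__main__":
    for row in layout(num_drives=4, num_stripes=4):
        print(" ".join(row))
```

Over any run of stripes equal to the number of drives, each drive holds parity exactly once, so parity updates are spread across the whole set instead of landing on one dedicated drive.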

This means that each different rank or stripe of data will have a different drive handling the parity. The good news is that space capacity overhead can be greatly reduced vs. mirroring (RAID 1). For example, a four drive RAID 5 group (3 + 1) has a 25% space overhead for parity, with the equivalent of three data drives and one parity drive.

However, this also leads to a common RAID 5 myth: that it always has a 25% space overhead, which is only true for those systems or environments that configure it that way.

Different vendors' RAID software and hardware support various group sizes, chunk sizes (amount of data written to each drive) and stripe sizes (number of drives). For example, a sixteen drive 15+1 RAID 5 configuration has a parity space capacity overhead of only about 6% (1/16, or 6.25%). RAID 5 remains popular for some environments and is growing in adoption in the lower end SMB and consumer markets where fewer drives are the norm.
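The overhead arithmetic is simple enough to check directly. This small helper (a hypothetical name, not from any RAID tool) computes the fraction of raw capacity consumed by parity for a given group size.

```python
def parity_overhead(data_drives, parity_drives=1):
    """Fraction of raw capacity used for parity in a data+parity group."""
    return parity_drives / (data_drives + parity_drives)


if __name__ == "__main__":
    for data, parity in [(3, 1), (7, 1), (15, 1)]:
        pct = parity_overhead(data, parity) * 100
        print(f"{data}+{parity}: {pct:.2f}% parity overhead")
```

The same function shows why the 25% figure is a property of the 3+1 configuration rather than of RAID 5 itself: widening the group lowers the overhead, at the cost of more drives being involved in each rebuild.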

RAID 6

RAID 6 is similar to RAID 5 but adds a second parity to protect against a double drive failure. It has helped to support adoption of larger capacity 1TB, 2TB, 3TB and now 4TB drives, as well as wider stripes or larger RAID groups.

However, some environments need even more protection, beyond mirroring or replicating a RAID 6 group to another in a different storage system. For those environments you can find RAID variations including RAID 7 (triple parity) along with hybrid RAID 10 and 01 (stripe with mirror, mirror with stripe) or 50 (striping across RAID 5 groups), not to mention emerging erasure codes, forward error correction, dispersal and other approaches.
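The article does not say how the second parity is computed, and implementations vary. One widely used approach, assumed here purely for illustration, keeps the usual XOR parity (P) and adds a second parity (Q) computed over the Galois field GF(2^8), which allows any two lost data chunks to be rebuilt. All function names below are hypothetical.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) using the polynomial 0x11d."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(a, n):
    """Raise a to the n-th power in GF(2^8) by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def compute_pq(chunks):
    """Return (P, Q): P is plain XOR, Q weights chunk i by 2**i in GF(2^8)."""
    size = len(chunks[0])
    p, q = bytearray(size), bytearray(size)
    for i, chunk in enumerate(chunks):
        coeff = gf_pow(2, i)
        for j, byte in enumerate(chunk):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(chunks, x, y, p, q):
    """Rebuild data chunks x and y (entries ignored) from survivors, P and Q."""
    size = len(p)
    pxy, qxy = bytearray(p), bytearray(q)
    for i, chunk in enumerate(chunks):
        if i in (x, y):
            continue
        coeff = gf_pow(2, i)
        for j, byte in enumerate(chunk):
            pxy[j] ^= byte                 # leaves dx ^ dy
            qxy[j] ^= gf_mul(coeff, byte)  # leaves g^x*dx ^ g^y*dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    inv = gf_pow(gx ^ gy, 254)  # a**254 is the inverse of a in GF(2^8)
    dx = bytes(gf_mul(inv, qxy[j] ^ gf_mul(gy, pxy[j])) for j in range(size))
    dy = bytes(pxy[j] ^ dx[j] for j in range(size))
    return dx, dy


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
    p, q = compute_pq(data)
    damaged = [data[0], None, data[2], None]  # drives 1 and 3 have failed
    print(recover_two(damaged, 1, 3, p, q))
```

The byte-by-byte field arithmetic hints at why double parity costs more compute than single XOR parity. Production implementations typically use lookup tables or vector instructions rather than loops like these.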

Something else that is occurring as RAID continues to evolve is that chunk sizes (the amount of data written to each drive) have grown from 4K, 8K, 16K or 32K on many systems to much larger values -- in some cases several Mbytes (or more).
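Chunk size matters because it sets how much data a write must carry to fill a full stripe. The sketch below uses hypothetical helper names and assumes a simple layout in which a full stripe is one chunk per data drive and stripes start at offset zero.

```python
KIB = 1024
MIB = 1024 * KIB


def full_stripe_bytes(chunk_bytes, data_drives):
    """Bytes of user data in one full stripe (parity excluded)."""
    return chunk_bytes * data_drives


def is_full_stripe_write(offset, length, chunk_bytes, data_drives):
    """True if a write starts on a stripe boundary and covers whole stripes."""
    stripe = full_stripe_bytes(chunk_bytes, data_drives)
    return length > 0 and offset % stripe == 0 and length % stripe == 0


if __name__ == "__main__":
    for chunk in (32 * KIB, 4 * MIB):
        stripe = full_stripe_bytes(chunk, data_drives=7)
        print(f"chunk {chunk // KIB} KiB on 7+1 -> full stripe {stripe // KIB} KiB")
```

With the same 7+1 group, moving from a 32K chunk to a 4 Mbyte chunk grows the full stripe from 224 KiB to 28 MiB, so far fewer application writes will naturally line up as full stripe writes unless the system gathers them.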

There is plenty more to revisit with RAID today, not to mention where this storage technology is going. We’ll take a further look at RAID in part two of this article.  

Originally published on .
