In part one of this two-part series, we revisited where RAID (Redundant Array of Independent/Inexpensive Disks) had its roots and reviewed where it is today. In this article, we'll look more closely at where RAID and its many variations stand today, and forecast future directions. This includes traditional implementations and newer enhanced extended parity protection, including erasure codes among others. As in the past, these and other approaches are being applied to Hard Disk Drives (HDDs) and Solid State Devices (SSDs) to enhance their availability and accessibility, or in some cases performance.
Revisiting RAID 5 and wide stripe or RAID groups
Let us begin with an example from part one of this series: a 15+1, or sixteen-drive, RAID 5 group. For some applications and RAID 5 (or RAID 4 or RAID 6) implementations, a 15+1 (or wider) stripe or group might be sufficient.
However, writes could become a bottleneck, particularly if there is no mirrored or battery-protected write-back cache (WBC). Here another common myth comes into play: that all RAID 5 implementations cause extra write I/O activity. While some RAID implementations in hardware or software can result in extra back-end writes (e.g. write amplification), this is not true for all, particularly those with good write-gathering capabilities.
Some hardware and software implementations using WBC (mirrored or battery backed via a BBU) can group writes together in memory (cache) to perform full-stripe writes. The result can be fewer back-end writes compared to other systems. Hence, not all RAID implementations in either hardware or software are the same. Likewise, just because a RAID definition shows a particular theoretical implementation approach does not mean all vendors have implemented it that way.
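The difference between per-block read-modify-write updates and cache-gathered full-stripe writes can be sketched with simple I/O arithmetic. This is a simplified model, not any vendor's implementation; the function name and the 4-I/O read-modify-write figure reflect the textbook RAID 5 small-write penalty (read old data, read old parity, write new data, write new parity).

```python
def raid5_backend_ios(blocks_written, stripe_width, full_stripe=False):
    """Estimate back-end I/Os for a RAID 5 write (simplified model).

    stripe_width: number of data drives (e.g. 15 in a 15+1 group).
    full_stripe: True when a write-back cache has gathered a full stripe.
    """
    if full_stripe:
        # One write per data drive plus one parity write; no reads needed.
        stripes = -(-blocks_written // stripe_width)  # ceiling division
        return stripes * (stripe_width + 1)
    # Classic read-modify-write: read old data + old parity,
    # then write new data + new parity = 4 I/Os per block updated.
    return blocks_written * 4

# Updating 15 blocks on a 15+1 group:
small = raid5_backend_ios(15, 15)           # 60 I/Os via read-modify-write
gathered = raid5_backend_ios(15, 15, True)  # 16 I/Os as one full-stripe write
```

The same 15-block update costs nearly four times as many back-end I/Os without write gathering, which is why WBC-equipped implementations can behave so differently from the theoretical worst case.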
RAID: Extra Writes?
So, does RAID cause extra writes or write amplification?
That depends on the particular RAID level, along with implementation and, in some cases, configuration, including chunk or shard size. For example, RAID 1 (mirroring and replication) does two or more writes in parallel, which would be the same as copying to two disk drives. Granted, some implementations may do the writes in real-time (synchronous) or time-deferred (asynchronous, lazy write, or eventual consistency) modes, in addition to leveraging WBC.
Do the dual writes of a RAID 1 implementation mean that there are double the number of writes (or triple with three drive mirrors)?
That, too, depends on whether the comparison is against a single JBOD drive with no copy protection. Depending on the RAID 4, RAID 5, RAID 6 or other approach, there can also be extra writes, depending on how the vendor has implemented the hardware or software. Thus, there are many apples-to-oranges comparisons when it comes to RAID, a factor that fuels some of the myths, realities and FUD.
What about RAID, writes and SSD?
Again, that is going to depend on which RAID level, as well as the vendor's hardware or software implementation, and how the vendor has integrated SSD wear leveling for endurance and performance optimization. This will also vary depending on whether you're using an enterprise storage system or appliance vs. software in a server, workstation or desktop.
Thus, some RAID levels, depending on their implementation and how they're configured, can cause more writes, resulting in more SSD wear. On the other hand, some RAID levels and implementations do a better job than others of write gathering and of integrating with SSD NAND flash wear leveling to improve duty cycle.
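The effect of write amplification on SSD wear can be seen with a back-of-the-envelope endurance estimate. The function and the drive figures below are illustrative assumptions, not any particular product's ratings: a drive's rated terabytes-written (TBW) budget is consumed faster when each host write turns into multiple back-end writes.

```python
def ssd_endurance_years(tbw, daily_host_writes_tb, write_amplification):
    """Rough SSD lifespan estimate: rated terabytes-written (TBW)
    divided by effective daily writes after amplification.
    A sketch only -- real endurance depends on workload and firmware."""
    daily_effective_tb = daily_host_writes_tb * write_amplification
    return tbw / (daily_effective_tb * 365)

# Hypothetical drive rated for 1,000 TBW, seeing 0.5 TB/day of host writes:
no_amp = ssd_endurance_years(1000, 0.5, 1.0)   # ~5.5 years
amp_4x = ssd_endurance_years(1000, 0.5, 4.0)   # ~1.4 years
```

The same workload behind a write amplification factor of 4 cuts the estimated lifespan to a quarter, which is why RAID implementations with good write gathering matter more on flash than on spinning disk.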
Balancing performance, availability, capacity and economics (PACE)
Something else to mention: while a 15+1 or sixteen-drive RAID group has a low space capacity overhead for parity protection, there is also the exposure of what happens if (or when) a drive fails. Depending on the RAID hardware or software, along with the types and sizes of drives, during a long rebuild operation the RAID set is exposed and at risk of a secondary or double drive failure.
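That exposure can be sketched with a simple probability model. This is a deliberate simplification (independent failures, constant failure rate), and the annualized failure rate (AFR) and rebuild-window numbers are illustrative assumptions, not field data.

```python
def second_failure_risk(surviving_drives, afr, rebuild_hours):
    """Approximate chance that at least one more drive in the group
    fails during the rebuild window. Assumes independent failures and
    a constant failure rate -- a sketch, not a field-reliability model."""
    p_single = afr * rebuild_hours / 8760  # fraction of a year at risk
    return 1 - (1 - p_single) ** surviving_drives

# 15 surviving drives, assumed 2% AFR, 24-hour rebuild window:
risk_24h = second_failure_risk(15, 0.02, 24)
# Doubling the rebuild window roughly doubles the exposure:
risk_48h = second_failure_risk(15, 0.02, 48)
```

The point of the sketch is the shape, not the absolute numbers: the risk grows with both the width of the group and the length of the rebuild, which is exactly the wide-stripe, big-drive combination discussed above.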
So there is a balancing act in trying to cut costs by using large-capacity drives in a large or wide RAID 5 configuration to reduce space capacity protection overhead; doing so also opens a potential point of exposure.
Options include narrower RAID 5 groups, more reliable and faster drives to minimize exposure during a rebuild, or a different RAID level such as RAID 6 (dual parity), among other approaches, depending on specific needs and concerns. There are also hybrid enhanced (or beyond traditional) RAID solutions. For example, some hybrid solutions leverage underlying disk pools as part of a RAID 6 configuration, yet combine rapid rebuilds similar to those found with erasure code parity systems. An example is the NetApp E-Series Dynamic Disk Pools capability, where failed drives can be rebuilt in a fraction of the time of a traditional RAID 6 solution.
Given that many other vendors implement some variation of a disk pool or virtual volumes underneath or as part of their RAID implementation stacks (independent of LUNs), I would not be surprised to see others adding similar capabilities.
The RAID rebuild conundrum
The challenge I see with RAID and long rebuilds is partly tied to technology implementation, as well as to configuration and acquisition decisions. These are also usually tied to cost-cutting storage decisions aimed at boosting space capacity while maintaining performance and availability for normal running conditions.
What I find interesting is that in the race to support more capacity, while cutting costs and maintaining some level of performance, we may have come full circle with RAID. I find it ironic that one of the original premises or objectives for RAID was to use multiple HDDs working together to offset the lack of resiliency or reliability of the then new and emerging lower-cost SCSI disk drives. When I talk with people who are actually encountering long RAID rebuilds (versus those who just repeat what they hear or read), a common theme is the use of low-cost, high-capacity drives.
While it is easy to point the finger at RAID in general for long rebuilds, particularly for parity-based protection, there are contributing factors to consider. For example, if using RAID 1 (mirroring), rebuild times should be faster than with parity-based approaches. The reason (and this can be implementation dependent) is that with mirroring, essentially a full drive-to-drive copy or resynchronization can occur, taking less time. With a parity-based rebuild, on the other hand, the contents of the surviving drives need to be read, along with parity information, to regenerate data for the new drive. Regeneration using parity takes time due to the I/Os as well as the mathematical computations that have to occur. Some implementations do a better job than others; however, with parity-based data protection, the benefit of lower space capacity overhead comes in exchange for longer rebuild times.
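The parity regeneration described above can be shown with a toy example. For RAID 4/5-style single parity, the parity chunk is the XOR of the data chunks in a stripe, so XORing the survivors (data plus parity) reproduces whatever was lost; this sketch ignores the real-world I/O scheduling that dominates rebuild time.

```python
from functools import reduce

def regenerate(surviving_chunks):
    """Rebuild the missing chunk in a single-parity stripe by XORing
    the surviving data and parity chunks byte by byte. Since parity is
    the XOR of all data chunks, the XOR of the survivors equals the
    missing chunk."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*surviving_chunks))

# Toy stripe: three data chunks plus their XOR parity.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\x0f"]
parity = bytes(a ^ b ^ c for a, b, c in zip(*data))

# Lose data[1]; rebuild it from the remaining data plus parity:
rebuilt = regenerate([data[0], data[2], parity])
assert rebuilt == data[1]
```

On a real rebuild, every byte of every surviving drive must be read and fed through this computation, which is why a wide parity group rebuilds so much more slowly than a simple mirror copy.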
Then there are the other contributors to long rebuild times, which include the space capacity of the drive, its interface and performance capability to handle reads (or writes, if it is the rebuild destination), any RAID rebuild or copy-assist features (some drives have these, if supported by the controllers), and the controller or software implementation.
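A lower bound on rebuild time is simply drive capacity divided by sustained rebuild rate. The throughput figures below are illustrative assumptions; real rebuilds run longer because of host I/O contention, throttling, and the parity computation itself.

```python
def rebuild_hours(capacity_tb, rebuild_mb_per_sec):
    """Lower-bound rebuild time: capacity divided by sustained rebuild
    rate. Real-world times are typically higher due to host I/O
    contention, controller throttling, and parity math."""
    seconds = capacity_tb * 1_000_000 / rebuild_mb_per_sec
    return seconds / 3600

# An assumed 2TB drive at 100 MB/s vs an old 9GB drive at 10 MB/s:
modern = rebuild_hours(2, 100)      # ~5.6 hours
legacy = rebuild_hours(0.009, 10)   # ~0.25 hours
```

Even though per-gigabyte rebuild rates have improved enormously, capacity has grown faster, so the absolute rebuild window (and thus the exposure window) is longer today than it was for the small drives of a decade ago.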
What to do about long rebuilds?
One simple option, which might be easier said than done, is to avoid them or, more pragmatically, to minimize and mask their impact.
Another option is to understand why rebuilds are occurring in the first place. For example, are you hearing of rebuilds from others and are thus concerned, or are you actually encountering them on a regular basis? If you are encountering drive rebuilds, how often are drives failing, and do you understand why drives are failing? Are the drives actually failing or have failed, or is the RAID implementation in hardware or software detecting a possible problem? Or are you being overly cautious if not simply doing false rebuilds due to firmware or software settings by the vendor?
The type of drives, enclosures, adapters and controllers, along with their associated software, can also have an impact, with some being more resilient than others.
Something that often is not brought into the conversation around rebuilds is the choice of drives and of the controllers or software they attach to. Simply put: use cheap, poor-quality, large-capacity drives that may be more susceptible to failure, attach them to hardware controllers or software with poor implementations, and you may save a few dollars yet spend more time on rebuilds.
What this means is that while some systems can rebuild a 1TB or 2TB drive in less time than it took to process a 9GB drive a decade ago (not to mention that drives can be more reliable), drives are getting larger in terms of space capacity, and there are more of them. Likewise, there is a trend toward using lower-cost drives in configurations such as RAID 5 to maximize space capacity with some availability while reducing cost as much as possible.