Another approach is to find a balance between the space capacity overhead needed to deliver the required level of protection and your budget concerns. There are tradeoffs: you can save money or cut costs on normal running storage, but you may pay the price in reduced performance or availability exposure when something does fail.
What you can do is revisit which RAID level (including newer enhanced parity protection schemes) to use and when. This also means revisiting the implementation of those protection tools, which may be in software, hardware, or a combination, along with the robustness of the solutions. For example, what are their predictive and proactive replacement capabilities, and their rapid rebuild or fast copy assist, including leveraging drive-based acceleration where available?
Also, keep the space capacity size of the drives balanced against how much resiliency you need to meet cost and performance objectives. One solution is to move from RAID 5 single parity (survives a single drive failure before a rebuild completes) to RAID 6 dual parity (survives a double drive failure before the rebuild completes), and on to multiple parity options. Multiple parity options consume more space capacity overhead; however, when used across a larger number of drives in wider stripes, RAID groups, or protection groups, the impact is minimal compared with the resiliency benefit they provide.
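The tradeoff above can be put in rough numbers with a quick back-of-the-envelope calculation. This is a minimal sketch; the group widths used below are illustrative assumptions, not recommendations:

```python
# Sketch: usable-capacity fraction for parity-based protection.
# Overhead is parity_drives / total_drives, so wider groups amortize
# the cost of extra parity across more data drives.

def usable_fraction(total_drives: int, parity_drives: int) -> float:
    """Fraction of raw capacity left for data in one protection group."""
    if parity_drives >= total_drives:
        raise ValueError("need at least one data drive")
    return (total_drives - parity_drives) / total_drives

# RAID 5 (single parity) vs RAID 6 (dual parity) vs triple parity,
# over narrow (6-drive) and wide (16-drive) groups -- illustrative widths.
for label, total, parity in [
    ("RAID 5, 6 drives", 6, 1),
    ("RAID 6, 6 drives", 6, 2),
    ("RAID 6, 16 drives", 16, 2),
    ("Triple parity, 16 drives", 16, 3),
]:
    print(f"{label}: {usable_fraction(total, parity):.0%} usable")
```

Note that dual parity spread over a wide 16-drive group actually costs less overhead (12.5%) than single parity over a narrow 6-drive group (16.7%), which is the point about wider stripes above.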
It should go without saying, yet to avoid assuming something that may not be understood: you can also use different levels of RAID or parity-based protection in combination, or for different application needs. In other words, one size, technology, or tool and its implementation does not have to be used for everything. After all, in the data center or information factory, not everything is the same.
RAID today and tomorrow
There is an increasing awareness of, or revisiting of, chunk and shard size, which is the amount of data that is spread or copied to different devices. For legacy and transactional environments, this tends to be relatively small, say 8 KByte to 32 KByte, perhaps 64 KByte or larger for environments where the focus shifts from IOPs or transactions to sequential bandwidth or throughput. What is happening now is that chunk or shard sizes are reaching the MByte-plus range to reflect changing I/O patterns (grouped reads and writes) and larger files and objects (videos, audio, virtual machines).
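To see why chunk size matters as I/O patterns shift, consider the full-stripe write size, which is simply the chunk size times the number of data drives in the group. A minimal sketch (the drive counts and chunk sizes here are illustrative assumptions):

```python
# Sketch: full-stripe write size = chunk size x data drives in the group.
# Small chunks suit IOP/transactional workloads; MByte-class chunks suit
# large sequential files and objects (video, audio, VM images).

KB, MB = 1024, 1024 * 1024

def full_stripe_bytes(chunk_bytes: int, data_drives: int) -> int:
    return chunk_bytes * data_drives

# Legacy transactional layout: 32 KByte chunks across 8 data drives.
print(full_stripe_bytes(32 * KB, 8) // KB, "KByte full stripe")  # 256 KByte
# Object/large-file layout: 4 MByte chunks across 8 data drives.
print(full_stripe_bytes(4 * MB, 8) // MB, "MByte full stripe")   # 32 MByte
```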
Multiple parity schemes are being used, including erasure code and dispersal algorithms, some in combination with traditional RAID. In these hybrid implementations, the underlying RAID may not be noticeable, nor are the lightweight file systems it supports, which in turn aggregate and provide basic storage services to upper-level data protection software.
Examples include object and scale-out storage (HDD and SSD) for cloud and other environments that use erasure codes, dispersal, or other forms of data protection at a higher level, while RAID is used to manage the underlying block JBOD commodity hardware.
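Underneath these higher-level schemes, the basic building block is still parity. Here is a minimal sketch of single XOR parity, the idea behind RAID 5 and the simplest erasure codes (the chunk contents are made up for illustration):

```python
from functools import reduce

# Sketch: single XOR parity -- the core idea behind RAID 5 and the
# simplest erasure codes. Any one lost chunk can be rebuilt by XORing
# the surviving chunks with the parity chunk.

def xor_chunks(chunks):
    """XOR equal-length byte chunks column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

data = [b"AAAA", b"BBBB", b"CCCC"]  # three data chunks (illustrative)
parity = xor_chunks(data)           # parity chunk stored on a fourth drive

# Simulate losing chunk 1, then rebuilding it from survivors plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_chunks(survivors)
print("rebuilt chunk matches original:", rebuilt == data[1])  # True
```

The same XOR property is why a RAID 5 group survives exactly one drive failure: with two chunks missing, the equation no longer has a unique solution, which is what dual and multiple parity schemes address.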
Key points and RAID considerations include:
· Not all RAID implementations are the same: some are very much alive and evolving, while others are in need of a rest or rewrite. Often it is not the technology or technique that is the problem, but rather how it is implemented and subsequently deployed.
· It may not be RAID that is dead, but rather the solution that uses it. If you think a particular storage system, appliance, product, or software is old and dead along with its RAID implementation, then say that product or vendor's solution is dead.
· RAID can be implemented in hardware controllers, adapters, or storage systems and appliances, as well as in software, and those implementations have different features, capabilities, or constraints.
· Long or slow drive rebuilds are a reality with larger disk drives and parity-based approaches; however, you have options on how to balance performance, availability, capacity, and economics.
· RAID can be single, dual or multiple parity or mirroring-based.
· Erasure and other coding schemes leverage parity techniques, and guess which umbrella parity schemes fall under.
· RAID may not be cool, sexy or a fun topic and technology to talk about, however many trendy tools, solutions and services actually use some form or variation of RAID as part of their basic building blocks. This is an example of using new and old things in new ways to help each other do more without increasing complexity.
· Even if you are not a fan of RAID and think it is old and dead, at least take a few minutes to learn more about what it is that you do not like to update your dead FUD.
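On the long-or-slow-rebuild point above, a naive estimate helps frame the tradeoff. This sketch assumes an illustrative drive size and sustained rebuild rate; real rebuilds vary with workload, implementation, and rapid-rebuild or copy-assist features:

```python
# Sketch: naive rebuild-time estimate for a failed drive.
# time = drive capacity / sustained rebuild rate. Ignores contention,
# rebuild prioritization, and drive-based acceleration assists.

def rebuild_hours(capacity_tb: float, rate_mb_s: float) -> float:
    capacity_mb = capacity_tb * 1_000_000  # decimal TB to MB
    return capacity_mb / rate_mb_s / 3600

# Illustrative: an 8 TB HDD rebuilt at a sustained 100 MB/s.
print(f"{rebuild_hours(8, 100):.1f} hours")  # roughly 22 hours
```

Nearly a day of rebuild exposure per large drive is why dual or multiple parity, wider protection groups, and fast-rebuild features matter more as capacities grow.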
While the basics and foundation of RAID from over 25 years ago still apply, how and where it is implemented, not to mention extended, continue to evolve. This includes supporting larger drives and more of them, using bigger chunk or shard sizes, and applying new parity techniques distributed locally and remotely.
In addition, this means using different technologies and techniques in new ways, for example, using RAID 6 to protect multiple drives on the back-end of an object or cloud storage system, while erasure code or dispersal algorithms are used above it on a bigger, broader scale.
For what it’s worth, walking the talk, I do use some RAID for different things, including RAID 5 on my NAS systems, as well as RAID 1 in VMware servers along with other data protection and availability approaches. This also includes use of block, file and object storage that span HDD, HHDD, SSHD, SSDs and cloud object storage.
That is it, at least for now, regarding RAID revisited for today and tomorrow. There is more we will talk about later, including some of the emerging themes covered here and in part I of this two-part series.