RAID (Redundant Array of Inexpensive, or Independent, Disks) as a technology for protecting data remains alive and relevant 25 years after the original Berkeley white paper appeared. Granted, some RAID implementations, and the systems they are a part of, along with how they are configured, may be more dated and limited than others.
Like the hard disk drive (HDD) with which it is most commonly associated and used, RAID has been declared dead for years if not decades, yet both remain very much alive.
What this means is that some vendors' hardware- or software-based RAID solutions continue to evolve with new functionality, capabilities, and the ability to scale performance (IOPS, bandwidth, latency), availability, capacity and effectiveness, while others remain static. It also means that those making or influencing storage decisions have options to use RAID in new ways compared with how they have in the past, assuming their chosen technology is flexible enough to do so.
RAID revisited
Let’s take a quick step back and revisit RAID fundamentals so that we can step forward to see where it is today and where it will be tomorrow. The premise of the December 1987 University of California Berkeley white paper titled “A Case for Redundant Arrays of Inexpensive Disks (RAID)“ (PDF version here) was to overcome the limits and barriers of what was called the Single Large Expensive Disk (SLED). At that time, magnetic HDDs were physically large, prone to failure, limited in space capacity and subject to performance bottlenecks, not to mention having proprietary access methods and protocols. Besides the IBM mainframe model 3380 HDD, the industry-standard HDD that was OEMed by a variety of different vendors was the Fujitsu Eagle.
The Fujitsu Eagle was (depending on model) a 470MByte (raw unformatted) 6U (10.5”) device with multiple 10.5” diameter platters that spun at under 4,000 RPM, consuming over 500 watts of power, with a then-bargain price tag of around $10,000. Put that into perspective against today’s enterprise-class, high-performance 2.5” 15K 600GByte HDDs that consume around 8 watts while in use, along with much more affordable price tags (depending on where or from whom you buy them).
Keep in mind this was in an era when the SCSI (parallel) HDD was just emerging, and interfaces such as ATA, PATA, SATA, Fibre Channel, iSCSI and SAS were at best a futuristic pipe dream, not to mention that a 1GByte HDD was still out over the future horizon. This was also the era just a few years before a mid-tier VAX/VMS 128MByte Solid State Device (SSD) using DRAM cost about $100,000 USD ($178,941.85 today if adjusted for inflation).
Let us get back to SLED and the emerging SCSI and smaller drives. Today’s 2.5” and 3.5” HDDs (and SSDs) are descendants of their predecessor 5.25” drives, which back in 1987 were just emerging. As mentioned, a common theme, similar to today, was I/O performance not keeping up with space capacity. Thus, the Berkeley white paper presented the initial five RAID levels, which over time have expanded and evolved, not to mention being enhanced by various vendor implementations.
RAID 0
RAID 0 stripes data across all drives for read and write performance while increasing space capacity vs. that of a single drive. There is no data protection with RAID 0, which also means there is no space capacity overhead; however, the loss of a single drive renders the entire RAID 0 set unusable. This is also sometimes referred to as JBOD (Just a Bunch of Disks) mode by some vendors, particularly if using only one drive per RAID group.
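To make the striping idea concrete, here is a minimal Python sketch of how a RAID 0 layout might map a logical block to a member drive and offset. The function name, parameters and stripe-unit size are illustrative assumptions, not taken from the Berkeley paper or any particular implementation.

```python
# Hypothetical RAID 0 address mapping: logical blocks are laid out in
# stripe units that rotate across the member drives (illustrative only).

def raid0_map(lba: int, num_drives: int, stripe_blocks: int):
    """Return (drive index, block offset on that drive) for a logical block."""
    stripe_num = lba // stripe_blocks    # which stripe unit the block falls in
    within = lba % stripe_blocks         # position inside that stripe unit
    drive = stripe_num % num_drives      # stripe units rotate across drives
    offset = (stripe_num // num_drives) * stripe_blocks + within
    return drive, offset

# Example: 4 drives with a 64-block stripe unit
print(raid0_map(lba=300, num_drives=4, stripe_blocks=64))  # -> (0, 108)
```

Because each member holds only part of the data and there is no copy or parity, the usable capacity is the sum of the drives, but any single failure takes the whole set with it.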
RAID 1
RAID 1 mirrors or replicates two or more drives for protection and possibly better read performance, depending on the implementation. Some implementations enable multiple concurrent reads to occur from different drives, as well as having three or more mirrors.
Write performance, assuming the same type of drives, should be about the same as writing to a single JBOD drive. However, implementations vary, and some systems with write-back cache or other optimizations may be even faster.
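As a rough sketch of that behavior, the following Python fragment models a mirror set where every write is duplicated to all copies and reads are spread round-robin across them. The class and method names are my own, purely for illustration, not from any particular RAID implementation.

```python
# Illustrative RAID 1 model: writes fan out to every mirror, reads can be
# served from any single copy (a simple round-robin policy is shown here).
from itertools import cycle

class Raid1Set:
    def __init__(self, num_mirrors: int):
        self.mirrors = [dict() for _ in range(num_mirrors)]  # each dict stands in for a drive
        self._next_read = cycle(range(num_mirrors))          # naive read load-balancing policy

    def write(self, lba: int, data: bytes) -> None:
        # Each write lands on every mirror, so write cost is roughly that of
        # a single drive (plus implementation overhead), as noted above.
        for m in self.mirrors:
            m[lba] = data

    def read(self, lba: int) -> bytes:
        # Concurrent reads can be distributed across mirrors for extra read throughput.
        return self.mirrors[next(self._next_read)][lba]

r1 = Raid1Set(num_mirrors=2)
r1.write(0, b"hello")
print(r1.read(0))  # b'hello', served from one of the two copies
```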
There is a catch: RAID 1 carries a space capacity overhead for protection, since only one drive’s worth of capacity is usable out of n, where n is the number of copies. The benefit is that if a drive fails or is removed, the remaining drive is intact, although it is effectively running in JBOD (unprotected) mode. One option is to set up a triple mirror so that if one drive fails, there are still two surviving drives; when a spare is added or a failed drive is replaced, a copy (rather than a rebuild) can occur.
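The capacity math behind that trade-off is simple. The small helper below (my own, assuming equal-size drives) shows that with n-way mirroring only one drive’s worth of space remains usable.

```python
# Illustrative RAID 1 capacity arithmetic, assuming n equal-size copies.
def mirror_capacity(drive_gb: float, n_copies: int):
    raw = drive_gb * n_copies                    # total raw space purchased
    usable = drive_gb                            # only one copy's worth is usable
    overhead_pct = 100.0 * (raw - usable) / raw  # space given up for protection
    return raw, usable, overhead_pct

print(mirror_capacity(600, 2))  # (1200, 600, 50.0)  two-way mirror: 50% overhead
print(mirror_capacity(600, 3))  # (1800, 600, ~66.7) triple mirror: about 2/3 overhead
```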
Yet another variation is to use a quad (four-drive) mirror, or a two-drive mirror in conjunction with remote mirroring, that is, replication to another storage system either locally or off-site. As a result, RAID 1 is a very popular option today for both HDDs and SSDs where a balance of performance and availability is needed.