One more modification was made to the software running on the host: write units were required to be the same size as SSD erase blocks. Because every write fills a whole erase block, no block ever holds a mix of valid and stale data, which essentially eliminates the need for wear levelling and overprovisioning altogether.
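The idea can be sketched as a host-side writer that only ever flushes data in whole erase-block-sized units. This is an illustrative sketch, not Baidu's actual code; the 8 MiB erase block size and the `write_block` device interface are assumptions.

```python
# Hypothetical sketch: accumulate writes until a full erase block's worth of
# data is buffered, then flush it as one unit. The SSD therefore never holds
# a partially valid erase block that would later need garbage collection.
ERASE_BLOCK_SIZE = 8 * 1024 * 1024  # bytes; an assumed value, varies by flash part


class EraseBlockWriter:
    def __init__(self, device):
        self.device = device      # assumed to expose write_block(data)
        self.buffer = bytearray()

    def append(self, data: bytes):
        self.buffer.extend(data)
        # Flush only in whole erase-block units, never partial blocks.
        while len(self.buffer) >= ERASE_BLOCK_SIZE:
            unit = bytes(self.buffer[:ERASE_BLOCK_SIZE])
            self.device.write_block(unit)
            del self.buffer[:ERASE_BLOCK_SIZE]
```

Any tail of data smaller than an erase block simply stays buffered until enough further writes arrive to fill the block.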
The result was that the host software, given direct access to the raw flash channels of the SSD, could organize its data and schedule its data access to better realize the SSD’s raw performance potential. The metrics are impressive: I/O bandwidth increased by 300%, and, with overprovisioning removed, the storage cost per gigabyte was cut by 50%. The company believes that the system now delivers 95% of the raw flash bandwidth, and that 99% of the raw flash capacity is now usable.
The big question then is when software-defined flash technology will filter down to enterprises that would like to benefit from lower costs and higher performance, but that are unlikely to be able to modify their own applications, operating systems and SSD firmware in the way that Baidu has done.
The good news is that the International Committee for Information Technology Standards’ (INCITS) T10 (SAS and SCSI) and T13 (ATA) technical committees and the NVM Express industry association are all moving towards a standards initiative for storage intelligence that will allow otherwise standard SSDs to offload control of functions to the host.
There’s also an open source specification under development called LightNVM that allows a host to manage data placement, garbage collection and parallelism on “open-channel SSDs.” These are devices that share responsibilities with the host in order to implement and maintain features that typical SSDs keep strictly in firmware. According to LightNVM, a number of open-channel SSDs are under development, as is a storage platform that supports LightNVM from a company currently in stealth mode called CNEX Labs.
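The core shift with open-channel SSDs is that data placement decisions move from drive firmware to the host, which can then exploit the drive's internal parallelism directly. The sketch below shows one simple host-side placement policy; the channel count, the round-robin strategy, and the class interface are all illustrative assumptions, not part of the LightNVM specification.

```python
# Hypothetical sketch of host-managed data placement on an open-channel SSD:
# the host, not the drive firmware, decides which flash channel and physical
# page each logical page lands on. Striping writes round-robin across
# channels lets independent channels work in parallel.
NUM_CHANNELS = 16  # assumed geometry for illustration


class HostPlacement:
    def __init__(self):
        self.next_channel = 0
        self.mapping = {}                    # logical page -> (channel, physical page)
        self.next_page = [0] * NUM_CHANNELS  # next free page per channel

    def place(self, logical_page: int):
        ch = self.next_channel
        self.next_channel = (self.next_channel + 1) % NUM_CHANNELS
        phys = self.next_page[ch]
        self.next_page[ch] += 1
        self.mapping[logical_page] = (ch, phys)
        return ch, phys
```

In a conventional SSD this mapping lives inside the drive's flash translation layer; on an open-channel device the host keeps it, so it can co-locate related data, align writes with application behavior, and schedule garbage collection itself.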
It’s impossible to predict when practical, usable standards will emerge, but in the meantime established vendors aren’t sitting around doing nothing. For example, OCZ’s Saber 1000 SSD storage system, announced in October 2015, controls, manages and coordinates garbage collection, wear levelling, log dumps and other background tasks on connected SSDs. And it can choose to send data to drives only when they are not occupied in housekeeping activities that negatively impact performance.
The reverse is also true: when the system knows that it is not about to send data to the drives, it can tell them to use the opportunity afforded by this lull in write activity to carry out any housekeeping tasks that need doing.
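The two-way coordination described above can be sketched as a simple dispatcher: writes go only to drives that are not doing background work, and during write lulls every drive is told to use the gap for housekeeping. The `Drive` and `Coordinator` classes here are illustrative assumptions; OCZ's actual Host Managed SSD interface is not described in the article.

```python
# Hypothetical sketch of host-coordinated housekeeping: the storage system,
# not each individual SSD, decides when background tasks run.
class Drive:
    def __init__(self, name):
        self.name = name
        self.housekeeping = False  # True while the drive runs background tasks
        self.written = []

    def write(self, data):
        assert not self.housekeeping, "never write during housekeeping"
        self.written.append(data)


class Coordinator:
    def __init__(self, drives):
        self.drives = drives

    def dispatch(self, data):
        # Send data only to a drive not occupied with background work.
        for d in self.drives:
            if not d.housekeeping:
                d.write(data)
                return d
        return None  # all drives busy; the caller would queue the write

    def lull(self):
        # No writes pending: let every drive use the gap for housekeeping.
        for d in self.drives:
            d.housekeeping = True
```

The point of the sketch is the division of responsibility: the housekeeping work itself still happens inside each SSD, but the host chooses the moments when it is allowed to run.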
The system uses special SSDs that include what OCZ calls Host Managed SSD (HMS) technology.
“We are essentially removing all the performance hits (associated with carrying out background tasks while receiving data) from the entire pool’s operation, thereby increasing performance and performance consistency and reducing the latency of the entire aggregated pool,” explains Grant Van Patten, OCZ’s product manager.
This approach clearly differs from Baidu’s software-defined flash setup: Baidu’s system does away with the need for garbage collection and wear levelling by altering its software to make it SSD-aware. With OCZ’s approach, these housekeeping activities still have to happen, but the storage system manages when SSDs carry them out in a performance-efficient fashion, rather than allowing each SSD to think for itself without regard for the performance implications of doing so.
The OCZ approach has the benefit that no modifications are required to the operating system or applications running on it.
The Baidu approach may well suit companies (like Baidu) which run a small number of applications at vast scale, and which have the internal resources to modify their applications and operating systems to make use of suitably programmed SSDs.
But for enterprise and even SMB usage, it’s more likely that something like OCZ’s approach will prevail. Storage software will control and optimize the behavior of SSDs that comply with some software-defined flash standard to increase their performance, while applications and operating systems will continue to believe that the storage they are addressing is made up of conventional spinning disks.