Moving storage services into storage devices may not be what storage software vendors want, but it will offer many benefits to enterprises.
Falling processor and memory prices mean it’s economically feasible to beef up the computing power on storage hardware devices. That’s opening up some exciting possibilities for smart flash drives.
To understand why, you need to consider what exactly goes on in solid state drives. Unlike spinning hard drives, flash drives can’t overwrite an arbitrary area of the storage medium in place. In particular, they can’t write new data over space that already holds data – they have to write to a previously erased (or never used) area, and erasing happens a whole block at a time.
The upshot of this is that to look like a traditional disk to the user, to the operating system, to the file system and to any applications, the flash drive’s firmware has to do some pretty clever stuff. This includes virtualizing physical blocks and keeping track of the mapping between the physical locations and these virtualized blocks. This is done through a Flash Translation Layer (FTL).
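To make the mapping idea concrete, here is a minimal Python sketch of what an FTL does. It is a toy under stated assumptions, not how any real drive's firmware works: the class name `ToyFTL` and its methods are hypothetical, and it erases single pages where real flash erases whole blocks. What it shows is that an "overwrite" of a logical page lands on a fresh physical page, and the old copy is merely marked stale until it is reclaimed.

```python
class ToyFTL:
    """Toy flash translation layer (hypothetical, for illustration only)."""

    def __init__(self, num_physical_pages):
        self.flash = [None] * num_physical_pages      # None means erased
        self.mapping = {}                             # logical page -> physical page
        self.free = list(range(num_physical_pages))   # erased pages ready for writing
        self.stale = set()                            # pages holding superseded data

    def write(self, logical_page, data):
        """Write out of place: never overwrite a page that holds data."""
        if not self.free:
            raise RuntimeError("no erased pages left; garbage collection needed")
        physical = self.free.pop(0)
        self.flash[physical] = data
        old = self.mapping.get(logical_page)
        if old is not None:
            self.stale.add(old)                       # old copy is now garbage
        self.mapping[logical_page] = physical         # remap the logical page
        return physical

    def read(self, logical_page):
        """Follow the mapping to wherever the latest copy lives."""
        return self.flash[self.mapping[logical_page]]

    def garbage_collect(self):
        """Erase stale pages so they can be written again.

        Simplification: real flash erases whole blocks, not single pages.
        """
        for physical in sorted(self.stale):
            self.flash[physical] = None
            self.free.append(physical)
        self.stale.clear()


ftl = ToyFTL(num_physical_pages=4)
first = ftl.write(0, b"old")    # logical page 0 lands on one physical page
second = ftl.write(0, b"new")   # the "overwrite" lands on a different one
```

The host only ever sees logical page 0; the bookkeeping of where it physically lives is the file-system-like work the firmware performs.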
“Because the FTL is like a file system, and because flash drives are getting increasingly clever software and more computational power than in the past, many good things can happen!” says Christos Karamanolis, the chief architect and principal engineer for storage and availability at VMware.
What kind of good things? One example is related to the atomicity of data writes, which guarantees the integrity of data written to a storage medium. Today, protocols like SCSI guarantee that either all or none of a data write will actually happen. Without that guarantee, a disk could end up holding a mixture of old and new data, or data could be written without its metadata being updated. Either outcome leads to data corruption.
File systems take advantage of atomicity to avoid corruption, but extending it across several blocks (data plus metadata, say) requires a good deal of computation. That creates processor overhead, and it adds a high level of complexity to the storage software system.
“Many storage researchers are asking: What if the storage device could guarantee the atomicity of multiple physical blocks, so the file system could just say, Either update both the data and the metadata, or don’t do anything?” says Karamanolis.
The benefit of that, he believes, would be a significant reduction in software complexity. This would result in more reliable software, and cheaper software and services (as vendors wouldn’t have to invest in such long development cycles).
“At first these drives would be more expensive, but as they become commoditized and everyone writes to the common interfaces, they would become cheaper,” he adds.
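The multi-block atomic update Karamanolis describes can be sketched roughly as follows. This is an illustrative Python model, not any real drive interface: `AtomicBlockDevice`, `atomic_write` and the `fail_before_commit` flag (used here to simulate a power loss) are all assumptions made for the example. The point is that the updates are staged and then made visible at a single commit point, so a failure leaves either all of the old contents or all of the new ones.

```python
class AtomicBlockDevice:
    """Hypothetical device that commits several block updates as one unit."""

    def __init__(self):
        self.blocks = {}  # block number -> committed contents

    def atomic_write(self, updates, fail_before_commit=False):
        """Apply every update in `updates`, or none of them."""
        staged = dict(self.blocks)   # work on a private copy
        staged.update(updates)
        if fail_before_commit:
            # Simulated crash: the staged copy is simply discarded.
            raise IOError("simulated power loss before commit")
        self.blocks = staged         # single commit point


DATA_BLOCK, METADATA_BLOCK = 10, 11  # arbitrary block numbers for the example

dev = AtomicBlockDevice()
dev.atomic_write({DATA_BLOCK: b"v1", METADATA_BLOCK: b"len=2"})

try:
    dev.atomic_write({DATA_BLOCK: b"version2", METADATA_BLOCK: b"len=8"},
                     fail_before_commit=True)
except IOError:
    pass  # the device still holds the old data and the old metadata

after_crash = dict(dev.blocks)
dev.atomic_write({DATA_BLOCK: b"version2", METADATA_BLOCK: b"len=8"})
```

With a primitive like this in the device, a file system would not need its own machinery to keep data and metadata consistent with each other across a crash.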
In fact, this move towards smarter drives has already started, Karamanolis points out, citing the example of Seagate’s Kinetic disk drive, which offers an Ethernet interface and a key-value store. Other examples include solid state drives that offer built-in encryption and compression in the firmware of the devices themselves.
Another example, for non-volatile storage, is the development of the NVM Express (NVMe) interface for PCI Express (PCIe) storage devices. This brings many advantages over SAS and SATA, because it has been designed specifically for flash storage devices.
“This will introduce stronger semantics for the operating system and file system, like Atomic Test and Set, which is very helpful for scattered writes and gathered reads,” says Karamanolis. “This is convenient for building less complex, more efficient software.”
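As a rough illustration of the kind of test-and-set semantics mentioned here, the Python sketch below models a device that replaces a block only if it still holds the value the caller expects. It is not the NVMe command set: `CompareAndWriteDevice` and `compare_and_write` are hypothetical names, and a lock stands in for whatever the firmware would do to make the check and the write indivisible.

```python
import threading


class CompareAndWriteDevice:
    """Hypothetical device exposing an atomic compare-then-write on a block."""

    def __init__(self):
        self._blocks = {}               # block number -> contents
        self._lock = threading.Lock()   # stands in for firmware-level atomicity

    def compare_and_write(self, block, expected, new):
        """Write `new` only if the block currently holds `expected`."""
        with self._lock:
            if self._blocks.get(block) != expected:
                return False            # someone else changed it first
            self._blocks[block] = new
            return True

    def read(self, block):
        return self._blocks.get(block)


dev = CompareAndWriteDevice()
LOCK_BLOCK = 0  # arbitrary block used as an ownership record

host_a_won = dev.compare_and_write(LOCK_BLOCK, None, b"owner-A")
host_b_won = dev.compare_and_write(LOCK_BLOCK, None, b"owner-B")
```

Two hosts racing to claim the same block cannot both succeed, which is the property that lets software above the drive coordinate without an extra locking layer.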
These are non-trivial features, which will require some serious software to run in each storage device. That means, inevitably, that the CPU and memory – and associated buses – will have to be upgraded in future solid state storage devices to support this.
“The drives of today are already mini computers in their own right, and what we will continue to see is processing power moving close to where the data is actually stored,” says Karamanolis.
In fact, this idea is not altogether new: engineers at Carnegie Mellon University looked at the idea of intelligent storage devices that implemented software features in hardware back in the nineties, but the scale and cost of hardware were such that it simply wasn’t feasible.
Looking a little way into the future, there is no technical reason why solid state drives won’t be powerful enough to offer a range of storage services like snapshotting, cloning, and deduplication within their own firmware.
And if storage devices are really doing all the work, then storage subsystems may not be needed at all. Servers would simply talk to direct-attached storage or networked drives, tell them what storage services to carry out, and leave them to it.
“I certainly think it’s possible,” says Mark Peters, senior analyst at Enterprise Strategy Group. “After all, one of the main reasons we moved the storage stack ‘out’ was simply pragmatic – there used not to be sufficient engine power on mainframes and then servers to do it more centrally. That has now changed and of course we want more storage functionality to be closer to the apps and processing.”