“Software-defined pretty-much-anything” is getting a lot of attention in the IT trade press. It seems new and different, and it promises to solve everyone’s problems around prioritizing applications, sharing storage across massive distributed environments, and optimizing computing resources around business data needs.
It’s not quite that easy; nothing is. Software-defined technologies are in fact available today in real products, but software-defined storage, or SDS, has lagged behind developments in the software-defined data center and networking domains. This is largely because storage processing has been tightly bound to the underlying physical system, with its controllers and media. Virtualization first helped to loosen that hold by presenting available storage logically to create virtual storage pools.
SDS goes a step further: it can abstract storage provisioning, storage management, and I/O optimization from the physical controllers into a virtualized console for unified administration. Not every SDS product does everything, but IT can choose the feature set that best benefits its storage infrastructure: unifying heterogeneous storage systems, enabling cost-effective scale-out, or matching application needs on the fly to their storage targets. Working products are available right now that accomplish one or more of these critical operations.
Software-Defined Storage: What Is It?
Our definition: Software-defined storage (SDS) decouples storage management software from the underlying storage hardware. The resulting control layer may be software or a virtual appliance. This layer sits on top of the physical storage stack, abstracts storage management from the physical storage layer, and converges it into a highly manageable interface. This enables IT to handle application provisioning, policies, data protection, storage pooling, and reporting centrally instead of through array-specific interfaces. Some SDS solutions also identify application I/O patterns and optimize I/O accordingly, directing it to the best-suited storage target. Some create a virtual pool of storage so that data is stored in a manner that is abstracted from the mechanics of the underlying storage.
SDS makes application provisioning simpler by taking each application’s capacity, availability, and performance needs into account. Storage control is usually highly automated: the more automated the layer is, the better it can dynamically serve applications and workloads, especially the ever-changing applications and workloads in the virtual infrastructure.
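The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the `StoragePool` and `AppRequirement` types, their fields, and the `select_pool` function are hypothetical names invented for this example, and the "tightest fit" tie-break is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoragePool:
    # Hypothetical description of a virtual storage pool.
    name: str
    free_gb: int          # remaining capacity
    iops: int             # sustained IOPS the pool can deliver
    availability: float   # e.g. 0.999 for "three nines"


@dataclass
class AppRequirement:
    # Hypothetical description of what an application needs.
    name: str
    capacity_gb: int
    min_iops: int
    min_availability: float


def select_pool(req: AppRequirement, pools: List[StoragePool]) -> Optional[StoragePool]:
    """Return the tightest-fitting pool that meets every need, or None."""
    candidates = [
        p for p in pools
        if p.free_gb >= req.capacity_gb
        and p.iops >= req.min_iops
        and p.availability >= req.min_availability
    ]
    # Assumed tie-break: prefer the pool with the least free capacity
    # so that larger pools stay available for larger requests.
    return min(candidates, key=lambda p: p.free_gb, default=None)


pools = [
    StoragePool("archive", free_gb=50_000, iops=2_000, availability=0.99),
    StoragePool("general", free_gb=10_000, iops=20_000, availability=0.999),
    StoragePool("flash", free_gb=2_000, iops=200_000, availability=0.9999),
]
oltp = AppRequirement("oltp-db", capacity_gb=500, min_iops=50_000, min_availability=0.999)
chosen = select_pool(oltp, pools)
print(chosen.name if chosen else "no pool satisfies the request")
```

With these made-up figures, only the flash pool meets the IOPS requirement, so the request lands there. A real SDS layer would re-evaluate such decisions continuously as workloads change.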
In the name of easy scalability, some SDS products take a commodity storage/building-block approach to scale-out. A word of caution here: no matter how virtualized or abstracted the storage system may be, at its base it remains a physical entity. Just because something is “commodity storage” does not make it an ideal choice in an SDS infrastructure. Performance does not magically increase along with capacity: storage systems are still subject to performance lags due to slow physical controllers and less-than-optimal fabric. How well the SDS layer works depends on how effectively the underlying physical storage works. Some priority applications still require high-performance, highly available storage systems at the hardware layer of an SDS infrastructure.
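A toy bottleneck model makes the point concrete. The figures and the `effective_iops` function below are illustrative assumptions, not measurements: the model simply assumes that delivered performance is capped by the slowest of the media, the controller, and the fabric.

```python
def effective_iops(nodes: int, iops_per_node: int,
                   controller_limit: int, fabric_limit: int) -> int:
    """Delivered IOPS under a simple assumed model: the slowest stage wins."""
    media_iops = nodes * iops_per_node
    return min(media_iops, controller_limit, fabric_limit)


# Hypothetical numbers: each commodity node adds capacity and 5,000 IOPS,
# but the controller tops out at 40,000 IOPS and the fabric at 60,000.
for nodes in (4, 8, 16):
    print(nodes, effective_iops(nodes, 5_000, 40_000, 60_000))
```

In this sketch, going from 8 to 16 nodes doubles capacity but delivers no additional IOPS, because the controller is already saturated at 8 nodes.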
Different Approaches
We identify three major SDS approaches in the market today: orchestration (control layer), encapsulation (virtual storage appliances), and server-side (virtual controllers).
· Orchestration: Control layer. This layer sits on top of physical storage systems from one or more vendors, abstracts storage management tools from the arrays, and delivers them as management tools running from a central console. IT schedules and launches storage management across heterogeneous arrays and can create virtual storage pools from attached systems. The physical data paths remain in place. Solutions are emerging from several different sources: storage virtualization vendors, large storage vendors, and startups. It is a promising approach for adding intelligence to a sprawling infrastructure, but in practice it takes a lot of API engineering to centrally manage heterogeneous storage systems. EMC’s ViPR is making a splash, and the category also includes central control products that have been around a long time, such as IBM SVC and software-only DataCore.
· Encapsulation: Virtual storage appliances (VSAs). VSAs typically sit on top of a physical storage infrastructure and enable various storage operations across the virtual environment. That definition is very wide, too wide for every VSA to belong automatically in an SDS framework. A VSA does fit our definition if it abstracts and virtualizes data management from the underlying storage and optimizes virtual applications by matching storage availability, performance, and capacity to the VMs. VSAs are here now and are a practical implementation of software-defined storage that can add agility to a data center infrastructure. They are probably the largest part of the SDS market in terms of vendor count. Examples include HP StoreVirtual VSA and VMware vSAN.
· Server-side: Virtual controllers. The two approaches above are common-sense evolutions of virtualized storage management and virtual appliances, and they fit naturally in an SDS definition. However, SDS is also spurring new approaches. One of these is a recently emerging architecture that distributes specific functions in a way that changes both software definition and the physical storage of data. This SDS class offers a software layer composed of virtualized storage controllers. The virtual controllers are deployed on each virtual or physical application server. A control layer gathers information from the controllers and assigns policies to each controller, allowing IT to assign different processing policies per application. Because the virtual controllers sit at the workload layer, the distributed controllers can optimize management for different needs. This class may run on commodity storage but is also well suited to high-performance computing environments.
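The server-side pattern, in which a control layer gathers information from per-server virtual controllers and pushes a policy back to each, can be sketched as follows. This is a rough illustration under stated assumptions, not a description of any shipping product: the `Policy`, `VirtualController`, and `ControlLayer` classes, the host and application names, and the "read-heavy workloads get twice the cache" rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Policy:
    # Hypothetical per-application processing policy.
    cache_mb: int
    replicas: int


@dataclass
class VirtualController:
    # Hypothetical virtual controller running on one application server.
    host: str
    app: str
    observed_read_ratio: float   # share of I/O that is reads, 0.0 to 1.0
    policy: Policy = field(default_factory=lambda: Policy(cache_mb=256, replicas=2))

    def report(self) -> Dict[str, Any]:
        """Return a statistics snapshot for the control layer."""
        return {"host": self.host, "app": self.app,
                "read_ratio": self.observed_read_ratio}


class ControlLayer:
    """Gathers reports from controllers and assigns a policy to each one."""

    def __init__(self, app_policies: Dict[str, Policy], default: Policy) -> None:
        self.app_policies = app_policies
        self.default = default

    def reconcile(self, controllers: List[VirtualController]) -> None:
        for vc in controllers:
            stats = vc.report()
            base = self.app_policies.get(stats["app"], self.default)
            # Assumed rule: read-heavy workloads get twice the cache.
            cache = base.cache_mb * 2 if stats["read_ratio"] > 0.8 else base.cache_mb
            vc.policy = Policy(cache_mb=cache, replicas=base.replicas)


controllers = [
    VirtualController("host-01", "web", observed_read_ratio=0.9),
    VirtualController("host-02", "oltp-db", observed_read_ratio=0.5),
]
layer = ControlLayer({"oltp-db": Policy(cache_mb=1024, replicas=3)},
                     default=Policy(cache_mb=256, replicas=2))
layer.reconcile(controllers)
for vc in controllers:
    print(vc.host, vc.app, vc.policy)
```

The design point is that policy decisions live in one place while enforcement happens next to each workload, which is what lets different applications receive different treatment on the same underlying storage.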