A number of approaches exist for virtualizing heterogeneous storage environments, including SAN appliances, "SAN-in-a-box," and distributed enterprise solutions.
BY RICHARD R. LEE AND HARRIETT L. BENNETT
Last year, a number of vendors, including start-ups and well-established companies, announced various types of storage virtualization schemes that have been subsequently embodied in an appliance, SAN-in-a-box, or distributed enterprise solution. Each of the vendors claims to have conquered virtualization, the critical enabler for delivering on the promises of storage area networks (SANs). The nine offerings evaluated in this article present virtualization solutions that are available today, with hefty enhancements planned for the near future.
Storage virtualization, or disk pooling, is garnering significant interest in the IT community. Many believe it is a new technological concept facilitated by SANs, while others know it is an effective tool from the mainframe world that has now found its way into open systems. The concept of pooling storage has been supported in mainframe environments for more than 10 years. Much of this was based on IBM's concept of "System Managed Storage," embodied in MVS in the early 1990s.
This article attempts to shed some objective light on storage virtualization by profiling a cross-section of vendors that support disk pooling with products based on hardware, software, or a combination of the two.
Defining the benefits
Storage virtualization is such a hot topic these days because it is the critical enabling technology that allows SANs to deliver on the promises of decreased total cost of ownership (TCO), increased scalability and availability, and reductions in complexity, downtime, and storage acquisition costs. Simply stated, virtualization is the "magic bullet" that every CIO, IT manager, and network administrator has been looking for to solve his organization's storage problems on a local and enterprise-wide basis.
How we get there
The first step toward storage virtualization is to combine physical storage assets into virtual pools that can be transparently shared among various hosts. These hosts will no longer interact with captive physical and logical storage devices but, rather, will see storage as virtual disk pools, each with the potential of delivering varying Qualities of Service (QoS) to meet the demands of users and applications on the SAN.
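To make the pooling idea concrete, here is a minimal sketch in Python. It is not any vendor's implementation: the class names, the "fast" and "bulk" service classes, and the first-fit allocation are all assumptions chosen for illustration. It shows physical disks from different arrays being aggregated into one pool, from which a host requests a virtual disk by size and QoS class without knowing which devices back it.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDisk:
    name: str
    capacity_gb: int
    qos: str  # hypothetical service class, e.g. "fast" or "bulk"
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


@dataclass
class VirtualDisk:
    name: str
    size_gb: int
    qos: str
    # Each extent is (physical disk name, gigabytes taken from it).
    extents: list = field(default_factory=list)


class StoragePool:
    """Aggregates physical disks and hands out virtual disks by QoS class."""

    def __init__(self, disks):
        self.disks = list(disks)

    def free_gb(self, qos: str) -> int:
        return sum(d.free_gb for d in self.disks if d.qos == qos)

    def create_virtual_disk(self, name: str, size_gb: int, qos: str) -> VirtualDisk:
        if size_gb > self.free_gb(qos):
            raise ValueError(f"not enough {qos!r} capacity for {name}")
        vdisk = VirtualDisk(name, size_gb, qos)
        remaining = size_gb
        # First-fit: take space from matching disks in order until satisfied.
        for disk in self.disks:
            if disk.qos != qos or disk.free_gb == 0:
                continue
            take = min(disk.free_gb, remaining)
            disk.used_gb += take
            vdisk.extents.append((disk.name, take))
            remaining -= take
            if remaining == 0:
                break
        return vdisk


pool = StoragePool([
    PhysicalDisk("array-a/disk0", 100, "fast"),
    PhysicalDisk("array-b/disk3", 200, "fast"),
    PhysicalDisk("jbod-c/disk1", 500, "bulk"),
])
db_volume = pool.create_virtual_disk("db01", 150, "fast")
print(db_volume.extents)
```

The host that mounts db01 sees a single 150GB disk; the fact that it spans two arrays from different vendors is hidden inside the pool.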
Once this basic disk pool is enabled, another layer of virtualization will evolve relating to how data is presented to applications in a "virtualized" way. This next phase is paramount if the industry is to realize the grand scheme of a storage utility, where information is virtualized at the file level in the form of a universal file system or SAN operating system and presented as a "utility" to every host across the enterprise.
There are a number of architectural constructs being evangelized by the vendor community. The following section briefly discusses the technical approaches and the advantages/disadvantages that each technique represents.
A SAN appliance is designed to connect to the SAN fabric and aggregate storage devices such that all of the hosts on the SAN see storage as virtual pools. An appliance can come in several flavors. One type of implementation is a software-oriented solution that runs on an industry-standard platform and serves as a storage domain server with data intercepted between storage devices and hosts. A second type is an appliance that plugs into the fabric and has separate paths for data and control.
The appliance functionality can be implemented on a router, a standard server, or a dedicated storage management controller. In some cases, an appliance can also be used as an engine in a SAN-in-a-box. These solutions vary in heterogeneity depending on the storage devices, hosts, and network protocols they support. Many are able to reuse legacy storage assets. Varying levels of fail-over and clustering can also be attributes of a SAN appliance.
A SAN-in-a-box is an integrated solution of storage management services, storage subsystems, and switching. Usually tightly coupled to exploit disk and switching performance, these virtualization solutions are designed to plug into a storage network as a self-contained SAN. Scalability comes in terms of being able to expand ports and storage within the box, as well as multi-node configurations for storage expansion, load balancing, and redundancy.
A SAN-in-a-box can be relatively simple to manage because the disk subsystems are generally of a single type, thus relieving the administration of multiple types of devices. One potential tradeoff versus a more software-oriented approach is that you get the most out of the SAN-in-a-box approach by using specific hardware, which may limit your choices. To bridge the gap, vendors in this space provide software that aggregates SAN-ready storage from multiple vendors. Like in-band SAN appliances, a SAN-in-a-box uses caching and queuing to increase data access speed as it services requests.
The distributed enterprise solution seeks to virtualize storage at both the block and file levels. The intent is to build on a block-level volume management solution, in forms like those above, by closely layering a SAN file system on top of it. Not only will files become more universally accessible, but the data path will eventually become consolidated by eliminating the data flow through multiple servers (e.g., application, database, and Web). The solutions available today involve server software for volume and file management as well as agent software for each host. By necessity, distributed enterprise solutions must maintain a high degree of openness to cope with all the resources of the SAN. Likewise, SAN entities that will interface with the file system and other enterprise-wide standards need to include the appropriate application programming interfaces (APIs).
There is a common divider among the various virtualization schemes that has been the subject of much discussion. It is fashionable to segregate solutions according to whether they are in-band or out-of-band (or symmetric versus asymmetric). The real debate is over the merits of in-band versus out-of-band virtualization at the block level, given that a distributed enterprise file system requires metadata to be processed at the host level and that metadata flows across a separate path. It is conceivable that the best of both worlds will be possible, depending on each storage environment's unique requirements and how they relate to the core business of the company.
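The difference between the two camps is easiest to see in where the mapping lookup happens. The following Python sketch is an illustration under stated assumptions, not a description of any product: the class names are hypothetical, and real systems deal with caching coherence, fail-over, and security that are ignored here. In the in-band (symmetric) case every read and write passes through the appliance; in the out-of-band (asymmetric) case the host consults the mapping once over a control path and then moves data directly to the storage device.

```python
class BlockDevice:
    """Stand-in for a physical storage device on the SAN."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write(self, pblock, data):
        self.blocks[pblock] = data

    def read(self, pblock):
        return self.blocks.get(pblock)


class MappingTable:
    """Maps (virtual volume, virtual block) to (device, physical block)."""

    def __init__(self):
        self._map = {}

    def add(self, volume, vblock, device, pblock):
        self._map[(volume, vblock)] = (device, pblock)

    def resolve(self, volume, vblock):
        return self._map[(volume, vblock)]


class InBandAppliance:
    """Symmetric: control and data both flow through the appliance."""

    def __init__(self, table):
        self.table = table
        self.data_ios = 0  # every host I/O is counted here

    def write(self, volume, vblock, data):
        device, pblock = self.table.resolve(volume, vblock)
        self.data_ios += 1
        device.write(pblock, data)

    def read(self, volume, vblock):
        device, pblock = self.table.resolve(volume, vblock)
        self.data_ios += 1
        return device.read(pblock)


class OutOfBandHost:
    """Asymmetric: the host agent caches mappings and does I/O directly."""

    def __init__(self, table):
        self.table = table  # reached over a separate control path
        self.cache = {}
        self.control_lookups = 0

    def _locate(self, volume, vblock):
        key = (volume, vblock)
        if key not in self.cache:
            self.cache[key] = self.table.resolve(volume, vblock)
            self.control_lookups += 1
        return self.cache[key]

    def write(self, volume, vblock, data):
        device, pblock = self._locate(volume, vblock)
        device.write(pblock, data)

    def read(self, volume, vblock):
        device, pblock = self._locate(volume, vblock)
        return device.read(pblock)


array = BlockDevice("array-1")
table = MappingTable()
table.add("vol0", 0, array, 4096)

appliance = InBandAppliance(table)
appliance.write("vol0", 0, b"payload")

host = OutOfBandHost(table)
first = host.read("vol0", 0)
second = host.read("vol0", 0)
```

The trade-off shows up in the counters: the in-band appliance touches every I/O, which is what lets it cache and queue, while the out-of-band host performs one control lookup and then bypasses the appliance, at the cost of needing agent software on each host.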
Weighing the options
The nine vendors surveyed for this article share similar visions of virtualization becoming ubiquitous, and ultimately providing the ability to deliver storage as a utility. Many of the approaches differ substantially in architecture, all with relative advantages and drawbacks.
Compaq's VersaStor provides storage abstraction and pooling for any online storage system connected to a SAN, regardless of manufacturer. VersaStor is currently shipping as SAN management appliances for Windows NT/2000 platforms; versions with agents developed by partners are expected in the second half of this year. The VersaStor architecture deploys SAN-wide asymmetrical virtualization at the block level and uses host bus adapter (HBA)-based agents for caching and mapping tables. The attribute-driven appliance dynamically manages and uploads tables to the servers. Software zoning is accomplished with filter drivers between the host operating system and storage, with up to 1TB per virtual disk supported.
With this out-of-band approach and Compaq's Virtual Replicator technology, VersaStor is able to facilitate space-efficient snapshots and present them selectively to servers without making copies. Virtual Replicator also can replace tape and enable direct recovery from disk. Compaq is targeting VersaStor at the enterprise level and has a comprehensive endorsement program underway, as well as an interoperability alliance with IBM. www.compaq.com/storage.
DataCore Software's virtualization offering, SANsymphony, is purely software. Partners, including Gadzoox, Raidion, and NaviStor, deliver "SANsymphony-powered" SAN appliances or SANs in boxes. SANsymphony is implemented with a superset of advanced controller functions that span dissimilar disk controllers and arrays on the SAN. With disk and array independence and support for multiple network interfaces, SANsymphony can enable a relatively easy migration from server-attached storage to SANs. Designed to run on a standard NT platform as a storage domain server, DataCore's in-band scheme can accelerate I/O streams through caching and heighten security in terms of hosts and HBAs being unaware of the virtualization technique.
Multiple SANsymphony nodes can run asynchronously and be scaled for capacity and throughput as well as N+1 redundancy. DataCore's road map continues the software development toward management by QoS requirements from mixed workloads and policy-based management of storage assets. www.datacoresoftware.com.
DataDirect SAN DataDirector
DataDirect Networks has been shipping appliances and a SAN-in-a-box for more than a year and is now focused on the high-growth rich media market. As the first SAN appliance built for rich media, SAN DataDirector addresses the special demands that streaming media puts on storage systems. Each DataDirector can support up to 80,000 simultaneous data streams and 72TB. DataDirector is available to end users and OEMs primarily for rich media content management and other "new-media" applications.
SAN DataDirector uses in-band processing for block-level virtualization. DataDirect's position is that network-attached storage (NAS) is complementary to its appliances, and the company may include out-of-band management in the future. Storage networks can scale through the addition of internal or external disk subsystems and Fibre Channel devices. Legacy network and storage assets can be reused, and SAN DataDirector supports a heterogeneous server environment. www.datadirectnetworks.com.
IBM/Tivoli Storage Tank
IBM/Tivoli's foray into distributed enterprise storage virtualization is a combination of file and block virtualization techniques. Storage Tank enables file-level and data-level locking, along with policy-based management. Block-level pieces are managed as storage groups, and Tivoli provides file-level virtualization to form the Tivoli Storage Tank.
Storage Tank presents block-level virtualization as storage groups, which administrators can classify according to performance, usage, or by business entity. Virtualization at the file level is where data is presented to applications. Tivoli's shared file system is decoupled from block management and can be administered according to policies that are aware of storage group attributes. Both levels of virtualization are considered out-of-band and require software agents for the SAN-attached hosts.
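A small Python sketch can illustrate how a file-level policy might be made aware of storage group attributes. This is not Storage Tank's policy language or API, neither of which is described here; the `StorageGroup` and `Policy` types, the attribute values, and the suffix-matching rule are hypothetical and chosen only to show the decoupling of file placement from block management.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageGroup:
    name: str
    performance: str  # hypothetical attribute, e.g. "high" or "standard"
    owner: str        # business entity the group is assigned to


@dataclass(frozen=True)
class Policy:
    suffix: str       # files whose names end with this suffix...
    performance: str  # ...are placed on a group with this attribute


def place_file(filename, policies, groups, default_performance="standard"):
    """Return the first storage group whose attributes satisfy the policy."""
    wanted = default_performance
    for policy in policies:
        if filename.endswith(policy.suffix):
            wanted = policy.performance
            break
    for group in groups:
        if group.performance == wanted:
            return group
    raise LookupError(f"no storage group offers {wanted!r} performance")


groups = [
    StorageGroup("gold", "high", "finance"),
    StorageGroup("silver", "standard", "engineering"),
]
policies = [Policy(".db", "high")]

db_group = place_file("orders.db", policies, groups)
doc_group = place_file("notes.txt", policies, groups)
print(db_group.name, doc_group.name)
```

The file system side only names the attribute it needs; which arrays make up the "gold" group can change underneath without the policy being rewritten.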
APIs that provide virtualization awareness are at the forefront of IBM's technology strategy. With investment protection as a high priority, interoperability is a big part of Storage Tank endeavors. IBM is close to finalizing Storage Tank's strategy and partners, with a road map announcement expected within the next couple of months. www.ibm.com/storage.
StorageApps provides both storage applications and appliances. At the core of the company's appliance is SAN.OS, an operating system that enables in-band virtualization with host and storage device independence. The SANSuite portfolio also includes security, data replication, and point-in-time image software. SANMaster is StorageApps' application for Web-based device and topology management. SAN.OS and the rest of the storage management software are available in OEM products such as Dell's PowerVault 530F and StorageApps' SANLink appliance.
SANLink is a SAN-in-a-box capable of bringing storage outside of the box into the virtualization scheme. Its tight integration allows for performance tuning and high availability configurations with n-way peers. The SANLink Plus appliance permits scaling of NAS across multiple vendors' storage and unifies all NAS/SAN management into one centralized management console. www.storageapps.com.
StoreAge SAN Volume Manager
StoreAge Networking Technologies' SAN Volume Manager (SVM) is a standalone SAN appliance that connects directly to a Fibre Channel fabric. SVM acts as a SAN "metaserver" that separates access control traffic on a LAN from storage data traffic on a SAN, allowing data to travel between servers and storage with minimal overhead. This asymmetric approach allows transfers at full fabric capability between Fibre Channel storage and servers with standard Fibre Channel HBAs. Volume drivers in servers with multiple HBAs provide fail-over and load sharing and present virtual volumes to the operating system as communicated to them by the SVM appliance. Fabric zoning control protects data from access via unauthorized paths.
SVM's volume masking allows storage to be shared among servers with different operating systems and file systems. Volumes can be moved from host to host without copying. Standard Fibre Channel RAID subsystems, tape drives, and libraries are supported by SVM, as are heterogeneous hosts. A multiplicity of RAID subsystems can be pooled and managed through a Web-based GUI, and large SAN configurations are possible through interconnection of SVMs. www.store-age.com.
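Volume masking in general can be pictured as a table of which hosts are allowed to see which virtual volumes. The Python sketch below is a generic illustration, not StoreAge's implementation; the `VolumeMask` class and the host and volume names are hypothetical. It also shows why moving a volume between hosts need not involve copying: only the visibility entry changes, while the data stays where it is.

```python
class VolumeMask:
    """Tracks which hosts may see which virtual volumes."""

    def __init__(self):
        self._allowed = {}  # volume name -> set of host names

    def grant(self, volume, host):
        self._allowed.setdefault(volume, set()).add(host)

    def revoke(self, volume, host):
        self._allowed.get(volume, set()).discard(host)

    def visible_to(self, host):
        return sorted(v for v, hosts in self._allowed.items() if host in hosts)

    def move(self, volume, src, dst):
        """Reassign a volume from one host to another without touching data."""
        if src not in self._allowed.get(volume, set()):
            raise PermissionError(f"{src} does not own {volume}")
        self.revoke(volume, src)
        self.grant(volume, dst)


mask = VolumeMask()
mask.grant("vol-mail", "nt-server-1")
mask.grant("vol-db", "solaris-1")

# Hand the mail volume to a different server; no blocks are copied.
mask.move("vol-mail", "nt-server-1", "nt-server-2")
print(mask.visible_to("nt-server-2"))
```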
Veritas' SANPoint Storage Appliance is an integrated set of storage management software products, including Veritas Volume Manager, File System, and Cluster Server. The company's initiative for distributed enterprise storage management also includes a file system in the form of Veritas File Server Edition. SANPoint Storage Appliance is a block-level server, with Volume Manager doing the mapping in cooperation with client agents. SAN and NAS servers can be cascaded with SANPoint Storage Appliance on one server and Veritas File Server Edition on a second server, and in a later release, can be combined in one server.
Future releases are also planned for remote site data replication, backup enhancements, closer integration of Veritas SANPoint Control discovery software, and extended offerings for high-availability configurations. Currently, SANPoint Storage Appliance is available only for SPARC Solaris systems, with Intel/NT platform support planned for the near future. www.veritas.com.
Vicom SAN Virtualization Engine
Vicom's SAN Virtualization Engine (SVE) supports storage pools that can scale up to 500TB for disk-level virtualization of Fibre Channel, SCSI, or SSA storage for UNIX and NT servers. With roots in the mainframe space, Vicom has extended its expertise to storage virtualization in open systems environments with its fabric-based SVE appliance. SVE is built on Vicom's SV Router, which resides between heterogeneous servers and storage devices and provides an abstraction layer in conjunction with micro drivers on the hosts. The SV Router provides data and command routing paths among servers, storage subsystems, and network devices.
The SVE also includes software with a Web-based interface for centralized monitoring, configuration management, copy services, and zone management. Network agnostic, the SVE is used in combination with switches and hubs to expand the capability of the fabric. Scalability is accomplished by adding routers to increase bandwidth and connectivity. Operation of the SV Router can be customized with UNIX scripts and through an open API. www.vicom.com.
XIOtech, a Seagate company, offers Magnitude, a SAN-in-a-box with integrated virtualization software, controller, and disk arrays. Magnitude includes all control and storage subsystems and plugs directly into a Fibre Channel fabric. The complexities of storage device management are transparent to administrators. As an advantage in reducing planned downtime, there are no physical views to manage except external switches. In this highly integrated in-band solution, performance can be closely controlled because the SAN management controller can take full advantage of all actuators and use the disk drives' intelligence to overcome SCSI firewall limitations. Being so closely in tune with performance can eliminate hot spots, especially in database and messaging applications, and also maximizes the use of all the available disk capacity.
Magnitude's suite of storage management software includes feature sets for disaster recovery, server-less and/or LAN-free backup, and clustered high-availability configurations. Future releases of the software will enable the aggregation of disks outside of the Magnitude array as well as APIs for high-performance storage applications. www.xiotech.com.
In reviewing all of the vendors' products that are available to date as well as those scheduled for future release, one common thread emerges: Storage virtualization is the facilitator for delivering on the "Promises of SANs."
Without virtualization, every SAN solution offered today is not much more than a difficult-to-manage, high-speed storage fabric, with limited capabilities in regard to meeting either localized or enterprise-wide computing requirements. This fact is not lost on the vendors, who are scrambling to announce products and position themselves to dominate the virtualization arena. The criticality of storage virtualization should not be lost on the end-user community either, because virtualization is the Holy Grail that brings drastically reduced storage costs and complexity.
Given the significance that storage virtualization plays in both the vendor and end-user communities, it is extremely important that some objectivity be brought to the process of reviewing and evaluating the various architectures used to deploy virtualization schemes. There are currently three basic constructs that we have used to categorize vendors' products:
- SAN appliance
- SAN-in-a-box
- Distributed enterprise solution
Each architecture has merits and limitations. Choosing one versus another requires a thorough understanding of the specific requirements that each SAN must meet. These requirements should be driven by the business processes being supported (e.g., application types, mission criticality, etc.), along with the technical environment being supported.
For example, if your requirements are for a disk pool to support a workgroup or localized application, choosing a SAN appliance or SAN-in-a-box is a very viable option. Either approach will meet these requirements and deliver on the SAN promises that this type of environment needs (e.g., storage on-the-fly, reduced administration, and off-network backup). The SAN-in-a-box approach may scale better beyond workgroups because it is based on built-in scalability that facilitates extensions beyond a single box.
However, if your requirements are enterprise-wide in scope (e.g., anything-to-anywhere connectivity, off-host security and access control, and business continuance), then the choices are vectored toward either an extensible SAN-in-a-box or a distributed enterprise solution.
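The two rules of thumb above can be written down as a simple lookup. The Python sketch below encodes only what this article recommends; the function name and the "workgroup" and "enterprise" scope labels are our own shorthand, and a real selection would weigh many more factors than scope alone.

```python
def candidate_architectures(scope):
    """Return the architectures suggested for a given scope of requirements."""
    if scope == "workgroup":
        # Disk pool for a workgroup or localized application.
        return ["SAN appliance", "SAN-in-a-box"]
    if scope == "enterprise":
        # Anything-to-anywhere connectivity, off-host security, continuance.
        return ["extensible SAN-in-a-box", "distributed enterprise solution"]
    raise ValueError(f"unknown scope: {scope!r}")


for scope in ("workgroup", "enterprise"):
    print(scope, "->", ", ".join(candidate_architectures(scope)))
```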
In essence, the physical scale of your SAN should have a major influence on which virtualization architecture you select, especially if you anticipate major expansions over time. Asking questions of this type will help to determine which solution is optimum for your environment, independent of the conflicting "mine-is-better-than-yours" rhetoric being fostered by the vendor community.
The role of storage virtualization cannot be overstated. It is the key missing element and critical enabler to the entire SAN paradigm. Its significance transcends the issues of which SAN infrastructure to use, device interoperability, or which type of GUI will dominate the management schemes utilized. Storage virtualization provides the mechanism to deliver on SAN promises and to support new levels of performance at lower costs.
Richard Lee is president and CEO, and Harriett Bennett is a senior analyst, at Data Storage Technologies Inc. (www.sanwarriors.com) in Ridgewood, NJ.