A partial list of potential benefits includes improved service levels, a reduction in management complexity, and improved utilization of storage assets.
By David Hill
Virtualization is a fashionable word in the IT lexicon these days, but does it have staying power? What is virtualization and why/where/when does it matter?
Before we delve into those issues, it’s important to note that virtualization is not reserved for storage; in the IT world any resources can be virtualized, including servers, networks, operating systems, and applications. This article focuses only on storage virtualization.
Block-based storage virtualization is the technology that most people think of, but it is only one of many types of virtualization. The Storage Networking Industry Association’s virtualization taxonomy breaks down virtualization as follows:
- What is virtualized? Block; disk; tape, tape drive, and tape library; file, file system, and record; and other devices
- Where is it done? Host server, network, or storage device
- How is it implemented? In-band or out-of-band
SNIA also has a shared storage model with a variety of levels, including storage devices (level 1), block aggregation (level 2), file/record layer (level 3), and applications (level 4). The focus of this article is on two types of storage virtualization: block- and file-based virtualization.
(Virtual tape libraries, or VTLs, are another important form of storage virtualization, but they are not covered in this article. To learn more about VTLs and virtualization, see “Guidelines for evaluating virtual tape libraries,” InfoStor, May 2006, p. 37.)
Block-based storage virtualization is not new. Xiotech, for example, has used block-based virtualization in its midrange disk arrays from the time it opened its doors in 1998. Three software-focused vendors-DataCore, FalconStor, and StoreAge-were also among the pioneers of block-based storage virtualization. Today, all four of these vendors still leverage block-based storage virtualization as the core of their solutions, but, for the most part, they do not emphasize the technology. Rather, they focus on providing the data management services (such as copy, mirroring, and migration services) that are by-products of virtualization, as well as some of the benefits, such as ease of use and management.
Among the major storage vendors, IBM has the most mature and widely adopted block-based virtualization product in its SAN Volume Controller (SVC). The SVC has been in the market for three years, and IBM claims more than 2,000 customers. The SVC is a storage network virtualization appliance, in contrast to the controller-based virtualization approach of Hitachi Data Systems or the switch-based network virtualization approach of EMC.
Hitachi Data Systems is next in line with its TagmaStore Universal Storage Platform (USP). Hitachi claims 3,000 units in the field that are capable of virtualization. The operative word is capable: about 60% of Hitachi’s customers use the USP as a stand-alone array due to its considerable capacity, which means that approximately 40% use tier 2 and tier 3 storage external to the USP (in addition to internal storage). Hitachi’s newer Network Storage Controller (model NSC55) is a modular, rack-mountable subset of the USP.
EMC was the first major vendor to deliver a pure switch-based virtualization solution, Invista, which became generally available earlier this year. Although Invista will require some time to gain momentum, EMC has a strong relationship with Cisco to use Invista on Cisco’s MDS9000 switches via Storage Services Modules, and EMC works closely with Cisco’s Virtual SAN technology.
In addition, EMC recently acquired Kashya, which is likely to be another example of EMC’s successful program of R&D through acquisition. As a result of the Kashya acquisition, Invista will eventually be able to provide data management services, such as network-based snapshot functionality and remote replication.
Key storage networking vendors such as Brocade, Cisco, and McDATA support virtualization in a variety of ways. Cisco, for example, works closely with both EMC and IBM: EMC’s Invista and IBM’s SVC can be coupled with Cisco’s MDS9000 switches through service modules.
File virtualization makes files location-independent, which addresses a common NAS scalability problem: individual filers run out of capacity. File virtualization can be implemented in a variety of ways, but a global namespace is one commonly used approach.
Global namespace technology pools storage across file systems to create a single logical entity. Files can then be physically migrated between servers without disruption to users. Users still see files as if they were in the same location, but physically the files may have been moved. As a result, unused storage can be reclaimed, new capacity can be added non-disruptively, and users don’t have to worry about disruption to their business activities.
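The mechanism described above can be sketched as a lookup table that maps a stable logical path to a changeable physical location. This is a minimal illustration only; `GlobalNamespace` and its methods are hypothetical names, not any vendor’s API.

```python
# Minimal sketch of a global namespace: clients address files by a
# stable logical path, while the namespace maps that path to whichever
# filer currently holds the data. All names here are hypothetical.

class GlobalNamespace:
    def __init__(self):
        # logical path -> (filer, physical path)
        self._map = {}

    def publish(self, logical_path, filer, physical_path):
        self._map[logical_path] = (filer, physical_path)

    def resolve(self, logical_path):
        """Clients call this; they never see the physical location."""
        return self._map[logical_path]

    def migrate(self, logical_path, new_filer, new_physical_path):
        # Data is copied to the new filer out of band; once the copy
        # completes, only the mapping changes. Clients keep using the
        # same logical path throughout.
        self._map[logical_path] = (new_filer, new_physical_path)


ns = GlobalNamespace()
ns.publish("/corp/reports/q2.doc", "filer1", "/vol0/reports/q2.doc")

# filer1 is running out of capacity, so the file moves to filer2...
ns.migrate("/corp/reports/q2.doc", "filer2", "/vol3/reports/q2.doc")

# ...but users still resolve the same logical path, non-disruptively.
print(ns.resolve("/corp/reports/q2.doc"))  # ('filer2', '/vol3/reports/q2.doc')
```

Because clients only ever see the logical path, capacity can be added and files rebalanced behind the namespace without any change visible to users.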
However, a global namespace is not required for file virtualization. ONStor, for example, provides a NAS gateway approach that consolidates users onto a single managed pool of storage and also allows servers to be consolidated.
Who needs it?
So why and where do businesses need storage virtualization? In essence, IT organizations face increasing challenges:
- Service levels: More and more applications run 24×7, with zero tolerance for either unplanned or planned downtime;
- Management complexity: Not only are there too many servers, operating systems, storage systems, and switches, but too many management consoles are required to manage them. This is coupled with budget pressures and staffing issues; and
- Underutilized storage assets: SNIA reports that non-virtualized disk has only a 30% to 50% utilization rate, and tape only a 20% to 40% utilization rate. CIOs must dread the question from CFOs: “Now tell me again why you want more storage when …?” whenever they go to the CAPEX budget well for more storage dollars.
Matching up with those challenges, the key benefits of storage virtualization are:
- Improved service levels: A significant reduction in both planned and unplanned downtime;
- Reduced management complexity: Virtualization simplifies storage policies and procedures; provides an architecture that is more scalable, flexible, and secure; enables on-demand dynamic provisioning without disruption; and improves the delivery and quality of services such as replication and migration; and
- Improved utilization of storage assets: Higher utilization defers new capacity purchases, so scarce funds can be deployed to other IT areas.
However, despite all the activity in storage virtualization, the technology has not lived up to expectations.
The rate of adoption will vary by type of storage virtualization. File virtualization is still in the very early stages of adoption. Block-based virtualization is further along the adoption curve, but users should not have unrealistic expectations.
A big assumption has been that block-based virtualization could be used across heterogeneous arrays and thereby, in effect, commoditize the arrays. But a continuing problem is that if Vendor A virtualizes Vendor B’s storage along with its own, the customer can no longer use data services, such as remote mirroring, that were tied to Vendor B’s arrays. Vendor A’s argument that its own data services could be used instead may be unconvincing to IT executives who have already invested time, money, and training in Vendor B’s products.
A second problem is that virtualization means spreading data physically across storage volumes. If volumes for a mission-critical system happened to cross vendor boundaries, that would be a no-no. Why? Because if the data were spread across volumes of both Vendor A and Vendor B and if Vendor B’s RAID-5 array experienced two disk failures before the first disk could be rebuilt, then all the data would be lost. Although the data might eventually be recovered, say from archived tape, the finger pointing would not be fun. Owners of mission-critical applications and processes tend to prefer “one-neck-to-choke” scenarios.
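The failure scenario above can be illustrated with a toy model, assuming a virtual volume whose blocks are striped round-robin across two backend arrays. The class names and the two-array striping scheme are assumptions for illustration, not any vendor’s actual implementation.

```python
# Toy model of a virtual volume whose blocks are striped across two
# backend arrays from different vendors (all names hypothetical).

class BackendArray:
    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, lba, data):
        self.blocks[lba] = data

    def read(self, lba):
        if self.failed:
            raise IOError(f"{self.name}: array failure")
        return self.blocks[lba]


class StripedVirtualVolume:
    """Round-robin striping: even blocks on one array, odd on the other."""

    def __init__(self, array_a, array_b):
        self.arrays = (array_a, array_b)

    def write(self, lba, data):
        self.arrays[lba % 2].write(lba, data)

    def read(self, lba):
        return self.arrays[lba % 2].read(lba)


vendor_a = BackendArray("VendorA")
vendor_b = BackendArray("VendorB")
vol = StripedVirtualVolume(vendor_a, vendor_b)
for lba in range(8):
    vol.write(lba, f"block-{lba}")

# Vendor B's RAID-5 array loses a second disk before the first rebuild
# completes: every block it held is now unreadable.
vendor_b.failed = True

readable = 0
for lba in range(8):
    try:
        vol.read(lba)
        readable += 1
    except IOError:
        pass
# Only the blocks held by VendorA survive, but no file that was striped
# across both arrays can be reconstructed from half its blocks.
```

Because every striped file has blocks on both arrays, a failure in either backend effectively loses the whole volume, which is why owners of mission-critical applications keep such volumes within a single vendor’s storage.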
Block-based storage virtualization vendors have, wisely in our view, reset their sights on providing data services, such as migration, with virtualization in the background. The question is how compelling these services are, especially in light of the fact that there may be alternative approaches to accomplishing the same goal.
Will storage virtualization ever be so deeply embedded in the storage infrastructure that it is ubiquitous and no longer talked about? Obviously, the adoption of virtualization will take time, and enterprise users understand that virtualization can be a one-way trip. Once a particular form of virtualization has been adopted, it’s difficult to turn back.
Virtualization sounds like-and is-a big step for any organization, but how many drivers using automatic transmissions really want to go back to using a manual clutch? From our perspective, the potential technical and business benefits of virtualization will eventually lead to widespread adoption, but it will come via a process of erosion, not by a big bang.
David Hill is a principal with the Mesabi Group LLC (www.mesabigroup.com), a consulting firm specializing in storage networking and management. A version of this article originally appeared in the Pund-IT Review newsletter (www.pund-it.com).