Part I: Storage virtualization in 'non-SAN' environments

Although virtualization is often associated with storage area networks, its benefits are not new and can be realized in "non-SAN" environments.


Storage virtualization is yet another example of the old becoming new. Although the concept is finding new life and importance with the emergence of storage area networks (SANs), storage virtualization technologies are widely used in both mainframe and open systems environments as a way of simplifying administration and providing flexibility in demanding storage environments. In all the excitement about SAN virtualization, it's easy to lose sight of the benefits of storage virtualization in the prevailing, direct-attached storage architecture.

Despite the hype and new product introductions, SANs have not yet been widely adopted in corporate IT. The Enterprise Storage Group estimates that in large U.S. companies, only about 5% of storage is in a Fibre Channel SAN, with the remaining 95% as direct-attached storage. If you're handling that 95% today, storage virtualization can offer immediate relief in terms of storage manageability and flexibility.

Defining storage virtualization

Despite the recent surge in interest, storage virtualization is not new, either in concept or in practice. Defined almost 20 years ago in a technical white paper created by an IBM mainframe users group, the concept of storage virtualization was put into practice in the MVS operating system so that strings of DASD could be addressed as simpler, logical entities.

Storage virtualization is no more than the process of taking multiple physical storage devices and combining them into logical (virtual) storage devices or units that are presented to the operating system, applications, and users. In a sense, storage virtualization builds a layer of abstraction above the physical storage (see diagram).

Virtualization combines multiple physical storage devices into logical (virtual) storage devices or units.
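To make the abstraction concrete, here is a minimal Python sketch of one possible mapping: a logical volume that simply concatenates several physical devices and translates each logical block number into a device and a block on that device. The class name, device names, and block counts are hypothetical and chosen only for illustration; real volume managers use richer layouts (striping, mirroring, extent tables) than this.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatVolume:
    """Hypothetical logical volume that concatenates physical devices."""

    def __init__(self, devices):
        # devices: list of (name, size_in_blocks) pairs, in concatenation order
        self.devices = list(devices)
        # Cumulative end offsets, e.g. sizes 100, 50, 200 -> [100, 150, 350]
        self.ends = list(accumulate(size for _, size in self.devices))

    @property
    def size(self):
        # The logical volume is as large as all of its devices combined.
        return self.ends[-1] if self.ends else 0

    def locate(self, logical_block):
        """Translate a logical block number to (device name, physical block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block out of range")
        # Find the first device whose end offset lies beyond the block.
        i = bisect_right(self.ends, logical_block)
        start = self.ends[i - 1] if i > 0 else 0
        return self.devices[i][0], logical_block - start


vol = ConcatVolume([("disk0", 100), ("disk1", 50), ("disk2", 200)])
print(vol.size)         # 350
print(vol.locate(0))    # ('disk0', 0)
print(vol.locate(120))  # ('disk1', 20)
print(vol.locate(349))  # ('disk2', 199)
```

The application sees one 350-block device and never needs to know which disk holds a given block; only the translation table does.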

The important part is what can happen in this abstraction process. Because data is not tied to specific hardware devices, virtualization provides a very flexible storage environment. It simplifies the management of storage and can potentially reduce costs through better hardware utilization and consolidation.

The virtual devices are not restricted by the capacity, speed, or reliability limitations of the physical devices. By applying intelligent storage software in the virtualization layer, virtualization offers a way to address the functional challenges of storage.

Users are not generally interested in the physical aspects of the storage serving their applications. They don't want to hear about seek times or rotational latency. They don't care how many disks are in a string or the mean time between failure for those disks.

What they do care about are issues of application response time and throughput, sufficient capacity for their data as it grows, and application downtime. In short, they care about the application aspects of their data, not the physical aspects of storage.

The virtualization layer offers a chance to combine physical devices into virtual entities that meet application requirements. For example, you can create a device that optimizes performance for a specific application, while shielding users and applications from the physical details of the implementation. As new hardware becomes available or application characteristics change, you can modify the physical layer without interrupting access to the logical device.

Two other terms that are often in the storage virtualization mix are "provisioning" and "consolidation."

  • Provisioning is the act of providing users or applications with the right amount and right type of storage, at the right time. Virtualization can make provisioning much simpler.
  • Consolidation is the act of combining storage resources into a virtual pool of storage accessible to many applications or, in clustered or SAN environments, to many servers. One of the benefits of storage virtualization, particularly in a SAN environment, is that it enables consolidation, which simplifies management and possibly reduces the total amount of storage managed.
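The two terms can be illustrated with a small Python sketch, assuming a single pool that tracks capacity in gigabytes. The `StoragePool` class, the disk sizes, and the consumer names are all hypothetical; the point is only that consolidation feeds one pool and provisioning draws from it.

```python
class StoragePool:
    """Hypothetical consolidated pool; capacities are in gigabytes."""

    def __init__(self):
        self.capacity = 0
        self.allocations = {}  # consumer name -> GB provisioned

    def add_device(self, size_gb):
        # Consolidation: every device contributes to one shared pool.
        self.capacity += size_gb

    @property
    def free(self):
        return self.capacity - sum(self.allocations.values())

    def provision(self, consumer, size_gb):
        # Provisioning: hand out capacity on request, if the pool has it.
        if size_gb > self.free:
            raise ValueError("pool exhausted; add devices first")
        self.allocations[consumer] = self.allocations.get(consumer, 0) + size_gb


pool = StoragePool()
for size in (36, 36, 73):  # three hypothetical disks
    pool.add_device(size)
pool.provision("orders_db", 80)    # larger than any single disk in the pool
pool.provision("web_content", 40)
print(pool.capacity, pool.free)    # 145 25
```

Note that the 80 GB request succeeds even though no individual disk is that large, which is the practical payoff of pooling.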

Managing critical data in an open systems environment is not a simple task, and the challenges are part of what is driving end users toward SAN architectures. This environment is characterized by

  • Numerous heterogeneous systems, which may include multiple flavors of Unix and NT servers;
  • Heterogeneous storage devices, ranging from JBOD arrays to controller-based RAID devices, some of which have proprietary applications;
  • Escalating demands for application availability; and
  • Ever-increasing demands for capacity.

Some organizations create Service Level Agreements (SLAs) with their constituents, promising specific levels of availability and performance. Administrators, often with their jobs on the line, have to operate conservatively. They would rather over-provision than under-provision storage for a critical application. This leads to an underutilization of storage. According to Forrester Research, many organizations use only about 50% of their disk space.

At the same time, they may buy proprietary storage systems that promise certain levels of availability or performance and then find themselves with additional management tasks for those proprietary storage systems. All of this can lead to a complex administrative environment.

In an open systems environment, logical volume managers virtualize storage by consolidating physical storage into logical volumes, which are available to applications or file systems. Logical volume managers are available on most Unix platforms (including Linux) and on Windows 2000.

Logical volume management software can make it much easier for system and storage administrators to address the key issues of open systems storage: performance, availability, and capacity. By assembling virtual storage volumes out of numerous physical devices, you can create storage configurations tuned for specific applications, without the limitations of specific hardware devices. Instead, you can make use of whatever storage is at hand, without locking into proprietary storage solutions.

Logical volume managers improve application availability by building redundancy into the logical volume itself. The possibilities go beyond simple mirroring and RAID. For example, the failure of a device in a redundant storage configuration can degrade performance and expose the data to risk from another failure. The logical volume manager can maintain a pool of spare disks that can be automatically swapped in (hot relocation) when a device fails in a logical volume. It can even handle moving things back automatically when the failed device is replaced or repaired.
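The bookkeeping behind hot relocation can be sketched in a few lines of Python. This is an illustrative model only, not any vendor's implementation: the `MirroredVolume` class and the disk names are hypothetical, and the data resynchronization a real volume manager performs is reduced to a comment.

```python
class MirroredVolume:
    """Hypothetical mirrored volume backed by a pool of spare disks."""

    def __init__(self, members, spares):
        self.members = list(members)  # disks currently holding a copy of the data
        self.spares = list(spares)    # idle disks held in reserve
        self.relocated = {}           # spare name -> failed disk it stands in for

    def fail(self, disk):
        """Swap a spare in for a failed member; return the spare used, if any."""
        idx = self.members.index(disk)
        if not self.spares:
            # No spare: the volume runs degraded until an administrator steps in.
            return None
        spare = self.spares.pop(0)
        self.members[idx] = spare     # a real volume manager would resync data here
        self.relocated[spare] = disk
        return spare

    def repair(self, disk):
        """Move data back once the failed disk is replaced or repaired."""
        for spare, original in list(self.relocated.items()):
            if original == disk:
                self.members[self.members.index(spare)] = disk
                self.spares.append(spare)  # the spare returns to the pool
                del self.relocated[spare]
                return True
        return False


vol = MirroredVolume(["diskA", "diskB"], spares=["spare1"])
print(vol.fail("diskB"))        # spare1
print(vol.members)              # ['diskA', 'spare1']
print(vol.repair("diskB"))      # True
print(vol.members, vol.spares)  # ['diskA', 'diskB'] ['spare1']
```

The volume keeps two live copies throughout, so the window of exposure to a second failure is limited to the time it takes to bring the spare up to date.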

Because the virtualization layer (the volume manager) maps from logical to physical on the fly, it can similarly reconfigure storage on the fly. This means that the administrator can swap out or add storage while applications remain available, removing a common source of "planned" administrative downtime. In a sense, a logical volume manager gives administrators the ability to make adjustments and enhancements to the underlying storage while the data remains available.
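As a rough illustration of why this works, consider a volume described by a table of extents. Retiring a device then amounts to copying its extents elsewhere and updating the table, while the logical extent numbers that applications use never change. The `ExtentMap` class below is a hypothetical Python sketch that assumes the replacement device starts out empty and omits the actual data copy.

```python
class ExtentMap:
    """Hypothetical table mapping logical extents to (device, physical extent)."""

    def __init__(self, extents):
        # extents[i] is the (device, physical_extent) backing logical extent i
        self.extents = list(extents)

    def lookup(self, logical_extent):
        return self.extents[logical_extent]

    def evacuate(self, old_device, new_device):
        """Remap every extent on old_device to new_device, one at a time."""
        next_free = 0  # assumes new_device is empty
        moved = 0
        for i, (dev, _) in enumerate(self.extents):
            if dev == old_device:
                # A real volume manager copies the data before switching the entry.
                self.extents[i] = (new_device, next_free)
                next_free += 1
                moved += 1
        return moved


vol = ExtentMap([("old", 0), ("other", 0), ("old", 1)])
print(vol.evacuate("old", "new"))  # 2
print(vol.extents)                 # [('new', 0), ('other', 0), ('new', 1)]
print(vol.lookup(2))               # ('new', 1)
```

Logical extent 2 is still logical extent 2 after the move; only its physical location differs, which is why the old device can be pulled without the application noticing.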

Unlike a physical device, a logical volume has a nearly limitless capacity: Administrators can add storage as needed without interrupting access to the data. When used with a database's auto-extend capabilities or a file system's automatic extension, a logical volume manager can significantly ease the problem of provisioning storage for growing applications.
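One simple way to picture automatic extension is as a policy check that grows the volume whenever usage crosses a threshold, for as long as spare capacity remains. The Python function below is a sketch under that assumption; the function name, the 80% threshold, and the 10 GB step are illustrative values, not recommendations or any product's defaults.

```python
def autoextend(volume_gb, used_gb, free_pool_gb, threshold=0.8, step_gb=10):
    """Return (new_volume_gb, new_free_pool_gb) after one policy check.

    The volume grows in step_gb increments while it is more than
    `threshold` full and the pool can still supply a full step.
    """
    while (volume_gb > 0
           and used_gb / volume_gb > threshold
           and free_pool_gb >= step_gb):
        volume_gb += step_gb
        free_pool_gb -= step_gb
    return volume_gb, free_pool_gb


print(autoextend(100, 70, 50))  # (100, 50): below the threshold, nothing to do
print(autoextend(100, 95, 50))  # (120, 30): grown twice to get back under 80%
print(autoextend(100, 95, 10))  # (110, 0): the pool ran out first
```

The third case is a reminder that the capacity is only "nearly" limitless: someone still has to add physical storage before the pool runs dry.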

Beyond the single system

Although we're discussing direct-attached storage, the potential benefits of storage virtualization can easily extend beyond any single system's needs. Because individual administrators typically manage multiple systems, administration itself can become a bottleneck in terms of storage scalability.

The cost of managing storage is non-trivial and, over time, exceeds the purchase price of the storage hardware. One way to monitor this cost is to evaluate the amount of storage managed per administrator. Efficient organizations may manage a terabyte or more of storage per administrator, while others with more difficult environments may manage much less. Storage virtualization can simplify the management of storage in heterogeneous environments, ultimately reducing the true cost of the storage.

Storage virtualization through logical volume management is simply software. Although it needs to cooperate closely with the host operating system, there is no reason why it should be tied to any specific operating system platform or storage device. A logical volume manager should have cross-platform capabilities and be hardware-agnostic, so organizations can support heterogeneous environments and mix and match storage as needed.

A cross-platform storage virtualization solution offers distinct advantages for administrators or organizations handling multiple operating systems, including

  • A single interface for managing storage across platforms and devices. This reduces the need for specialized training in tools specific to an operating system or to storage hardware, and makes it easier for administrators to support multiple, diverse platforms;
  • Centralized management of distributed storage resources; and
  • Online administration. Because the data is mapped to physical devices dynamically, administrators can add, reconfigure, or even reduce storage while the data itself remains available.

Armed with these tools, organizations can create consistent business methodologies for managing application storage, regardless of platform. For example, companies using virtualization for database storage can define corporate-level policies for the logical volumes and file systems that contain database data and components, mandating certain levels of redundancy and performance.

At the same time, the separation of the logical and physical provides flexibility to change or reconfigure storage hardware transparently, perhaps swapping in higher-performing devices or moving storage to the systems that most need it, without interrupting access to the data.

Storage virtualization, even in the direct-attached, open systems environment, helps IT organizations achieve the SLAs they commit to, while simplifying system management and supporting growing and heterogeneous systems.

This kind of virtualization can be extended to shared-storage environments, such as

  • Availability clusters that share a connection to the same logical volumes, but "roll over" ownership when a primary server fails;
  • Shared data clusters in which multiple servers share access to data residing on a single virtual volume. Web servers accessing a common copy of a site would be one good example of this kind of cluster. Clusters can share data either through simple switched SCSI devices (for smaller clusters) or through a SAN;
  • NAS devices hosting file systems that may be shared among multiple servers; and
  • SANs.

The next article in this series expands the discussion of virtualization to consider non-disk storage as well as some possibilities for SANs.

John Maxwell is vice president of product marketing at Veritas Software (www.veritas.com) in Mountain View, CA.

This is the first article in a three-part series on storage virtualization. This article defines the concept of virtualization and then describes how it is commonly used in open systems, non-SAN environments today. The second article will look beyond these basics, extending the virtualization discussion to include non-disk storage and issues of shared storage using network-attached storage (NAS) or storage area networks. The final article will describe the differences between in-band and out-of-band virtualization.

This article was originally published on September 01, 2001