There are at least five methods of achieving storage virtualization.
Now that storage area network (SAN) "plumbing" has matured, with a wide array of Fibre Channel products, it's time to turn attention to storage virtualization.
While SAN connections widen the pipes and stretch the distance between disks and hosts, plumbing alone does little to reconcile the conflicts among servers competing for disk space. You can look at storage virtualization products as capacity brokers in this chaotic environment. In their simplest form, they collect all or portions of the SAN's physical disks into a pool, and hand out logical slices to application servers without having to re-cable or re-zone the SAN.
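The capacity-broker idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `StoragePool` class, its method names, and the disk and server names are hypothetical and not any vendor's API, and capacities are in whole gigabytes. A logical slice is recorded as a list of extents, so a single slice can span more than one physical disk.

```python
class StoragePool:
    """Toy capacity broker: pools physical disks and hands out logical slices."""

    def __init__(self):
        self.free_gb = {}   # disk name -> unallocated capacity in GB
        self.slices = {}    # slice name -> list of (disk name, GB) extents

    def add_disk(self, name, capacity_gb):
        """Contribute all of a physical disk's capacity to the pool."""
        self.free_gb[name] = capacity_gb

    def total_free(self):
        return sum(self.free_gb.values())

    def allocate(self, slice_name, size_gb):
        """Carve a logical slice out of free capacity; it may span disks."""
        if slice_name in self.slices:
            raise ValueError("slice already exists: " + slice_name)
        if size_gb > self.total_free():
            raise ValueError("pool exhausted")
        extents = []
        remaining = size_gb
        for disk, free in self.free_gb.items():
            if remaining == 0:
                break
            take = min(free, remaining)
            if take > 0:
                extents.append((disk, take))
                self.free_gb[disk] = free - take
                remaining -= take
        self.slices[slice_name] = extents
        return extents


pool = StoragePool()
pool.add_disk("array_a", 100)
pool.add_disk("array_b", 50)
# A 120 GB slice is larger than either disk, so it spans both.
mail_extents = pool.allocate("mail_server", 120)
```

The host that receives `mail_server` sees one 120 GB volume. No cabling or zoning changed; only the pool's bookkeeping did.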
Virtualization provides many benefits, such as the ability to allocate storage resources on demand, integrate storage products from multiple vendors, configure selectively for high availability, and reduce the total cost of ownership.
At least five divergent approaches to sharing virtual disk capacity have emerged, and numerous derivative packages will follow soon. Ranging broadly in price, performance, and utility, these virtualization solutions can be categorized by the methods they use to map the host's logical view onto the physical disks. The methods differ in where the mapping takes place and in what platform is used to deliver the services. The five approaches are:
- Multi-host storage arrays
- Host-based LUN masking filters
- File system redirectors via outboard metadata controllers
- Specialized in-band virtualization engines
- Dedicated storage domain servers
Criteria to consider when assessing the various approaches include:
- The degree of independence from the host operating system and file system
- Supported mix of storage hardware
- Legacy storage asset investment protection
- Robustness of the security policy
- Effectiveness of the technology to minimize losses due to planned and unplanned downtime
- Coverage included under a centralized management view
- Ability to leverage commodity hardware and storage devices for improved performance at reasonable cost
Other factors include reliability, availability, and scalability.
Host independence. Some vendors place virtualization software on all hosts attached to the SAN. This requires those vendors to keep up with operating system revisions across heterogeneous platforms, and may cause headaches for system administrators when hosts are added or updated.
Figure 1: A multi-host array puts the pooling responsibility at the storage subsystem level, usually with RAID controller firmware.
Mixed storage support. Although choosing products from a single vendor can provide near-term comfort, in the long run it may compromise your ability to respond to change. For heterogeneous environments, a better approach is to choose a virtualization technique that works with multiple vendors' host platforms and storage subsystems.
Investment protection. How much of your current disk population is Fibre Channel ready? If your mix includes many SCSI, EIDE, or SSA drives, the SAN virtualization choices get slim. Routers or bridges can be used to connect "legacy" subsystems to a Fibre Channel SAN, but that approach adds costs. A better option is to look for pooling products that have built-in support for your existing interfaces.
Security. Security and host independence are somewhat intertwined. Depending on hosts to implement the security layer for shared-access control on a SAN misplaces the authority. A rogue host may be able to read and write to any disk in the pool and corrupt another host's data, even unintentionally. Steer towards outboard security implementations that centralize access control. This provides an additional benefit: With the growing importance of personal privacy in the e-commerce world, an outboard security implementation simplifies the auditing of data trails.
Figure 2: With LUN masking, specialized device drivers are installed on each host to prevent that host from accessing storage resources that it doesn't "own."
Resiliency to outages. Buying devices in pairs to protect against failure is not the best way to spend the IT budget. A more practical (and effective) approach is to amortize redundancy across many resources in an N+1 fashion. In other words, when you need five units, buy six, not ten, and you'll have a better combination of availability and cost savings.
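The arithmetic behind that advice is easy to check. The short Python sketch below compares the two purchasing schemes; the function names are made up for illustration, and the only figures used are the five-unit example from the paragraph above.

```python
def units_paired(needed):
    """Buy every device in pairs: one spare per working unit."""
    return 2 * needed


def units_n_plus_one(needed, spares=1):
    """Share a small number of spares across all working units."""
    return needed + spares


def overhead(purchased, needed):
    """Extra units bought, as a fraction of the units actually needed."""
    return (purchased - needed) / needed


needed = 5
paired = units_paired(needed)          # 10 units
n_plus_one = units_n_plus_one(needed)  # 6 units
print(paired, overhead(paired, needed))
print(n_plus_one, overhead(n_plus_one, needed))
```

With five units needed, pairing doubles the purchase (100 percent overhead), while N+1 adds one shared spare (20 percent overhead). N+1 covers any single failure. It does not cover two simultaneous failures, which is the trade-off being accepted.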
Centralization. Some vendors' centralized storage pools and storage management are limited to disks within one box, or to one vendor's line of products. To avoid vendor lock-in, you should look for centralized administration of heterogeneous subsystems.
Price-performance. In many cases, the virtualization engine should be outboard, which offloads hosts. Some vendors' products use proprietary hardware and software to provide virtualization and other services. This increases development and testing costs, which the end user ultimately must fund. An alternative is a virtualization technique that leverages existing technologies that are cost-efficient, familiar, easy to upgrade, and extensible. This includes processors, storage devices, and operating systems.
There are a variety of SAN virtualization alternatives, including multi-host arrays, LUN masking, file system redirectors, in-band virtualization engines, and storage domain servers.
Multi-host arrays
A multi-host array puts the pooling responsibility at the storage subsystem level, usually with RAID controller firmware (see Figure 1). This implementation offers good performance, high availability, and connectivity to heterogeneous hosts. One potential drawback to this approach is that the disk pool may be limited to the vendor's own disk arrays. Outgrowing them may require creating multiple pools, which sacrifices allocation freedom and centralization. Although some vendors offer centralized management for multiple arrays, they don't necessarily provide multi-vendor support.
LUN masking
One means of enabling storage pooling is to install specialized device drivers on each host to prevent that host from accessing storage resources that it doesn't "own"; in other words, the drivers hide every disk the host is not allowed to see (see Figure 2). These LUN masking drivers are typically configured using a central management application that can be either host-based or outboard. Although this method works well for small, controlled configurations, it introduces complexities and costs in larger SAN configurations.
LUN masking must span a potentially wide spectrum of server platforms. Also, because every host on the SAN has to have a LUN masking driver, there may be a performance penalty. Plus, change management across numerous hosts can be costly. And a "rogue" host without LUN masking software can defeat the security controls of the shared resources and corrupt other disks in the storage pool.
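The rogue-host weakness follows directly from where the filter lives. The Python sketch below is a simplified model, not a real driver: the host names, LUN numbers, and the `visible_luns` function are all hypothetical. The mask table stands in for the central management application, and the `has_mask_driver` flag stands in for whether the filter driver is installed on a given host.

```python
# Every LUN physically reachable over the SAN fabric (illustrative values).
ALL_LUNS = {0, 1, 2, 3}

# Central mask table: which LUNs each host "owns".
MASK_TABLE = {
    "web01": {0, 1},
    "db01": {2, 3},
}


def visible_luns(host, has_mask_driver=True):
    """Return the LUNs a host can see after host-side masking is applied."""
    if not has_mask_driver:
        # Nothing on this host filters its view, so it sees the whole pool.
        return set(ALL_LUNS)
    # The driver hides everything not assigned to this host.
    return ALL_LUNS & MASK_TABLE.get(host, set())


print(visible_luns("web01"))                         # only its own LUNs
print(visible_luns("rogue", has_mask_driver=False))  # every LUN in the pool
```

Because enforcement happens on the host, a host that skips the driver skips the policy. That is the argument for placing access control outboard.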
File system redirectors
A third type of pooling technique involves the use of file system redirector software.
Basically, file access control travels over the LAN, but disk data I/O moves over the high-speed SAN. Each host on the SAN requires software to facilitate the mapping of file names to block addresses, all brokered by an external metadata controller or file system manager. These products are often targeted at offloading disk I/O traffic from LANs, rather than general-purpose virtualized storage pooling, although they do include storage virtualization, or abstraction. Like LUN masking software, file system redirection is tied to specific operating environments, and software must be installed on every host. For the best of both worlds, consider overlaying file redirection software on a virtualized storage pooling service.
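The split between the control path and the data path can be illustrated with a small Python model. Everything here is an assumption made for the sketch: the class names, the four-byte block size, and the in-memory byte strings standing in for SAN disks. Real products define their own protocols for both paths.

```python
BLOCK_SIZE = 4  # bytes per block; unrealistically small, for readability


class MetadataController:
    """Outboard broker that maps file names to block addresses."""

    def __init__(self):
        self.file_map = {}  # file name -> list of (lun, block number)

    def register(self, name, blocks):
        self.file_map[name] = list(blocks)

    def lookup(self, name):
        # In a real deployment this request would travel over the LAN.
        return self.file_map[name]


class SanDisks:
    """Stand-in for block devices reachable over the SAN."""

    def __init__(self, luns):
        self.luns = luns  # lun id -> bytes

    def read_block(self, lun, block):
        start = block * BLOCK_SIZE
        return self.luns[lun][start:start + BLOCK_SIZE]


def read_file(name, controller, san):
    """Host-side redirector: ask where the file lives, then fetch blocks."""
    blocks = controller.lookup(name)  # control path
    # Data path: blocks are read directly from the disks, not via the broker.
    return b"".join(san.read_block(lun, blk) for lun, blk in blocks)


san = SanDisks({0: b"AAAABBBBCCCC", 1: b"DDDDEEEE"})
controller = MetadataController()
controller.register("report.txt", [(0, 1), (1, 0)])
data = read_file("report.txt", controller, san)  # b"BBBBDDDD"
```

Only the small lookup goes through the metadata controller. The bulk data never touches it, which is how this design takes disk I/O traffic off the LAN.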
In-band virtualization engines
In-band virtualization engines provide virtualized storage pooling by consolidating storage allocation and security functions on dedicated platforms that sit between the hosts and the physical storage (thus "in-band"). Typically, no additional software is required on the hosts, allowing the engines to support a diverse range of server platforms.
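Because the engine sits in the data path, it can translate every block address as the request passes through. The Python sketch below shows one plausible way to do that with an extent table. The `VirtualVolume` class and the disk names are hypothetical, and extent lengths are assumed to be positive.

```python
from bisect import bisect_right


class VirtualVolume:
    """Maps a host's virtual block addresses onto physical disk extents."""

    def __init__(self, extents):
        # extents: list of (disk, physical start block, length in blocks),
        # listed in the order they appear in the virtual address space.
        self.extents = list(extents)
        self.starts = []  # virtual block at which each extent begins
        offset = 0
        for _disk, _start, length in self.extents:
            self.starts.append(offset)
            offset += length
        self.size = offset  # total virtual blocks

    def translate(self, vblock):
        """Return (disk, physical block) for a virtual block address."""
        if not 0 <= vblock < self.size:
            raise IndexError("virtual block out of range")
        # Find the last extent whose starting virtual block is <= vblock.
        i = bisect_right(self.starts, vblock) - 1
        disk, start, _length = self.extents[i]
        return disk, start + (vblock - self.starts[i])


vol = VirtualVolume([("disk_a", 1000, 100), ("disk_b", 0, 50)])
first = vol.translate(0)     # ("disk_a", 1000)
last_a = vol.translate(99)   # ("disk_a", 1099)
first_b = vol.translate(100) # ("disk_b", 0)
```

The host addresses one contiguous 150-block volume and needs no extra software. The cost is that every I/O passes through this lookup, which is why in-band designs depend on the engine being fast and highly available.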
Figure 3: In a storage domain server, the virtualization function is implemented in software that runs as a network storage control layer on top of the platform's native operating system.
Virtualization engines can incorporate a wide range of features. At one end are entry-level products that address simple storage pooling needs, often requiring the purchase of external switches and storage devices. Other approaches embed switching support in the "appliance" bundle. Still others include disks, and appear very similar to multi-host arrays, but potentially at lower price points with greater configuration flexibility.
Beware that there is a war raging between the out-of-band (outside the data path) and the in-band virtualization camps. Some argue that in-band products slow data access, and that the failure of the virtualization platform could compromise availability. However, some vendors use caching and alternate paths to achieve performance and availability benefits.
Storage domain servers
A storage domain server is a commercial server platform dedicated to virtualization and allocation of disk storage to the hosts (see Figure 3). The virtualization function is implemented in software that runs as a network storage control layer on top of the platform's native operating system (usually Windows NT). This allows it to leverage many of the operating system's networking, volume management, device interoperability, and security features. Some storage domain servers distribute the processing load and management chores for a large storage pool, while maintaining centralized administration.
Storage domain servers can add value to the I/O stream by optionally performing host- and storage device-independent caching, load monitoring, and snapshot or remote mirroring services. The result can be a reduction in acquisition, administrative, and upgrade costs. This approach is similar to specialized virtualization engines; in fact, many appliances are storage domain servers with hardware and software add-ons.
Just as network domain servers delivered significant advancement for LANs, storage domain servers promise to deliver the advantages of disk virtualization for SANs.
The recent flood of storage virtualization products presents an abundance of choices, as well as confusion. Ultimately, the best solution will provide freedom of choice and high performance at a reasonable cost.
Augie Gonzalez is director of product marketing at DataCore Software Corp. (www.datacoresoftware.com), in Ft. Lauderdale, FL.