By Brian Peterson
In his keynote address at Storage Networking World in October 2007, Andy Monshaw, IBM's general manager of systems storage, said, "Until recently, data was predictable [and] you could understand what your storage capacities were going to be in the next year or so." Then storage demands began to spiral upwards because of mushrooming data growth, increased compliance reporting, and complex data demands such as longitudinal data mining.
In the early 2000s, IT managers achieved cost and operational benefits through incremental improvements such as better maintenance contracts and capacity utilization. Since 2005, soaring data growth has made reducing costs more challenging, causing managers to adopt server and desktop virtualization.
The server-storage gap
Server virtualization allows one physical server to host many guest virtual machines (VMs), each with its own 20GB operating system requiring its own storage space, backup demands, and storage I/O. Virtual server sprawl ensues when more VMs serve up more applications called by more users for more I/O-intensive processes.
Improved automated provisioning systems make it easy and (seemingly) free to deploy new VMs in minutes. Many companies now have 10 times more operating system (OS) instances than before the virtual server revolution. Gartner Group's "Key Issues for Enterprise Storage, 2009" paper notes that, since "server virtualization mobility tools require shared storage to work, many enterprises are deploying SAN and NAS for the first time, … [so that] storage adapters are suddenly becoming bottlenecks to application performance, and traditional methods for backup are especially problematic." Other results include:
- Heavy consumption of storage network space
- Stress on legacy storage networks
- An avalanche of backup data
- Frequent demands to add storage
- Increasing complexity of storage architecture, storage management, and data replication
Server virtualization drives us to shift operating system data that was on low-cost, captive disk drives inside each server into a SAN (or NAS) environment, which is usually much more expensive per GB of storage. Storing VMs can be—and for many already is—an overwhelming cost. Since the life-cycle cost to maintain storage is roughly seven times its purchase price, it's essential to rein in storage bloat by adopting new techniques to optimize storage in a virtual server environment.
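The scale of this effect is easy to see with a back-of-the-envelope calculation. A minimal sketch in Python; only the seven-times multiplier comes from the text, while the purchase price is an assumed figure for illustration:

```python
# Life-cycle cost illustration: the article states that maintaining storage
# over its life costs roughly seven times its purchase price.
# The $100,000 purchase figure is an assumption for illustration only.
purchase_price = 100_000
lifecycle_multiplier = 7

maintenance_cost = lifecycle_multiplier * purchase_price
total_cost = purchase_price + maintenance_cost
print(f"${purchase_price:,} of new capacity implies about "
      f"${total_cost:,} of total life-cycle cost")
```

Every GB of OS-image bloat you avoid storing therefore saves several times its purchase price over the life of the array.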
Rebalancing the equation
Properly managed, SAN (or NAS) storage configurations in virtual server environments can lower the cost of storing data, improve performance, or simplify operations. As usual, you can optimize two of the three sides of the classic cheaper-better-faster triangle (see diagram), but not all three, and there is no single "one-size-fits-all" storage solution.
This article compares six techniques for VM storage management and examines which approach—or combination of approaches—works best for various data management challenges.
A virtualized server environment requires server, network, and storage teams to work closely together. These functions may tend to be somewhat siloed, but to build a balanced server-storage strategy, each group must understand the priorities, processes, and requirements of the others. Integrating the planning of storage with virtual server needs enables organizations to support growth, cost control, and performance requirements more effectively.
Each of the following storage options conserves storage, but none is perfect:
- Deduplicate VM OS images. Deduplicating nearly identical OS images vastly reduces space requirements.
- Implement tiered storage for VMs. Assign different data types to the proper storage tier according to the relative volatility, priority, or frequency of access.
- Consolidate the SAN. Streamline SAN storage by several methods described below.
- Implement NAS. Simplify administration by using NFS for VM storage and use deduplication.
- Provision from snapshots. By deploying VMs as space-optimized snapshots of a full-sized "gold copy" OS image, many VMs can be stored in the space of a few full OS images.
- Deduplicate backups. Backup deduplication provides special benefits in a virtual server environment.
Deduplicate OS images
In a virtualized environment, 80-90% of each OS image is identical. For example, the binaries behind every user's Windows splash screen are constant; only system-specific information such as host names and registry data varies. Moreover, OS images do not greatly affect disk I/O performance. Some vendors now offer deduplication for primary storage, so groups of OS images can be reduced by 70% or more with no significant performance impact. One major storage vendor even guarantees a minimum storage space reduction of 50% from OS deduplication.
Deduplicating VM storage is becoming a widespread practice. Example: A large insurance company with more than 400TB of primary storage had a rapidly growing farm of 200 virtual servers, each with a 20GB or larger OS image. This company expects to save at least $750,000 over the next three years by deduping the VM OS images, flattening its SAN, and deduplicating VM backups.
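The arithmetic behind savings like these is straightforward. A minimal sketch using the 200-VM, 20GB-image figures from the example and the 70% reduction cited above:

```python
# Estimate primary-storage savings from deduplicating VM OS images.
# Figures from the article: 200 VMs, 20 GB OS images, ~70% space reduction.
vm_count = 200
os_image_gb = 20
dedup_reduction = 0.70  # vendor-cited reduction for near-identical images

raw_gb = vm_count * os_image_gb              # 4,000 GB before deduplication
deduped_gb = raw_gb * (1 - dedup_reduction)  # ~1,200 GB afterwards
print(f"Raw: {raw_gb} GB, deduplicated: {deduped_gb:.0f} GB, "
      f"saved: {raw_gb - deduped_gb:.0f} GB")
```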
Use storage tiers
Server managers should tell storage teams not only how much new storage they need, but also what type of data will reside on it. Storage teams can then assign segregated storage pools so that more static, less accessed data (e.g., OS images) stays on slower Tier-3 storage (e.g., SATA disks), which is often 5-7 times less costly than Tier-1 storage. More frequently accessed data, such as applications and their binaries, may reside on Tier-2 storage. Fast, expensive Tier-1 storage should be reserved for the most volatile page files and databases.
Note that some virtualization components, such as VMware Site Recovery Manager (SRM), currently require that all data needed to recover a single VM reside in a single ESX storage pool. Array-based tiering tools (e.g., from vendors such as Compellent, EMC, and Hitachi Data Systems) can migrate data between tiers beneath a single presented volume, helping to work within this constraint.
Example: A financial services company with more than 300TB of storage and a sprawling VM environment puts operating systems on inexpensive Tier-3 storage, while allocating more costly Tier-2 and Tier-1 resources to heavily used, more volatile application and user data. More than 70% of its total storage moves to Tier-3, which costs 75% less than the Tier-1 storage previously used for all VM data.
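A rough model of the tiering economics in this example; the 70% Tier-3 share and the 75% cost difference come from the text, while the Tier-1 price per TB is an assumed figure, and for simplicity the remaining 30% is priced at Tier-1 rates:

```python
# Compare all-Tier-1 cost with a 70/30 Tier-3/Tier-1 split.
total_tb = 300
tier1_per_tb = 20_000               # assumed Tier-1 price, $/TB
tier3_per_tb = tier1_per_tb * 0.25  # Tier-3 "costs 75% less"

all_tier1_cost = total_tb * tier1_per_tb
tiered_cost = (0.30 * total_tb) * tier1_per_tb + (0.70 * total_tb) * tier3_per_tb
savings_pct = 100 * (1 - tiered_cost / all_tier1_cost)
print(f"All Tier-1: ${all_tier1_cost:,.0f}; tiered: ${tiered_cost:,.0f} "
      f"({savings_pct:.1f}% less)")
```

Even with assumed prices, moving most static VM data down a tier cuts the storage bill roughly in half.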
Consolidate storage networks
Large Fibre Channel SANs once supported hundreds or thousands of physical servers and grew to 12 or more switches per fabric. Now, VM servers consolidate guests at ratios of 10:1, 15:1 or even 20:1, requiring wider, faster data pipes and fewer ports to create flatter, faster SANs. Simpler two-director SAN architectures of the late 1990s can again be used, even in midsized enterprises. Performance is enhanced as a SAN can now support bandwidths up to 8Gbps.
Some virtualization management products (e.g., VMware ESX 3.x) do not support storage multi-path load balancing and can use only one I/O channel at a time. VMware ESX 4.x now allows third-party multi-pathing software (e.g., EMC's PowerPath/VE) to use multiple I/O paths simultaneously, which can reduce the number of ports required, or the bandwidth required per port, by 50% or more, potentially cutting costs in half. The high-performance ports may cost more, but fewer are required.
Another option, N_Port virtualization (NPIV), allows a single HBA to support multiple virtual host bus adapters (vHBAs). Each guest OS can have a unique ID in the SAN, enhancing reliability and security. Storage administrators can use existing tools to monitor SAN processes from virtual servers to the storage array, which helps with tasks such as troubleshooting spindle contention, I/O mapping, and capacity planning.
Implement NAS
Instead of using FC- or iSCSI-based SANs with VMware, virtualized storage can be presented via NFS on a NAS appliance. The NAS device presents a ready-to-use file system to a virtual server, eliminating the need for server-based file systems such as VMFS. Significant advantages of NAS over a SAN offset criticisms that it is less secure and slower. If properly deployed using isolated VLANs and multiple Gigabit Ethernet links or faster 10Gbps Ethernet connections, NAS can be at least as fast and secure as a more complicated and costly Fibre Channel SAN.
NAS advantages include:
- Deduplication of VM server OSes, included with some vendors' NAS appliances (e.g., EMC's Celerra and NetApp's FAS) significantly reduces required storage.
- Existing low-cost 1Gbps and 10Gbps Ethernet can be used instead of complex and more costly Fibre Channel SANs.
- NAS is easier to manage. For example, storage administrators can expand NAS storage presented to the virtual servers without involving the host system, whereas expansion of a SAN requires expanding a LUN or creating and presenting more LUNs to the server, then extending the file system, and other administrative host changes.
NAS deployments require several best practices and cautions:
- Do not route NAS over long distances, as latency may become significant.
- Use NFS v3 over TCP and jumbo frames for maximum performance.
- Consider using a dedicated VLAN for security and traffic isolation.
- VMware SRM provides limited support for NAS. Check your versions.
Example: A healthcare organization with more than 200TB of primary Fibre Channel storage deploys VMware on NAS to simplify administration and natively dedupe the operating systems. The organization significantly reduces SAN connectivity costs for virtual servers, simplifies ongoing administration, and reduces the storage capacity required to support VMs.
Provision from snapshots
New VMs can be rapidly and economically provisioned using array-based snapshot OS images by presenting a writable, space-optimized snapshot of a full-sized "gold OS image" to the VM server. Organizations can quickly respond to new VM needs while conserving large quantities of disk space. Updates to the snapshot-based images are written to a separate location, often called a "save-vol" or "snap reserve." Each OS image can be updated individually and retains its own discrete personality. Many condensed guest OS snapshots will fit in the space required for just a few full-volume copies.
Important best practices include:
- Store page files on Tier-1 or Tier-2 storage, and not on the snapshot-based c: drive, because page files change constantly, are frequently used, and consume a lot of variable space.
- Snapshots require a unique raw volume or a separate NFS volume for each VM guest.
- OS patches to the gold image are not automatically propagated to the copies. Possible solutions include:
- Apply patches separately to each snapshot-based OS instance. This is easy, but less space-efficient, since each patch enlarges the save-vol space required.
- Patch the gold image and redeploy all the OS instances with new snapshots. Then restore the system state information (registry, program files, host name, etc.) from backup. This is far more difficult than the first option, but guarantees standard and secure OSes and controls storage growth due to patches.
Snapshot deployment of OS images yields large financial benefits. For example, storing 100 full-sized copies of a typical 20GB OS image requires 2TB of storage. At $20/GB, the total acquisition cost would be around $40,000. Snapshot images can compress storage requirements by 20x, so the same 100 copies deployed via snapshots may consume only 100GB of disk and cost around $2,000 to acquire.
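The arithmetic in this example can be verified directly. A short sketch reproducing the figures above:

```python
# Full-copy vs. snapshot provisioning for 100 copies of a 20 GB OS image.
copies = 100
image_gb = 20
cost_per_gb = 20     # $/GB, from the example
snapshot_ratio = 20  # snapshots compress requirements ~20x

full_gb = copies * image_gb          # 2,000 GB (2 TB) as full copies
full_cost = full_gb * cost_per_gb    # about $40,000
snap_gb = full_gb // snapshot_ratio  # about 100 GB as snapshots
snap_cost = snap_gb * cost_per_gb    # about $2,000
print(f"Full copies: {full_gb} GB (${full_cost:,}); "
      f"snapshots: {snap_gb} GB (${snap_cost:,})")
```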
Example: A large publishing company with 300TB of primary storage leverages the speed, flexibility, and space and cost economies of array snapshots to provision OS images for offshore content producers and developers. Because its workforce expands and shrinks with publishing volume, on-the-fly OS deployments enable them to provision VDI desktops quickly and flexibly.
Deduplicate backup storage
The highly duplicated data in VM images means deduplication can reduce backup storage by as much as 95%. Deduplicating backup data is a logical step that offers excellent ROI with few drawbacks. Two methods work well in VM environments:
- Host-based backup deduplication (e.g., from EMC Avamar or Symantec's PureDisk) replaces existing backup software. Deduping on the host consumes less network bandwidth.
- Target-based deduplication (e.g., from Diligent Technologies, EMC's Data Domain unit, NetApp, and Quantum) is easy to adopt and works well with existing backup software, but does not reduce network bandwidth loads.
Although standard tape backup is less expensive than disk-based or VTL backup, as the deduplication ratio approaches 100:1 the purchase cost of deduplicated disk approaches that of automated tape for equal capacity. In a hypothetical example of backing up 100TB of VM data with LTO-4 tape at $200/TB, the purchase cost is about $20,000. If we assume deduplicated disk costs about $25,000/TB before deduplication, at a 25:1 deduplication ratio the purchase cost is $100,000—far more than tape. At a 100:1 ratio, the disk purchase cost drops to $25,000—almost the same price as tape. This makes a strong business case for deduplicated disk storage.
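A sketch reproducing this comparison; note that the example's $100,000 and $25,000 totals imply a deduplicated-disk price of roughly $25,000 per physical TB:

```python
# Purchase-cost comparison: LTO-4 tape vs. deduplicated disk for 100 TB.
backup_tb = 100
tape_per_tb = 200           # $/TB for LTO-4 tape, from the example
dedup_disk_per_tb = 25_000  # $/TB of physical disk, implied by the totals

tape_cost = backup_tb * tape_per_tb  # $20,000 of tape
disk_costs = {}
for ratio in (25, 100):
    disk_tb = backup_tb / ratio  # physical TB needed after deduplication
    disk_costs[ratio] = disk_tb * dedup_disk_per_tb
    print(f"{ratio}:1 dedup -> {disk_tb:g} TB of disk, "
          f"${disk_costs[ratio]:,.0f} (tape: ${tape_cost:,})")
```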
There are other major benefits over tape:
- Enabling cost-effective replication of backups for disaster recovery
- Environmental savings in power, space and cooling
- Bandwidth savings for remote replication
- Management savings, compared to the problems of tape management
- Significant business risk reduction through SLA and recovery speeds superior to tape backup.
Backup deduplication is very popular. Example: A large regional banking organization has more than 200TB of primary disk storage and supports 300-400 VM servers. Using source-based deduplication of VM data for backups, the organization has almost eliminated legacy backup storage systems, software, and maintenance, while improving local and disaster recovery capabilities.
BRIAN PETERSON is a storage architect with Forsythe Technology, www.forsythe.com, specializing in storage strategy and cost optimization. Peterson has held positions on both the supplier and customer sides of IT.
Weighing goals and options
First, define what challenges you are trying to overcome through storage support for server virtualization. How far does your organization want to carry server virtualization? How well prepared is it for such an initiative, both in business terms and technical terms? Are your server, storage, and network teams committed to developing balanced solutions?
Next, determine the organization's needs, goals, and priorities. Priorities can be very different for a production environment, a test and development environment, or business-critical applications. Regulatory constraints and the competitive environment are also critical.
In a production environment, if performance is the top priority, consolidation of a high-powered SAN may be a good solution, but a robust 10Gbps NAS architecture with simpler management, or judicious use of tiered storage, can be excellent alternatives.
For operational ease and efficiency, consider techniques such as deduplication of primary storage, snapshot OS deployment, and backup deduplication. If no long-distance routing is involved, NAS can provide a powerful yet easily managed solution.
When cost reduction is the primary goal, deduplication, selective storage tiering, and snapshot OS deployments can be a good combination. NAS also offers a lower-cost alternative to a Fibre Channel SAN.
In test environments, snapshots offer easy set-up and reconfiguration, and deduplication of primary storage can reduce development cost.
For critical applications, where the priorities are reliability, robustness, and speed of disaster recovery, a SAN may be a good choice, while deduplication of OS and disk-based backup storage offer clear advantages.
The overall goal should be to meet surging data growth with effective storage management and conservation methods that are in balance with your VM server and VDI environments.