Lab Review: Building virtual machines on real SANs

Posted on May 04, 2007

Multiple Linux virtual machines provide an easy way to raise the utilization rate of 4Gbps Fibre Channel HBAs.

By Jack Fegreus

As IT resources continue to expand, the utilization rate of those resources has remained alarmingly low. Some assessments put IT resource utilization in a range of only 10% to 20%. As a result, IT is focusing attention on the issue of resource virtualization.

Through resource virtualization, IT can isolate and manage logical devices that are not constrained by physical configuration or location. Logical devices can be created through either a process of aggregation—the essence of a SAN—or deconstruction—as in server virtualization. Free from physical constraints, logical resources are more easily managed and can be leveraged to their full logical capabilities to provide greater benefits than the original physical configurations.

Those easy-to-manage logical devices also contribute to increasing the utilization of physical resources. SAN virtualization aggregates all storage resources into centralized pools that are accessible to all systems. And server virtualization enables a single server to appear as a group of independent systems that are logically isolated from one another.

That independence and isolation make it possible to run a workload that can meet well-defined service level agreements (SLAs) within a logical machine, without regard to other virtual machines running on the same system. The key to deriving maximum benefits from that environment is the availability of sufficient bandwidth to handle the processor and I/O workloads of multiple virtual machines.

The ability to run multiple virtual machines on a server provides the mechanism for significantly higher resource utilization. Through virtualization, resource utilization can rise to as high as 80%. As a result, the performance and efficiency of server resources are extremely important, especially for I/O devices. Without a scalable host bus adapter (HBA) architecture that addresses a spectrum of issues, including reliability, availability, scalability, performance, and backward compatibility, the full benefits of virtualization cannot be realized.

That makes the choice of HBA architecture critical to providing storage administrators with the freedom to align costs with performance based on business metrics. For that reason, openBench Labs will be examining multiple SAN fabrics for Fibre Channel and iSCSI in a variety of virtualization environments, including Xen and VMware.

For this lab review, we examined QLogic's SANblade QLE2462 HBA for PCIe in a Xen Hypervisor virtualization environment with SuSE Linux Enterprise Server (SLES) 10. We were primarily interested in the HBA's ability to scale along all I/O dimensions.

To test the HBA's ability to deliver maximum levels of full-duplex I/O across multiple Fibre Channel ports, we used Texas Memory Systems' RamSan-400 solid-state disk (SSD). With no mechanical parts, an SSD has no rotational or seek latency to slow throughput, so all I/O occurs at direct memory access (DMA) speed. No latency also means no differences will occur when accessing data randomly or sequentially, which maximizes I/O throughput and simplifies I/O modeling. Nonetheless, we had to target three logical drives across separate 4Gbps Fibre Channel ports on the RamSan-400 and run three simultaneous instances of our I/O benchmark to saturate the QLE2462 HBA, which was installed in an Appro XtremeServer.

We managed our SAN fabric from a 4Gbps QLogic Fibre Channel SAN switch. We monitored I/O throughput at the switch's ports connected to the HBA. From the port perspective, bytes transmitted correspond to bytes requested by a server read command, and bytes received correspond to bytes sent by a write command.

Paravirtualization

Earlier tests performed with Windows Server 2003 demonstrated that the QLE2462 is capable of simultaneously supporting read-and-write throughput at 400MBps. While few single workloads will generate that level of throughput, 10 simultaneous virtual workloads could begin pushing those limits. Xen's design goal calls for supporting upwards of 100 virtual machines on a single server, and to do that Xen implements a technology dubbed paravirtualization.

Through paravirtualization, Xen replaces hard-to-virtualize processor instructions with procedure calls to the hypervisor to provide equivalent functionality. As a result, Xen presents a virtual machine abstraction that is similar, but not identical, to the underlying hardware. Guest operating systems run a special Xen Linux microkernel with Xen-aware device drivers, which are all part of the SLES 10 package.

To provide hardware support for virtualization, both Intel and AMD have introduced processor lines that allow a hypervisor to host a guest operating system without requiring that the guest OS run a modified kernel. The Appro XtremeServer we used in testing was configured with two dual-core AMD Opteron 2220SE processors, which include AMD-V hardware virtualization support.
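
As a quick sanity check (a minimal sketch in Python, assuming a Linux host; it was not part of our benchmark runs), hardware virtualization support can be confirmed from /proc/cpuinfo, where AMD-V appears as the svm CPU flag and Intel VT as vmx:

  # Report whether the host CPU advertises hardware virtualization support.
  with open("/proc/cpuinfo") as f:
      flags = {w for line in f if line.startswith("flags") for w in line.split()}
  print("AMD-V" if "svm" in flags else "Intel VT" if "vmx" in flags else "none")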

We used the openBench Labs oblDisk benchmark to generate I/O streams, which were equally divided between reads and writes, to set the baseline performance for the QLE2462 on SLES 10. Each stream was directed at a RamSan virtual disk, which was independently connected to a 4Gbps port on the Fibre Channel switch. It took three simultaneous oblDisk processes to saturate the QLE2462 HBA. Throughput for both reads and writes converged at 397MBps, which pegged total I/O at 794MBps.
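
oblDisk is openBench Labs' in-house tool, so the listing below is only a rough Python sketch of the kind of alternating read/write stream it issues against a raw SAN device; the device path, the 1GB test span, and the fixed request size are illustrative assumptions, not the actual benchmark.

  # Rough sketch of a 50/50 read/write stream against a raw SAN device.
  # NOT the actual oblDisk tool. The device path is hypothetical, and the
  # writes are destructive, so point it only at a scratch LUN.
  import mmap, os

  DEV = "/dev/sdb"              # hypothetical RamSan logical drive
  BLOCK = 128 * 1024            # 128KB requests, matching the Linux bundling size
  SPAN = 1 * 1024**3            # exercise the first 1GB of the device

  fd = os.open(DEV, os.O_RDWR | os.O_DIRECT)
  buf = mmap.mmap(-1, BLOCK)    # page-aligned buffer, as O_DIRECT requires
  for i in range(SPAN // BLOCK):
      offset = i * BLOCK
      if i % 2:                 # alternate writes and reads for a 50/50 mix
          os.pwritev(fd, [buf], offset)
      else:
          os.preadv(fd, [buf], offset)
  os.close(fd)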

While the oblDisk benchmark generates I/O over a range of block sizes, the number of I/Os per second (IOPs) is set by the way Linux and the HBA drivers bundle I/O requests. Unlike Windows, Linux bundles small block I/Os into large 128KB requests, and the Fibre Channel HBA drivers on Linux bundle those into 256KB requests. As a result, the number of IOPs will be approximately four times the throughput rate in MBps. With throughput converging on 794MBps for one port in our tests, the total number of IOPs converged on 3,176.
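
The arithmetic behind that figure is simple enough to restate directly; the snippet below is just a back-of-the-envelope check using the numbers quoted above.

  # IOPs implied by the measured throughput, given 256KB bundled requests.
  request_mb = 256 / 1024               # 256KB expressed in MB
  throughput_mbps = 794                 # aggregate read+write throughput, one port
  print(throughput_mbps / request_mb)   # -> 3176.0 IOPs, roughly 4x the MBps figure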

Total throughput scaled linearly when we utilized the second port on the QLE2462 HBA. For this test, we ran six copies of our oblDisk benchmark program, with the I/O spread across both HBA ports and the logical disks on the RamSan-400. In these tests, both read-and-write I/O throughput climbed to about 790MBps, which confirms that the two ports operate independently of one another.

In our final series of tests, we reconfigured the memory in the RamSan-400 as a single logical drive. Partitions on that drive were assigned to Xen virtual machines for system and user storage. Essentially, these logical SSD partitions were virtualized as very high-speed ATA drives.
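
For reference, handing such a partition to a Xen guest is a one-line entry in the domain's xm configuration file. The fragment below is only illustrative; the domain name, memory size, device paths, and guest device names are hypothetical rather than our exact settings.

  # Illustrative Xen domU configuration fragment (xm syntax): the guest sees
  # two SAN partitions as ATA-style block devices hda and hdb.
  name   = "sles10-vm1"
  memory = 1024
  disk   = [ 'phy:/dev/sdb1,hda,w',    # system storage on a RamSan partition
             'phy:/dev/sdb2,hdb,w' ]   # user storage
  vif    = [ '' ]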

Under normal server loads, the two Xen virtual machines put negligible stress on the HBA. Nonetheless, any virtualization plan will proliferate virtual servers with virtual I/O connections. That, in turn, drives the need for HBAs that can deliver outstanding full-duplex read-and-write performance. In this regard, the QLE2462 is able to scale to theoretical wire speed as the number of I/Os is increased.

More importantly, this throughput scaling takes place on just one port of the HBA, which matters because system administrators often configure HBA ports in a conservative active-passive fail-over scheme. This level of throughput scaling is also particularly applicable to the growing use of virtualization to consolidate server resources. An emerging trend is to run upwards of 10 virtual machines on a multiprocessor server, which virtualizes I/O for each virtual machine through the host's HBA.

Jack Fegreus is CTO of openBench Labs (www.openbench.com). He can be reached at jack.fegreus@openbench.com.

(1a) RamSan-400 SSD

(1b) SAN switch

openBench Labs monitored real-time I/O traffic statistics at both the (1a) RamSan-400 SSD and the (1b) SAN switch as simultaneous read-and-write I/O throughput scaled to wire speed for a single port. From the perspective of the SAN switch, which is completely isolated from data-caching issues, both read-and-write I/O throughput was at full wire speed.

Running six oblDisk processes simultaneously, both read-and-write throughput scaled to about 775MBps.

openBench Labs configured two virtual machines, each running SLES 10 with a modified Xen microkernel. In normal operations, the total I/O traffic to the RamSan-400 SSD generated by the two virtual machines placed a minimal I/O load on the shared QLE2462 HBA.

openBench Labs Scenario

UNDER EXAMINATION

4Gbps Fibre Channel SAN

WHAT WE TESTED

QLogic SANblade QLE2462 HBA

  • PCI Express interface
  • Dual 4Gbps Fibre Channel ports with auto-negotiation
  • SCSI initiator, target, and initiator/target modes
  • HBA- and target-level fail-over
  • Persistent binding
  • LUN masking

HOW WE TESTED

Texas Memory Systems' RamSan-400 SSD

  • Supports up to 400,000 IOPs
  • Supports up to 3GBps throughput
  • Four dual-port 4Gbps Fibre Channel controllers

SuSE Linux Enterprise Server (SLES) 10

  • Support for Xen virtualization
  • QLogic 4Gbps Fibre Channel driver support for SLES 10 and Xen virtual machines

Appro XtremeServer

  • Two dual-core AMD Opteron 2220SE CPUs
  • 8GB DDR2 ECC RAM
  • PCIe: one x16 slot
  • PCI-X: one 133MHz slot

QLogic SANbox 5602 Fibre Channel switch

Benchmarks

oblDisk

KEY FINDINGS

  • The QLE2462 HBA scaled to wire speed limits with increased I/O operations using one or both Fibre Channel ports.
  • Drivers for the QLE2462 are available for I/O virtualization with the Xen Hypervisor.
  • The QLE2462 supported system and user I/O for multiple virtual machines under Xen, with SAN partitions presented to the guests as virtual ATA drives.

