By Saqib Jang
Server virtualization technologies, such as VMware or Citrix's Xen, allow multiple applications to run on different "guest" operating systems and share a physical server. Thus, an application that runs on Linux and another that runs on Windows can share one physical server. Such sharing both enables server consolidation and improves utilization of server resources.
The benefits of virtualization are obvious, but when administrators increase the density of application/guest OS combinations, or virtual machines (VMs), to maximize utilization, the proposition becomes much more complex. The latest CPUs from AMD and Intel are more than up to the task of running 10 to 20 or more applications at a time. However, most servers run out of I/O bandwidth well before they run out of processing power, because only so many Ethernet NICs and Fibre Channel HBAs can be added to a physical server.
For VM infrastructure and operations managers who are pushing the VM density envelope, the latest generation of 10GbE NICs may be a better option. Most VMs individually do not consume the full bandwidth of a single GbE NIC, but VMware and Xen users are quickly seeing that the typical network configuration of an ESX server can run as high as six GbE NICs and two Fibre Channel ports.
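The arithmetic behind that comparison can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: it uses nominal line rates, ignores protocol overhead and any dedication of ports to particular traffic types, and the function names and the 20-VM figure (taken from the upper end of the density range mentioned earlier) are assumptions made for the example.

```python
# Nominal line rates in gigabits per second (protocol overhead ignored).
GBE_GBPS = 1.0
TEN_GBE_GBPS = 10.0


def aggregate_gbps(port_count, per_port_gbps):
    """Total nominal bandwidth of a group of identical ports."""
    return port_count * per_port_gbps


def per_vm_share_gbps(total_gbps, vm_count):
    """Even split of the total bandwidth across VMs (a simplification)."""
    return total_gbps / vm_count


# The six-NIC configuration described above versus a single 10GbE port.
six_gbe = aggregate_gbps(6, GBE_GBPS)
one_10gbe = aggregate_gbps(1, TEN_GBE_GBPS)

if __name__ == "__main__":
    vm_count = 20  # illustrative density, not a measured figure
    print("6 x GbE   :", six_gbe, "Gb/s total,",
          per_vm_share_gbps(six_gbe, vm_count), "Gb/s per VM")
    print("1 x 10GbE :", one_10gbe, "Gb/s total,",
          per_vm_share_gbps(one_10gbe, vm_count), "Gb/s per VM")
```

On paper, one 10GbE port offers more nominal bandwidth than six GbE ports while occupying a single slot, which is the consolidation argument in its simplest form.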
Beyond raw network throughput requirements, I/O bottlenecks in the VM Monitor (VMM), or hypervisor, are another obstacle to realizing the full potential of server virtualization. Current virtualization implementations place a very high software burden on the host processor while transporting packets between the VM, the hypervisor and the underlying hardware. The VM has an emulated NIC driver that connects to the hypervisor through a virtual switched network layer. The hypervisor may switch traffic from one VM to another in software, or it may pass the I/O packet through its own NIC driver to the underlying hardware interface and on to the external network.
Because there are so many software layers, effective I/O bandwidth through a standard virtualized environment is constrained, regardless of the raw bandwidth of the underlying hardware I/O channel. In addition, latency is increased, and there is no ability to control bandwidth allocation across VMs. As a result, many applications in the data center cannot be virtualized.
The latest generation of 10GbE NICs, when combined with new virtualization software features that allow multiple and flexible receive queues, such as NetQueue from VMware, significantly reduces the delays inherent in current virtualization implementations. Similar features have been present in Citrix XenServer. In addition, with its optimal memory management for network I/O, in which guest memory is pre-assigned to NIC receive queues, XenServer enables NICs to DMA directly into guest memory, eliminating any copies by the hypervisor. This frees up processor resources to support heavier-weight applications on the VMs or to run more VMs per server.
"XenServer has been optimized for 10GbE technology for some time, and offers three optimization paths for fast I/O that enable us to achieve about 95% of line rate simultaneously for both transmit and receive, and with the new distributed virtual switch in XenServer 5.5 we can provide rate guarantees and isolation—requirements for multi-tenancy—on a per-VM basis," says Simon Crosby, CTO at Citrix.
"With the emergence of bandwidth-intensive applications and the continued adoption of virtualization by IT managers, " Crosby continues, " the broad adoption of 10GbE technologies and SR-IOV will permit us to run the most demanding workloads and enable data center managers to reduce cabling, switching costs and power consumption, while leveraging a single fabric for both storage and networking."
In addition, 10GbE server networking can eliminate the queuing bottleneck in today's software-based approaches. The current approach creates a single first-in, first-out (FIFO) queue for incoming packets from the NIC through the hypervisor to the various VMs. Since neither the hypervisor nor the NIC knows which packet goes to which interface, the hypervisor must perform substantial packet processing to determine that; it is a processor-intensive task that consumes a great deal of time and CPU cycles.
In contrast, some of the latest 10GbE NICs allow multiple queues destined for different VMs, enabling the NIC to steer packets to the appropriate VM. Rapidly performing these tasks in the NIC hardware eliminates slow and costly software overhead for I/O traffic processing in the hypervisor. This allows more effective use of the host processor for processing applications, while effectively increasing available bandwidth.
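The difference between the two queuing models can be illustrated with a small Python sketch. It is a toy model, not any vendor's implementation: the class names, the use of the destination MAC address as the steering key and the dictionary packet format are all assumptions chosen for the example. In the first model every packet passes through one FIFO and the hypervisor classifies it in software; in the second the classification happens at receive time, standing in for NIC hardware, and each VM drains its own queue.

```python
from collections import deque


class SingleQueueHost:
    """One shared FIFO; the hypervisor classifies every packet in software."""

    def __init__(self, mac_to_vm):
        self.mac_to_vm = mac_to_vm      # hypothetical MAC -> VM name table
        self.fifo = deque()
        self.hypervisor_lookups = 0     # counts software classification work

    def nic_receive(self, packet):
        # The NIC does no sorting; it simply appends to the shared queue.
        self.fifo.append(packet)

    def hypervisor_drain(self):
        delivered = {vm: [] for vm in self.mac_to_vm.values()}
        while self.fifo:
            packet = self.fifo.popleft()
            self.hypervisor_lookups += 1          # per-packet software cost
            vm = self.mac_to_vm.get(packet["dst_mac"])
            if vm is not None:                    # unknown MACs are dropped
                delivered[vm].append(packet["payload"])
        return delivered


class MultiQueueNic:
    """Per-VM receive queues; steering is modeled as happening in the NIC."""

    def __init__(self, mac_to_vm):
        self.mac_to_vm = mac_to_vm
        self.queues = {vm: deque() for vm in mac_to_vm.values()}
        self.default_queue = deque()    # packets that match no VM

    def nic_receive(self, packet):
        # This lookup stands in for classification done by NIC hardware.
        vm = self.mac_to_vm.get(packet["dst_mac"])
        if vm is None:
            self.default_queue.append(packet)
        else:
            self.queues[vm].append(packet)

    def drain(self, vm):
        # Each VM empties only its own queue; no per-packet sorting is left.
        queue = self.queues[vm]
        payloads = [packet["payload"] for packet in queue]
        queue.clear()
        return payloads
```

Both models deliver the same payloads to the same VMs. What changes is where the sorting work is done: the `hypervisor_lookups` counter in the first model grows with every packet, while the second model leaves the hypervisor with no per-packet classification to perform.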
The PCI-SIG I/O Virtualization (IOV) Workgroup has released specifications that allow virtualized systems based on PCI Express (PCIe) to leverage shared IOV devices. These include a specification for Single Root (SR) I/O Virtualization and Sharing. The SR-IOV 1.0 specification provides an industry-standard means for allowing multiple VMs (or System Images) in a single Root Complex (host CPU chip set including memory or shared memory) to share a PCIe IOV Endpoint without sacrificing performance. While a number of 10GbE NIC vendors promote their products as being "SR-IOV compliant," support in hypervisors and guest OSes is still in progress.
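The sharing model can be pictured with a short Python sketch. It is a conceptual toy rather than a rendering of the specification: it assumes an endpoint that exposes a fixed number of virtual functions (the SR-IOV term for the lightweight device instances handed to VMs), and the class name, method names and VM names are hypothetical.

```python
class SharedEndpoint:
    """Toy model of one PCIe endpoint shared by several VMs."""

    def __init__(self, name, total_vfs):
        self.name = name
        self.free_vfs = list(range(total_vfs))  # unassigned virtual functions
        self.assigned = {}                      # VM name -> virtual function

    def attach(self, vm):
        """Give the VM its own virtual function, if one is still free."""
        if vm in self.assigned:
            return self.assigned[vm]
        if not self.free_vfs:
            raise RuntimeError("no free virtual functions on " + self.name)
        vf = self.free_vfs.pop(0)
        self.assigned[vm] = vf
        return vf

    def detach(self, vm):
        """Return the VM's virtual function to the free pool."""
        vf = self.assigned.pop(vm)
        self.free_vfs.append(vf)
        return vf
```

The point of the model is that each VM holds its own slice of the device rather than reaching it through a software switch in the hypervisor, and that the number of VMs sharing one endpoint is bounded by what the hardware exposes.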
Saqib Jang is the founder and a principal with Margalla Communications Inc., www.margallacomm.com. He can be contacted at firstname.lastname@example.org.