Are TOEs a panacea for backup performance?

By Saqib Jang

The challenge of ever-shrinking backup windows for exponentially growing data is expanding beyond central data centers to departments across most corporations. Data production across many corporate departments is estimated to be growing at rates approaching 100% per year, driven by regulatory requirements, growth in e-mail traffic, rich media content, and similar factors. The business-critical nature of applications such as e-mail, databases, and groupware, together with the global reach of most corporations, is reducing the time window available for backup from days to hours.

Client/server backup basics
Traditional LAN-based client/server backup supported by applications such as Veritas' NetBackup, Legato's NetWorker, and Computer Associates' ARCserve continues to be the most commonly used practice for tape backup in departmental environments. In this method, one server on the LAN is designated as the backup media server. While it can be any server on the network that has been designated to play the role of backup server, it is typically a dedicated server used only for backup.

This method provides a centrally managed backup environment. The backup media server manages backups for application servers on the network; only a client agent must be installed on the remote backup clients. This solution requires a single license for tape backup software and enables amortizing the cost of tape libraries or autoloaders across many servers, thus making it easier to justify the cost of tape automation. In addition, this architecture supports heterogeneous environments since the backup clients do not necessarily have to be running the same operating system as the backup server; for example, a Windows server can function as the backup server for a network of Linux application servers.

TCP/IP and backup performance
LAN infrastructure in general, and TCP/IP protocol processing specifically, play a significant role in the performance and efficiency possible using client/server backup. Multiple TCP/IP-based Gigabit Ethernet LAN segments can broaden the backup data path, but no matter how fast backup data arrives at the backup server, the TCP/IP protocol processing capability of the backup and application servers can become a bottleneck.

Extrapolation from Sun's testing of Veritas backup software suggests that IA-32 backup servers require 8MHz of CPU power for every megabyte per second (MBps) of backup data moved to or from the network, while each MBps of data moved from disk to tape (or vice versa) requires 5MHz of CPU capacity. For example, an IA-32 server that needs to back up a number of clients over the network to local tape at a rate of 30MBps would need 390MHz of available CPU power: 240MHz for TCP/IP protocol processing plus 150MHz for moving the data to tape. As another example, an application server that needs to back up a database residing on local disks to a remote backup server at a rate of 30MBps would also need 390MHz of available CPU power, again with 240MHz consumed by TCP/IP protocol processing.
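The arithmetic behind these figures can be sketched in a few lines of Python. The two per-MBps constants are the rule-of-thumb numbers quoted above; the function name and structure are illustrative only, and the result is a rough sizing estimate rather than a measurement.

```python
# Rule-of-thumb figures quoted in the article (extrapolated from Sun's testing).
NETWORK_MHZ_PER_MBPS = 8  # CPU cost of moving 1MBps to or from the network
DEVICE_MHZ_PER_MBPS = 5   # CPU cost of moving 1MBps between disk and tape


def backup_cpu_budget(rate_mbps):
    """Return (tcp_mhz, device_mhz, total_mhz) for a backup rate in MBps."""
    tcp_mhz = rate_mbps * NETWORK_MHZ_PER_MBPS
    device_mhz = rate_mbps * DEVICE_MHZ_PER_MBPS
    return tcp_mhz, device_mhz, tcp_mhz + device_mhz


tcp, device, total = backup_cpu_budget(30)
print(f"TCP/IP: {tcp}MHz, disk/tape: {device}MHz, total: {total}MHz")
# TCP/IP: 240MHz, disk/tape: 150MHz, total: 390MHz
```

At 30MBps, network protocol processing accounts for 240 of the 390MHz, or roughly 60% of the CPU budget.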

Another pitfall of client/server backup is the mismatch between tape speed and network speed. Today's tape systems can write data at a uniform rate of 20MBps to 30MBps per tape drive. While Gigabit Ethernet LANs can theoretically feed data to the backup server at up to 110MBps, in reality TCP/IP protocol processing limits application servers to forwarding data at a fraction of that rate. Moreover, LAN-based TCP/IP communication is typically intermittent, whereas tape drives such as DLT systems require continuous, streaming data from the backup server. When data arrives over the network more slowly than the tape drive can write it, the drive starves: it repeatedly stops, rewinds, and restarts while waiting for more data to arrive from the LAN, significantly reducing performance.

A common approach to balancing tape speed and network speed is the use of multiplexing, or feeding parallel data streams from application servers to the backup server, enabling the tape drives to be fully utilized. However, multiplexed streams require a significant amount of CPU resources on backup servers, again essentially driven by protocol processing I/O requirements.
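As a rough illustration of how multiplexing is sized, the sketch below estimates how many parallel client streams are needed to keep one tape drive streaming. The tape rates are the 20MBps to 30MBps range cited above; the 8MBps per-client rate is a hypothetical value chosen only for the example, and the calculation assumes stream rates simply add up, ignoring any multiplexing overhead.

```python
import math


def streams_needed(tape_mbps, per_client_mbps):
    """Smallest number of parallel client streams whose combined rate
    meets or exceeds the tape drive's write rate."""
    if tape_mbps <= 0 or per_client_mbps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(tape_mbps / per_client_mbps)


# Hypothetical per-client delivery rate; not a figure from the article.
PER_CLIENT_MBPS = 8

for tape_rate in (20, 30):
    count = streams_needed(tape_rate, PER_CLIENT_MBPS)
    print(f"{tape_rate}MBps tape drive: {count} streams at {PER_CLIENT_MBPS}MBps each")
```

Under these assumptions, a 20MBps drive needs three such streams and a 30MBps drive needs four. Every additional stream adds to the protocol processing load on the backup server.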

An alternative is to replace the client/server backup architecture with a Fibre Channel SAN-based backup architecture. While Fibre Channel SANs have a number of attractive features for backup--including removal of heavy backup network overhead on LANs and network servers, as well as reliable, gigabit-level backup operations--the expense and difficulty of deploying them make this an unlikely option for most corporate departments.

Enter TOEs
More than a dozen companies--including Adaptec, Alacritech, Astute, Broadcom, Emulex, Intel, iReady, Layer N, QLogic, Seaway, Silverback, Trebia, and Xiran (a division of SimpleTech)--have introduced products or announced plans in the emerging TCP offload engine (TOE) area. Products include TOE network interface cards (TNICs), iSCSI host bus adapters (HBAs), and TOE chips that offload TCP termination from host CPUs. For a Gigabit Ethernet connection, terminating TCP in hardware rather than software can improve server performance by up to 50%. In the longer term, IP storage using iSCSI HBAs is potentially a larger and more lucrative market for TOEs (IP storage relies on iSCSI, a protocol that sits on top of TCP). The server market, however, has an immediate need for TNIC products to improve the performance of applications such as file serving and client/server backup.

For backup, deploying TNICs (currently ranging from slightly less than $500 to almost $1,400) in backup media and application servers could relieve the network-processing bottleneck without the cost of deploying Fibre Channel SANs. TNICs have the potential to significantly increase throughput to storage devices when large amounts of data are processed. In addition, backup media and application server resources are freed up for additional application processing.

"Medium to large IT departments are using our accelerators in their backup servers for multi-drive tape backup servers, as well as tape libraries and new disk-to-disk backup appliances," says Joe Gervais, director of product marketing at Alacritech. "TNICs can offer significant benefits for customers when multiple systems are archiving data simultaneously to a backup server with multiple tape drives."

The University of Pittsburgh Medical Center (UPMC) is one site that found significant gains in overall backup performance and efficiency through the use of Alacritech's TNICs. Prior to TNIC deployment, UPMC was struggling with the labor- and time-intensive nature of a distributed local tape backup process, the need for capacity expansion on an aging server population, and a limited capital budget for infrastructure improvements. Alacritech's TNIC accelerator cards have enabled UPMC to centralize network backups and dramatically reduce backup windows, while extending the life of its server infrastructure.

"We're using Alacritech accelerators to implement a networked backup strategy that would enable us to move from local tape backups to a centrally managed, IP-based backup approach," says Kevin Muha, technical lead in UPMC's Network Server Group. "We're backing up at least 5TB of data in 12 hours rather than in four days."
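To put those figures in the same units used earlier, the sketch below converts them to sustained MBps. It assumes decimal units (1TB = 1,000,000MB), treats "at least 5TB" as exactly 5TB, and assumes the same data volume in both cases, so the results are approximate lower bounds rather than measured rates.

```python
MB_PER_TB = 1_000_000  # decimal units; an assumption, not stated in the article


def sustained_mbps(terabytes, hours):
    """Average throughput in MBps needed to move `terabytes` in `hours`."""
    return terabytes * MB_PER_TB / (hours * 3600)


before = sustained_mbps(5, 4 * 24)  # 5TB in four days
after = sustained_mbps(5, 12)       # 5TB in 12 hours

print(f"before: {before:.1f}MBps, after: {after:.1f}MBps, "
      f"speedup: {after / before:.0f}x")
# before: 14.5MBps, after: 115.7MBps, speedup: 8x
```

On these assumptions, the aggregate rate after the change is slightly above the roughly 110MBps theoretical ceiling of a single Gigabit Ethernet link cited earlier.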

A growing approach to improving backup performance, as well as capitalizing on the falling costs of hard disk drives, is the use of disk-based backup systems, which are typically used as a staging area before copying data to tape in a two-step backup architecture. Adaptec, for example, is targeting this market with its ASIC-based, full-offload TNIC. "Backup and restore windows continue to be a key pain point of IT managers, and TOEs help alleviate this issue," says Ram Jayam, vice president and general manager of storage networking at Adaptec. Adaptec's internal testing has shown that its TNIC cards provide more than a 30% throughput improvement compared to standard NICs, while freeing up server CPU cycles for disk I/O use, which significantly reduces backup/restore times.

The entry of new players in the TOE market should result in substantial declines in TNIC prices. Ryo Koyama, CEO of iReady, says his seven-year-old company has shipped more than 250,000 TCP/IP offload engines, primarily in embedded devices. iReady's recently announced EthernetMAX product is a combination TOE-enabled Gigabit Ethernet NIC and iSCSI HBA. "Addressing volume applications such as network backup will require widespread TOE deployment," says Koyama.

Saqib Jang is a principal at Margalla Communications, a Woodside, CA-based strategic and technical marketing consulting firm focused on storage networking. He can be contacted at saqibj@margallacomm.com.

This article was originally published on August 05, 2003