iSCSI and TCP/IP offload engines

TOEs relieve CPUs from protocol processing, but whether or not you need them in an iSCSI environment depends on application requirements.

By Ahmad Zamer

TCP/IP offloading is a networking technology that assigns all, or part of, the tasks associated with the processing of the TCP/IP protocol to entities in the host system other than the CPU, freeing the CPU from protocol processing.

iSCSI is a storage networking encapsulation protocol that enables the transport of SCSI commands and block-level data over Ethernet, extending the reach of Ethernet to storage systems. Because TCP offloading speeds the processing of Ethernet packets, it makes it possible for Ethernet to carry applications that involve heavy data movement, such as iSCSI storage.

To appreciate the need for offloading, consider the four basic components of a host server: CPU, memory, system bus, and I/O devices (see figure). Ideally, none of the components would create bottlenecks that degrade overall system performance.

As a rule of thumb, about 1GHz of CPU power is needed to handle the bandwidth of a 1Gbps Ethernet (GbE) port in one direction. A bidirectional GbE port can therefore saturate the resources of a 2GHz CPU.

A memory subsystem typically requires 3x the bandwidth of the networking component for optimum performance. A full-duplex Ethernet port therefore requires memory with 6x its own bandwidth. Unimpeded network operations also require bus bandwidth that is at least equal to the network component's bandwidth.
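The rules of thumb above can be turned into a quick sizing calculation. The following is a minimal sketch, not a sizing tool: the function name `host_requirements` is hypothetical, and treating the bus requirement as equal to the aggregate traffic in both directions is an assumption, since the text only says "at least equal."

```python
def host_requirements(link_gbps, full_duplex=True):
    """Apply the article's rules of thumb to one Ethernet port.

    Returns rough minimums; all three ratios are estimates, not guarantees.
    """
    directions = 2 if full_duplex else 1
    traffic_gbps = link_gbps * directions  # aggregate traffic across directions
    return {
        "cpu_ghz": traffic_gbps * 1.0,      # about 1GHz of CPU per 1Gbps of traffic
        "memory_gbps": traffic_gbps * 3.0,  # memory needs 3x the network bandwidth
        "bus_gbps": traffic_gbps * 1.0,     # assumption: bus matches aggregate traffic
    }


if __name__ == "__main__":
    for name, value in host_requirements(1.0).items():
        print(f"{name}: {value}")
```

For a single full-duplex GbE port this yields about 2GHz of CPU and 6Gbps of memory bandwidth, matching the figures above.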

What needs offloading?

Estimates of CPU utilization in a system with no offloading features show where the CPU overhead comes from (see pie chart). Most of the overhead comes from three areas: checksum operations, CPU interrupts, and data movement and memory access tasks. TCP/IP-specific tasks account for a relatively small portion of the total overhead.

Checksum operations are performed on Ethernet packets to ensure data integrity, and performing them in software burdens the CPU. To overcome this, most network interface cards (NICs) handle checksums in hardware.
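To see why software checksums cost CPU cycles, consider the 16-bit one's-complement checksum used in IP and TCP headers (described in RFC 1071). The sketch below is a plain Python illustration, not the code any NIC or operating system uses; it omits the TCP pseudo-header and simply shows that every byte of the data must be touched.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over a byte string (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF  # one's complement of the folded sum


if __name__ == "__main__":
    sample = bytes.fromhex("0001f203f4f5f6f7")  # sample bytes from RFC 1071
    print(hex(internet_checksum(sample)))  # prints 0x220d
```

The loop visits every 16-bit word of the data, so the work grows linearly with traffic volume, which is why moving it into NIC hardware pays off at gigabit rates.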

The architecture of computer systems necessitates interrupting the CPU at packet boundaries when processing streams of TCP traffic. A few years ago, NIC and host bus adapter (HBA) vendors improved the situation by implementing better techniques for interrupts (called interrupt moderation or coalescing), which reduced the number of CPU interrupts resulting from TCP processing.
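Interrupt moderation can be illustrated with a small simulation. This is a simplified model under stated assumptions: the function `count_interrupts` and its parameters `max_frames` and `max_wait_us` are hypothetical, and real adapters implement their own timers and thresholds in hardware. The model raises one interrupt when either a frame-count threshold is reached or the oldest pending packet has waited too long.

```python
def count_interrupts(arrivals_us, max_frames=8, max_wait_us=50):
    """Count interrupts for packet arrival times (microseconds, ascending)."""
    interrupts = 0
    pending = 0           # packets received but not yet signaled to the CPU
    first_pending = None  # arrival time of the oldest pending packet
    for t in arrivals_us:
        # The wait limit expired before this packet arrived: flush the batch.
        if pending and t - first_pending >= max_wait_us:
            interrupts += 1
            pending = 0
        if pending == 0:
            first_pending = t
        pending += 1
        # The frame-count threshold was reached: interrupt immediately.
        if pending >= max_frames:
            interrupts += 1
            pending = 0
    if pending:
        interrupts += 1  # final flush for any leftover packets
    return interrupts


if __name__ == "__main__":
    burst = [0] * 64  # 64 packets arriving back to back
    print(count_interrupts(burst, max_frames=1))  # one interrupt per packet
    print(count_interrupts(burst, max_frames=8))  # one interrupt per 8 packets
```

With `max_frames=1` the model degenerates to one interrupt per packet, which is the behavior moderation was introduced to avoid.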

Data movement activities also cause significant CPU overhead. Local buffering of data requires the use of local memory, which adds to the cost of the I/O subsystem. However, the more complicated aspect of data movement is the placement of data, or the transfer of data from the memory buffers on one system to memory buffers on the destination system. Mechanisms for direct placement of data into remote memory are needed to efficiently eliminate the overhead associated with data movement.

Offloading benefits

The extent of a TCP/IP offload engine's (TOE's) ability to improve system performance depends on how it creates a balance among the various components in the host system. For example, most TOEs include logic to perform checksum operations, fully relieving the CPU from that burden.

Interrupts are another area where offload engines can improve system performance. In iSCSI environments, offload engines can interrupt the CPU on command boundaries as opposed to the more traditional approach of packet boundary interrupts. With command boundary interrupts, offload engines reduce the impact of interrupts on the CPU, providing a significant performance improvement especially in data-intensive environments with large file transfers.
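The difference between packet-boundary and command-boundary interrupts can be estimated with simple arithmetic. The sketch below assumes the worst case of one interrupt per packet with no moderation, and a TCP payload of 1,460 bytes per packet (a standard 1,500-byte Ethernet MTU minus 20-byte IP and TCP headers); it ignores iSCSI header overhead, and the function name is hypothetical.

```python
import math


def interrupts_per_command(transfer_bytes, payload_per_packet=1460):
    """Return (packet-boundary, command-boundary) interrupt counts for one command."""
    # Worst case without moderation: every received packet interrupts the CPU.
    per_packet = math.ceil(transfer_bytes / payload_per_packet)
    # An iSCSI-aware offload engine signals once when the whole command completes.
    per_command = 1
    return per_packet, per_command


if __name__ == "__main__":
    packets, commands = interrupts_per_command(64 * 1024)  # one 64KB read
    print(packets, commands)  # prints 45 1
```

Under these assumptions a single 64KB read drops from 45 interrupts to one, and the gap widens as transfer sizes grow.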

For data movement, offload engines (implemented on NICs or HBAs) can use local memory (rather than system memory) to buffer data. The use of dedicated local memory increases the cost, but enables data buffering and better flow control while saving main system memory. Placement of data at the destination requires equipping the I/O subsystem with the ability to place data into remote system memory. While there are several ways to achieve this, the industry is developing a standard Remote Direct Memory Access (RDMA) protocol that will further enhance data movement and enable 10GbE TOEs.

Later this year, servers will have high-performance serial PCI-Express system buses, and Ethernet controllers will have built-in logic to handle checksum operations and reduce CPU interrupts. Increased system bandwidth and processing power will also enable users to run multiple GbE connections in servers without performance degradation. The improved performance will in turn enable better TCP offloading solutions, implementation of the RDMA protocol, and 10GbE (which will require offloading).

Although iSCSI was the major driver behind TOE development, many early adopters are getting adequate performance using standard NICs (without TOEs) and free iSCSI software drivers. However, users with heavier I/O activity may require iSCSI HBAs that provide more offloading capabilities than standard NICs.

Although end-user adoption of iSCSI has been slower than expected, the availability of TOEs, GbE on motherboards, and iSCSI drivers from Microsoft and other vendors has generated increased interest in iSCSI deployment. The next stop for offload engines is 10GbE, which will require even more efficient offloading and RDMA.

Ahmad Zamer is the acting chairman of the SNIA IP Storage Forum and a senior product line marketing engineer at Intel.

This article was originally published on January 01, 2004