By Ahmad Zamer
There are a number of ways to offload TCP/IP and iSCSI protocol processing, and RDMA is on the horizon.
Given the relatively slow adoption of storage area networks (SANs), some users are apprehensive about new SAN technologies. At the same time, the high degree of interest in IP storage technologies, especially iSCSI, attests to users' thirst for better technologies.
This article looks at some of the current issues surrounding IP storage with an emphasis on interoperability and TCP/IP and iSCSI offloading, including Remote Direct Memory Access (RDMA) technology. The offloading discussion focuses on iSCSI host bus adapters (HBAs) and/or network interface cards (NICs), which are required for optimum performance in an iSCSI SAN.
As with any new technology, interoperability is critical to end-user adoption.
Figure 1: The iSCSI protocol sits between the SCSI interface and the TCP/IP transport interface.
All three IP storage technologies (iSCSI, FCIP, and iFCP) are on standards tracks in the Internet Engineering Task Force (IETF). The IETF standardization process requires multi-vendor product demonstrations that prove interoperability. Vendors that support iSCSI are building products in tandem with the IETF standards process. The parallel efforts of the IETF and the iSCSI vendor community should culminate in a standard that is widely accepted and well supported. The iSCSI standard is expected to be finalized within the next couple of months. Users can purchase IP storage products today, and some have already deployed products such as iSCSI adapters and routers, iFCP gateways, and FCIP gateways.
The IP storage community places a high premium on interoperability. After all, IP storage is based on Ethernet, a protocol that sets a high bar in this area. Vendors understand that the networking community expects a high degree of interoperability, and the IP storage community is working cooperatively to meet that goal.
At last fall's Storage Networking World (SNW) conference, the Storage Networking Industry Association's IP Storage Forum sponsored its fourth interoperability demo, and did so without a prior hot staging. Within one hour, more than six vendors had their equipment connected and interoperating without any problems. This degree of interoperability is a testament to the industry's determination to provide the IT community with storage solutions that work as well as Ethernet solutions.
TCP/IP and iSCSI offloading
Offloading is a term that gained prominence with the concept of transporting block-level storage data over TCP/IP networks. The term refers to TCP/IP offloading, iSCSI offloading, or both. Offloading is most often connected with iSCSI HBAs or NICs.
Ideally, iSCSI NICs/HBAs would have unlimited memory and the highest possible speeds to buffer all data and messages—at an affordable price. Until that happens, however, vendors have to work within the limits of bus and memory speeds, memory capacity, protocol inefficiencies, and cost constraints. Add to that the overhead of the operating systems and you have a problem that requires creative solutions. This is where offloading comes into play. Offloading enables HBAs/NICs with limited bus bandwidth and small memories to operate efficiently using protocols with heavy packet processing overhead under operating systems that impose their own set of data movement and processing constraints.
Figure 2: A full iSCSI offload host bus adapter performs all iSCSI tasks on the HBA, including command processing, PDU generation, error-handling, session setup and tear down, and requests and responses.
Offloading shifts the protocol processing burden from the host CPU to a NIC or HBA, saving significant CPU cycles when processing network traffic. With offloading, CPU utilization can be reduced to single-digit percentages, as opposed to being almost entirely consumed by protocol processing. The savings come at a cost: offloading requires a memory buffer on the NIC/HBA, and that buffer grows with the amount of buffering needed for out-of-order packets and other exception conditions. When a packet arrives out of order, which is a frequent occurrence, the NIC/HBA must hold that packet in memory until the rest of the related packets arrive. The buffering requirements are complicated by the fact that TCP delivers an undifferentiated byte stream, with no indication of where a message starts or ends.
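The buffering problem can be made concrete with a minimal Python sketch. It is an illustration only, not how any particular adapter works. The class and function names (Reassembler, make_pdu) are hypothetical, sequence numbers are simplified to byte offsets starting at zero, and header and data digests are assumed to be disabled. The only protocol facts it relies on are that an iSCSI PDU begins with a 48-byte Basic Header Segment, that byte 4 gives the additional header length in 4-byte words, that bytes 5 through 7 give the data segment length, and that the data segment is padded to a 4-byte boundary.

```python
BHS_LEN = 48  # size of the iSCSI Basic Header Segment in bytes


def make_pdu(opcode, data):
    """Build a minimal PDU for the demo: header, then data padded to 4 bytes."""
    bhs = bytearray(BHS_LEN)
    bhs[0] = opcode
    bhs[5:8] = len(data).to_bytes(3, "big")  # DataSegmentLength, 24 bits
    return bytes(bhs) + data + b"\x00" * ((-len(data)) % 4)


class Reassembler:
    """Hypothetical receive path: hold out-of-order segments, then frame PDUs."""

    def __init__(self):
        self.next_seq = 0          # next in-order byte offset expected
        self.held = {}             # offset -> payload waiting for a gap to fill
        self.stream = bytearray()  # in-order bytes not yet framed into PDUs

    def receive(self, seq, payload):
        # Every segment is parked first; in-order data is drained immediately.
        self.held[seq] = payload
        while self.next_seq in self.held:
            chunk = self.held.pop(self.next_seq)
            self.stream += chunk
            self.next_seq += len(chunk)

    def held_bytes(self):
        """Adapter memory currently tied up by out-of-order segments."""
        return sum(len(p) for p in self.held.values())

    def pop_pdus(self):
        """The stream has no message markers, so boundaries come from headers."""
        pdus = []
        while len(self.stream) >= BHS_LEN:
            ahs_len = self.stream[4] * 4
            data_len = int.from_bytes(self.stream[5:8], "big")
            total = BHS_LEN + ahs_len + data_len + (-data_len) % 4
            if len(self.stream) < total:
                break  # the PDU is still incomplete
            pdus.append(bytes(self.stream[:total]))
            del self.stream[:total]
        return pdus


# Two PDUs cut into three segments at arbitrary points, delivered out of order.
wire = make_pdu(0x01, b"A" * 10) + make_pdu(0x01, b"B" * 5)
rx = Reassembler()
rx.receive(100, wire[100:])    # arrives early and must be held
print("held bytes:", rx.held_bytes())
rx.receive(0, wire[:50])       # in order, but the first PDU is still incomplete
rx.receive(50, wire[50:100])   # fills the gap and releases the held segment
print("PDUs framed:", len(rx.pop_pdus()))
```

Note that the segment boundaries do not line up with the PDU boundaries, so the receiver can neither frame a message nor release memory until the missing bytes arrive.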
One promising solution being developed is iSCSI over RDMA, which is discussed later in this article. RDMA lets an application hand data to the NIC/HBA together with an indication of the data's final destination. In other words, each packet carries information about its destination, which enables the receiving NIC/HBA to send it on its way without having to buffer it or create intermediate copies of it.
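The idea of self-describing packets can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the RDMA wire format: the names PlacementTarget, register, and place are hypothetical, and the "tag plus offset" addressing stands in for whatever destination information the final specification defines.

```python
class PlacementTarget:
    """Hypothetical receiver that writes each segment straight to its destination."""

    def __init__(self):
        self.buffers = {}  # tag -> application buffer registered in advance

    def register(self, tag, size):
        """The application advertises a buffer before any data is sent."""
        self.buffers[tag] = bytearray(size)
        return self.buffers[tag]

    def place(self, tag, offset, payload):
        """Copy the payload to its final location; nothing is held back."""
        buf = self.buffers[tag]
        if offset < 0 or offset + len(payload) > len(buf):
            raise ValueError("segment falls outside the registered buffer")
        buf[offset:offset + len(payload)] = payload


target = PlacementTarget()
app_buffer = target.register(tag=7, size=12)

# Segments arrive in the "wrong" order, yet each can be placed immediately
# because it names its own destination.
target.place(7, 8, b"IJKL")
target.place(7, 0, b"ABCD")
target.place(7, 4, b"EFGH")
print(bytes(app_buffer))
```

Because arrival order no longer matters for placement, the adapter in this model needs no reassembly buffer for payload data. The bounds check reflects the fact that a receiver must not let a segment write outside the buffer the application registered.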
This article focuses primarily on "full-offload" options, which offload all protocol processing to adapter cards, as opposed to "partial-offload" approaches that rely in part on the host CPU for protocol processing.
Figure 1 shows the components of iSCSI. As shown in Figure 2, a full iSCSI offload HBA performs all iSCSI tasks on the HBA.
Figure 3: In a Remote Direct Memory Access approach, iSCSI PDU processing and data transfers are handled by iSCSI RDMA extensions.
While this implementation offers full iSCSI offloading, it makes it difficult to scale the architecture to 10Gbps Ethernet and beyond.
It also does not allow for an iSCSI session to span multiple adapters, which may be required in demanding environments. Currently, this is not a significant issue, but it will become more important in the future.
Next-generation approaches are closer to the goal of having TCP/IP offload engines (TOEs) with operating system supported interfaces. This approach makes it possible to develop standard application programming interfaces (APIs) for iSCSI. In addition, this model facilitates migration to 10Gbps Ethernet.
In this approach, only SCSI data transfer-related command processing is offloaded, while other tasks such as error-handling and connection setup are left to the host CPU. This implementation focuses on offloading tasks that are overhead-heavy and require assistance, while not overloading the NIC/HBA with tasks that can be easily handled by the host CPU.
The industry is moving toward an open solution to the offloading issue that is based on iSCSI over RDMA, which defines iSCSI extensions for RDMA. The technical details are expected to be finalized later this year. The iSCSI-over-RDMA approach will ensure the support of various operating systems, thus taking the "guesswork" out of TOE implementations.
Figure 3 shows a model of an RDMA-capable iSCSI NIC/HBA based on an open interface architecture. As shown, iSCSI PDU processing and data transfers are handled by the iSCSI RDMA extensions.
As the RDMA proposal is defined today, it provides a cyclic redundancy check (CRC) for the transported data, so that it will not be necessary for future iSCSI implementations to provide this function themselves. The good news is that the move to RDMA-capable NICs/HBAs will be transparent to users. NIC and HBA vendors only need to add a software layer to their cards to make them compatible with previous implementations.
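For readers unfamiliar with what a CRC over transported data involves, here is a small Python sketch of CRC-32C (the Castagnoli polynomial), which iSCSI uses for its optional header and data digests. Whether the finished RDMA specification uses exactly this variant is an assumption here, and the bit-at-a-time loop is written for clarity. Real adapters compute the checksum in hardware or with lookup tables.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def crc32c(data):
    """Compute CRC-32C over a bytes-like object, one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


def digest_ok(data, digest):
    """Receiver-side check: recompute and compare with the digest that was sent."""
    return crc32c(data) == digest


payload = b"123456789"
sent_digest = crc32c(payload)
print(hex(sent_digest))  # 0xe3069283, the standard check value for CRC-32C

corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]  # flip a single bit
print(digest_ok(payload, sent_digest), digest_ok(corrupted, sent_digest))
```

If the transport layer performs this check on every transfer, the layer above it does not need to repeat the work, which is the saving the paragraph above describes.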
The IP storage community and standards organizations are making significant progress toward reaching the goal of delivering storage solutions that are interoperable and based on open standards. The cooperation of IP storage vendors, coupled with the rigorous standards process of the IETF, should guarantee that IP storage will reach end users, enabling SANs over Ethernet and increasing overall adoption of SANs.
Ahmad Zamer is the iSCSI subgroup chair of the SNIA IP Storage Forum and a senior product line manager at Intel.