Within the data center, SAN throughput is driven by link bandwidth, but as SANs are extended over distance, latency must be factored in.
BY ANDY HELLAND
For many years, storage area networks (SANs) were deployed primarily inside the data center. Largely based on Fibre Channel, most SANs were confined to one room or perhaps one building. As the need for high reliability and availability grows, so does the demand for SANs with mirror sites separated from the primary data center. This protects data from natural disasters by keeping an additional copy in a geographically separate location.
The question we face is how to build storage networks that extend existing SAN fabrics to remote SAN islands without crippling performance (data throughput).
What is latency?
Latency is simply "delay." In this context, it is the delay a signal incurs when traveling from the initiator (in the primary data center) to the target (at the mirror site). Communication in the reverse direction incurs the same latency.
To understand how latency affects performance, we need to look at how SCSI works and how SCSI interacts with Fibre Channel. SCSI controls the actual movement of block data in a SAN. Fibre Channel carries the SCSI commands and data across the SAN. The table illustrates a typical SCSI sequence for a READ command. A SCSI READ command is issued from the initiator. This command requests 64KB of data from the target. Now, we must wait (latency) for the signal to propagate to the remote location. Then, the remote disk must locate the correct data on the disk and transmit it back to the local data center. Again, we will have to wait (latency) for the first byte of data to travel back to the initiator. Once the first byte arrives, the rest of the data is pipelined, and we will see a continuous flow of data based on the available bandwidth of the system.
Total throughput can be calculated by looking at the total amount of user data that has been moved and then dividing it by the total time it took. In this example, the total amount of data moved is 64KB. The total time has two components: latency (while waiting for the request to travel to the target and waiting for the first byte of data to travel back) and the transmission time (total amount of data divided by the link bandwidth). The sum of these two is the total time required to send the data.
Traditionally, as we evaluate system throughput, we only think of the bandwidth component. If the latency is small (it is essentially zero in the data center), the throughput is approximately equal to the bandwidth. However, as you extend SANs to remote sites, you have to consider the effects of latency on performance.
The figure shows data throughput for a single user as the latency is increased from 0 to 3,000 microseconds. The bandwidth of the link in this performance model is constant at 800Mbps. Notice how quickly throughput degrades as the latency is increased. Increasing the SCSI sequence block size from 8KB to 64KB significantly improves the throughput, but there is still dramatic degradation as latency increases.
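The degradation shown in the figure can be reproduced with a minimal model of the READ sequence described above (function and variable names are assumed, not from the article): the total time is one round trip of latency plus the transmission time, and throughput is the block size divided by that total.

```python
# Minimal sketch of the single-user throughput model (names assumed).
# Total time = one round trip of latency + time to transmit the block.

def throughput_mbps(block_kb: float, one_way_latency_us: float,
                    bandwidth_mbps: float = 800.0) -> float:
    """Effective throughput for a single SCSI READ sequence."""
    bits = block_kb * 1024 * 8                        # user data in bits
    transmit_s = bits / (bandwidth_mbps * 1e6)        # time on the wire
    wait_s = 2 * one_way_latency_us * 1e-6            # command out + first byte back
    return bits / (transmit_s + wait_s) / 1e6         # effective Mbps

# Reproduce the trend in the figure: 8KB vs. 64KB blocks, 0 to 3,000 microseconds
for latency_us in (0, 1000, 3000):
    print(latency_us,
          round(throughput_mbps(8, latency_us), 1),
          round(throughput_mbps(64, latency_us), 1))
# At zero latency both block sizes reach the full 800Mbps; by 3,000
# microseconds the 8KB case has fallen to roughly 11Mbps and the
# 64KB case to roughly 79Mbps.
```

Note how the latency term quickly dominates the transmission term: doubling the bandwidth shrinks only the transmission time, which is why more bandwidth cannot compensate for distance.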
Until now, we have discussed the performance degradation for a single SCSI transaction as latency is increased. These effects are very real and reduce the throughput as seen by users. However, using command queuing, SCSI allows the initiator to make multiple requests of the same target at the same time. By pipelining requests, overall use of the link can be raised to approach the bandwidth of the link itself. Nonetheless, if we drive as much latency as possible out of the system, we can significantly improve the performance that users perceive.
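The benefit of command queuing can be sketched with a simple extension of the same model (a hedged approximation, not measured data): with several requests outstanding, several blocks complete per round trip, until the link bandwidth itself becomes the ceiling.

```python
# Hedged sketch of the command-queuing effect: queue_depth requests in
# flight multiply the single-request throughput, capped by link bandwidth.

def queued_throughput_mbps(block_kb: float, one_way_latency_us: float,
                           bandwidth_mbps: float, queue_depth: int) -> float:
    bits = block_kb * 1024 * 8
    total_s = 2 * one_way_latency_us * 1e-6 + bits / (bandwidth_mbps * 1e6)
    single = bits / total_s / 1e6                 # one request at a time
    return min(bandwidth_mbps, queue_depth * single)

# 64KB reads over an 800Mbps link with 1,000 microseconds of one-way latency:
for depth in (1, 2, 4, 8):
    print(depth, round(queued_throughput_mbps(64, 1000, 800, depth), 1))
# Throughput roughly doubles with queue depth until the 800Mbps link
# bandwidth is reached.
```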
Where does latency come from?
Storage networking latency has three components: distance, equipment, and protocols. Each can contribute to the overall latency of a storage network.
Distance. Regardless of bandwidth, it takes a certain amount of time for a signal to travel from one place to another. In fiber-optic cable, the signal propagates at approximately 5 microseconds per km. Furthermore, cable routes never run in a straight line between networked SANs; there are always detours to avoid freeways, rivers, and other obstructions. At the same time, cable runs between distant cities are optimized to touch as many cities as possible along the way. As a general rule, we can add 25% to 50% to the straight-line distance between two locations. As an example, the table on p. 28 shows the expected latency from San Jose to several cities.
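The rule of thumb above can be turned into a rough estimator (the function name and the example distance are illustrative, not from the article): multiply the straight-line distance by a detour factor, then by the fiber propagation delay.

```python
# Rough one-way latency estimate for a fiber route, per the rule of thumb
# above: ~5 microseconds per km, plus 25-50% extra route length.
# Names and the example distance are illustrative assumptions.

FIBER_US_PER_KM = 5.0

def route_latency_us(straight_line_km: float, detour_factor: float = 1.5) -> float:
    """One-way propagation delay, padding the straight-line distance
    by a detour factor (1.25 to 1.5 per the rule of thumb)."""
    return straight_line_km * detour_factor * FIBER_US_PER_KM

# e.g., a mirror site 100 km away with a 50% longer cable route:
print(route_latency_us(100))   # 750.0 microseconds one way
```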
Fibre Channel gateway equipment. The equipment that comprises a Fibre Channel gateway also introduces latency into the storage network. Simple devices such as Fibre Channel-to-DWDM gateways merely modulate the Fibre Channel signal onto a particular wavelength and deliver it to another location. As such, they introduce little additional latency (10 microseconds or less). On the other hand, a complex multi-protocol switch with numerous line cards and physical interfaces can introduce significant latency (hundreds of microseconds or perhaps even milliseconds).
Protocol latency. Both Fibre Channel and the emerging iSCSI protocols provide low-latency, high-performance interconnects within a local data center. However, the protocols used to extend them over distance can add significant latency of their own.
Given the extent of IP-based networking today, it is natural to encapsulate Fibre Channel onto IP for highly scalable transport to other data centers. This type of activity is under way as part of the IP Storage working group in the Internet Engineering Task Force (IETF). The protocol is called Fibre Channel over IP (FCIP). As part of that standard, TCP is specified for the protection of data transmitted over routed IP networks. This is a good thing even though TCP/IP can be one of the worst offenders when it comes to adding latency. To be fair, it is actually not TCP/IP that causes the trouble. It's the combination of TCP with routers that drop IP packets when they become congested. TCP's job is to "sweep up" after the routers that drop packets and, in fact, TCP provides a great service by ensuring all packets are delivered reliably. However, in the course of cleaning up after "lossy" IP routers, TCP adds considerable latency to the delivery. If TCP detects an error (e.g., due to a dropped packet), the sending node will have to retransmit the packet (another trip through the pipe). Furthermore, the sending node will have to wait until it knows that the receiver did not get the packet (even more latency). In addition, TCP's congestion avoidance algorithm will reduce the window available to the system for sending data, which will reduce the available bandwidth to the system as well.
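The retransmission penalty described above can be illustrated with a toy model (this is not a TCP implementation; the figures and names are assumed for illustration): a lost packet must wait out a retransmission timeout and then make another trip through the pipe, so even a small loss rate inflates the average delivery time.

```python
# Toy model (not a TCP implementation) of how packet loss inflates average
# delivery latency: a lost packet waits for a retransmission timeout (RTO),
# then takes another one-way trip. Figures below are illustrative only.

def avg_delivery_latency_ms(one_way_ms: float, loss_rate: float,
                            rto_ms: float) -> float:
    """Expected one-way delivery time when a fraction loss_rate of packets
    must be retransmitted once (ignores repeated losses of the same packet)."""
    retransmit_penalty_ms = rto_ms + one_way_ms   # wait for timeout, then resend
    return one_way_ms + loss_rate * retransmit_penalty_ms

# A 15 ms one-way path with 1% loss and a 200 ms retransmission timeout:
print(avg_delivery_latency_ms(15.0, 0.01, 200.0))   # about 17.15 ms on average
```

This simple model also omits TCP's congestion-avoidance response, which, as noted above, shrinks the send window after a loss and reduces available bandwidth on top of the added delay.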
How is iSCSI affected by latency?
Some people argue that iSCSI will provide a universal replacement for Fibre Channel that will allow companies to transparently extend SANs over great distances. This belief comes from the fact that the Internet currently operates over large distances and with a large number of nodes. In fact, the performance of iSCSI over distance should be similar to that of Fibre Channel. After all, both are ultimately governed by the same mechanism: SCSI itself. No amount of encapsulation or new protocols can change the fact that we still need to ask for data (SCSI READ) and then wait for its return. If we transmit requests over a lossy router, we will suffer the same TCP retransmission effects. Requests, and responses, will be delayed just as much as those for FCIP.
Latency increases with distance and generally slows down communication. Even infinite bandwidth cannot counter its effects. However, we can make choices that will allow us to live with increased latency.
We should extend SANs far enough to accomplish our goal (e.g., tolerance of natural disasters) but no farther. If performance is an issue, try to keep the remote sites as close as possible.
Avoid lossy networks if possible. Sometimes, the solution calls for the scalability advantages of an IP routed network. If that is the case, FCIP or iSCSI will allow connection to a routed and highly scalable IP infrastructure. If performance is paramount, use a non-routed interconnect methodology like Fibre Channel over ATM, Fibre Channel over DWDM, or Fibre Channel over SONET. SANs that use iSCSI can benefit as well since they will not experience the dropped packets that are normally associated with routed transport.
Choose a gateway with low latency. Gateway latency is every bit as detrimental to performance as distance latency. It doesn't matter where the latency comes from: all that matters is the total amount between the SAN islands.
If possible, use larger block sizes for the SCSI sequence. The throughput for a single user can be dramatically improved by increasing the block size of the transfer.
We also need to look at the other side of the equation when it comes to managing latency. If we know we must transmit data over a high-latency or lossy network, we should not purchase more bandwidth than latency will let us use. In recent years, the price of bandwidth (to the Internet or point-to-point) has steadily declined and will continue to do so, but it is not yet free. There is no point in paying for a dedicated broadband connection if performance will be starved by latency.
It's clear that SANs will continue to get farther and farther apart. There are many good reasons for this such as protecting data from disasters. Unfortunately, distance means latency. There really is no way to escape the laws of physics that govern the propagation of light or electricity. However, we can manage the process and minimize the effects of latency. We can choose gateway equipment and protocols that minimize additional latency, and locate data centers as close together as possible while still satisfying fault-tolerance requirements. Above all, simply being aware of the causes and effects of latency can lead to managing its impact.
Additional information about the IETF and the IP Storage working group can be found at www.ietf.org/html.charters/ips-charter.html. For information about carrying Fibre Channel over ATM or Fibre Channel over SONET, check out www.t11.org. Select "projects" and then "Fibre Channel-BB2."
Andy Helland is director of product management at LightSand Communications (www.lightsand.com) in Milpitas, CA. He is a member of the ANSI T11 standards group that specifies Fibre Channel and Fibre Channel Extensions. He is also one of the authors of the IETF standard for carrying Fibre Channel over IP (FCIP).