If IP SANs are going to enter the enterprise big leagues, iSCSI disk arrays will have to incorporate advanced technologies.
By Arun Taneja
IP networks running the iSCSI protocol to connect servers and storage arrays present a compelling value proposition: consolidating network infrastructures into one and using a common skill set to manage both Ethernet and storage networks. Yet storage networks were built on Fibre Channel for a reason. Fibre Channel guarantees sequential packet delivery and has minimal overhead. And with list prices for director-class Fibre Channel switches at $1,500 per port (or less), Fibre Channel edge switches less than $500 per port, and host bus adapters (HBAs) between $500 and $1,500, moving from Fibre Channel to iSCSI for cost reasons alone is a debatable proposition. With iSCSI unproven outside of the small to medium-sized business (SMB) market, iSCSI storage array architectures will need to evolve to compete for market and mind share in the enterprise storage space.
iSCSI’s value proposition
The ability to run iSCSI over Ethernet allows companies to use existing low-cost resources and skill sets, providing the following benefits:
- Hardware products such as Ethernet switches and network interface cards (NICs) are available from a variety of vendors and feature interoperability, low cost, and “free” software such as device operating systems and drivers;
- Running iSCSI over IP allows servers and storage devices to communicate without introducing new technologies or personnel;
- New storage-specific monitoring and management tools can be added to existing network management frameworks with minimal cost and/or additional training; and
- Every company, outsourcer, and integrator has staff with some level of experience with IP networking.
Despite the widespread deployment of Ethernet/IP, end-user adoption of IP SANs remains small and is primarily limited to SMBs and early adopters in departments of larger organizations.
Organizations such as the Franklin W. Olin College of Engineering and Mesirow Financial are representative of the type of organization that deploys iSCSI. The Franklin W. Olin College of Engineering has 220 students with a mix of 65 Windows and Linux servers, while Mesirow Financial has 1,000 employees. In both cases, the iSCSI deployments occurred in cost-conscious environments where features such as virtualization, replication, and snapshots were important but performance was not. In these instances iSCSI arrays (from EqualLogic) were deployed to support applications such as Exchange in Mesirow Financial’s environment and file services with storage-intensive requirements such as MP3 and AVI files at the Franklin W. Olin College of Engineering.
Neither of these deployments can be confused with the performance-intensive applications that typify larger enterprise environments. Performance-intensive databases push more than 50MBps with thousands of I/Os on a single Fibre Channel port. Using iSCSI disk arrays and 1Gbps Ethernet in these types of applications would almost certainly create a performance bottleneck, especially if array ports were shared with other applications.
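A back-of-the-envelope calculation makes the bottleneck concrete. The sketch below assumes a 1Gbps link carries at most 125MBps of raw data and that roughly 10% is lost to Ethernet, IP, TCP, and iSCSI framing. The 10% overhead figure and the function names are illustrative assumptions, not measurements.

```python
def usable_mbps(link_gbps: float, overhead: float = 0.10) -> float:
    """Approximate payload bandwidth in MBps for a link of the given speed.

    The overhead fraction is an assumed allowance for protocol framing.
    """
    raw_mbps = link_gbps * 1000 / 8  # 1 Gbps is 125 MBps before overhead
    return raw_mbps * (1 - overhead)


def streams_supported(link_gbps: float, stream_mbps: float,
                      overhead: float = 0.10) -> int:
    """How many streams of stream_mbps fit on the link at the same time."""
    return int(usable_mbps(link_gbps, overhead) // stream_mbps)


# A database pushing 50MBps, per the figure quoted above.
print(streams_supported(1.0, 50))   # 1Gbps Ethernet port
print(streams_supported(10.0, 50))  # 10Gbps Ethernet port
```

Under these assumptions a 1Gbps array port has room for only two such workloads before it saturates, which is why sharing the port with other applications is the critical case.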
To be truly enterprise class, IP storage architectures must overcome issues both outside and within the storage array.
Certain conditions exist in TCP/IP networks that will remain outside of the control of iSCSI storage array vendors but must be accounted for in array designs to ensure a successful implementation. The two primary factors are oversubscription and the Ethernet “Untouchables.”
Most existing Ethernet routers use a crossbar switch fabric as an architectural foundation. This design provides for a high-speed interconnect between router ports and works well for the random nature of TCP/IP traffic. In IP SANs, however, the contention between servers for a specific storage port produces a type of network congestion called oversubscription.
Alleviating the oversubscription problem using network-based techniques requires that IP SAN switches support technologies such as fabric speedup and virtual I/O queues. These types of next-generation technologies appear in the latest carrier class switches, but they add significant cost to the IP SAN equation.
The other requirement for IP SANs is to maintain the current Ethernet infrastructure as much as possible by using existing TCP/IP and iSCSI drivers, network cards, cabling, hubs, and switches. Because these components are readily available at low cost, companies will be reluctant to pay a premium for any of these items or to change anything once the initial IP SAN installation is complete and stable.
IP storage arrays
For iSCSI storage arrays to become viable alternatives to Fibre Channel arrays, they must address existing shortcomings in TCP/IP, such as the inability to:
- Proactively anticipate and respond to oversubscription;
- Guarantee TCP flow;
- Prioritize packet importance;
- Guarantee consistent data throughput; and
- Guarantee high-bandwidth data streaming.
An iSCSI disk array should also handle processing of the IP stack that occurs when transmitting and receiving packets.
To overcome these hurdles, iSCSI disk arrays will need to evolve in terms of how they manage the TCP protocol. For example, these subsystems will need improved TCP/IP drivers to manage TCP traffic. Currently, both servers and arrays increase the size of packets to maximize performance. While this approach works well in a one-to-one relationship, it breaks down in a networked storage environment when multiple servers access a single port on a storage array.
Existing TCP drivers on storage arrays compound the problem as they repeatedly seek to accomplish the same objective as the servers: to maximize performance. This method of gradually improving performance has two major pitfalls. First, current Ethernet switch design leads to switch ports becoming congested, resulting in TCP packets being dropped and performance slowing down as packets need to be re-sent.
The other major pitfall is that the storage array is unaware of the oversubscription problem it is creating. With IP drivers designed to maximize throughput, the array’s TCP drivers do not adjust to the multiple requests. This creates an eventual breakdown of the logical network path to the iSCSI array’s Ethernet port. This scenario repeats itself as the TCP driver on the storage array immediately seeks to re-establish contact with the server and begin the process of maximizing throughput once the connection is re-established, once again creating an oversubscription problem.
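The ramp-up, congestion, and retry cycle described above can be illustrated with a toy model. In this sketch every server raises its send rate by a fixed step each round, and when the combined rate exceeds what the array port can carry, packets are dropped and every sender halves its rate. This additive-increase, multiplicative-decrease rule is a simplification chosen for illustration; it is not the behavior of any particular vendor's TCP driver, and all names and numbers are assumptions.

```python
def simulate_shared_port(num_servers, port_capacity, rounds,
                         start_rate=1.0, step=1.0):
    """Model several senders that each try to maximize their own throughput.

    Returns the offered load seen by the port in each round and the
    number of rounds in which the port was oversubscribed.
    """
    rates = [start_rate] * num_servers
    offered_history = []
    congested_rounds = 0
    for _ in range(rounds):
        offered = sum(rates)
        offered_history.append(offered)
        if offered > port_capacity:
            # Port oversubscribed: packets are dropped and senders back off.
            congested_rounds += 1
            rates = [r / 2 for r in rates]
        else:
            # No loss seen, so every sender keeps pushing harder.
            rates = [r + step for r in rates]
    return offered_history, congested_rounds


history, congested = simulate_shared_port(num_servers=4, port_capacity=100,
                                          rounds=200)
print("congested rounds:", congested)
print("peak offered load:", max(history))
```

Because no sender knows about the others, the port is driven past its capacity again and again rather than settling at a steady rate.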
An iSCSI storage array targeted at the enterprise needs to optimize, rather than maximize, TCP/IP flow. One way to do this is through source-based flow control. In this scenario, the TCP driver on the storage array can still initially maximize throughput between the array and the first server that attaches to it, but would need to monitor the status of other servers connecting to that storage array port. As more servers connect, the TCP driver needs to proactively optimize and balance the flow of network traffic to the servers. This source-based flow control would continue as more servers access the port or as servers drop off and no longer need access to the port.
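One way to picture source-based flow control is as a bandwidth allocator that is rerun whenever a server attaches to or leaves the port. The sketch below uses max-min fair sharing, a standard allocation technique swapped in here for illustration; the article does not prescribe a specific algorithm, and the function name, server names, and rates are hypothetical.

```python
def allocate_port_bandwidth(port_capacity, demands):
    """Max-min fair split of one array port among the attached servers.

    demands maps a server name to the rate it would like to use.
    Servers that want less than an equal share get what they ask for;
    the leftover capacity is split evenly among the rest.
    """
    allocation = {}
    remaining = float(port_capacity)
    pending = dict(demands)
    while pending:
        fair_share = remaining / len(pending)
        satisfied = {s: d for s, d in pending.items() if d <= fair_share}
        if not satisfied:
            # Everyone left wants more than an equal share: cap them all.
            for server in pending:
                allocation[server] = fair_share
            break
        for server, demand in satisfied.items():
            allocation[server] = demand
            remaining -= demand
            del pending[server]
    return allocation


# The first server gets the whole port; the split changes as others attach.
print(allocate_port_bandwidth(100, {"db1": 100}))
print(allocate_port_bandwidth(100, {"db1": 100, "mail": 20, "db2": 100}))
```

With a single server attached, the allocator still lets it use the full port, which matches the behavior described above. Once a lightly loaded mail server and a second database attach, the two databases are each held to 40 while the mail server keeps the 20 it asked for.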
Closely tied to source-based flow control is the ability for iSCSI storage arrays to maintain and prioritize multiple incoming requests by implementing quality of service (QoS) functionality. The storage array needs to have a mechanism to recognize, classify, and respond to specific application-based network traffic. In doing so, the storage array could differentiate between the incoming data flows and respond with the appropriate I/O and throughput levels.
QoS features will likely be policy-based and set up by the storage administrator. The IP storage array could automatically classify I/O requests according to an assigned QoS level based on initiator IP address, target port IP address, or LUN. While IPv4 headers could be marked with IP Precedence or DSCP bits, iSCSI packets are typically switched at layer 2; in other words, Gigabit Ethernet (GbE) switches do not examine the packet’s layer 3 headers. Alternatively, DSCP bits could be mapped to the priority field in the layer 2 802.1Q VLAN tag so that the GbE switch could apply priority queuing to specially marked packets. Unfortunately, GbE switches in general have limited priority queuing support and are constrained by the number of outbound queues. In addition, either the host HBA driver or the inbound switch port must mark the packet headers. So although limited priority queuing could be implemented at the network level, it is not straightforward to apply in practice. In effect, the burden would lie on the storage array to identify and prioritize packets of incoming data flows and handle them accordingly.
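A policy-based classifier of this kind could be as simple as an ordered rule list. The sketch below is a hypothetical illustration: the rule structure, the 0 to 7 priority scale, and the addresses are assumptions, and the DSCP mapping shown (keeping the top three bits of the six-bit DSCP value) is one common default rather than the only possible mapping.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class QosRule:
    """Hypothetical administrator-defined policy; None matches anything."""
    priority: int  # 0 (lowest) to 7 (highest), an assumed scale
    initiator_ip: Optional[str] = None
    target_port_ip: Optional[str] = None
    lun: Optional[int] = None

    def matches(self, initiator_ip: str, target_port_ip: str, lun: int) -> bool:
        return (
            self.initiator_ip in (None, initiator_ip)
            and self.target_port_ip in (None, target_port_ip)
            and self.lun in (None, lun)
        )


def classify(rules: Iterable[QosRule], initiator_ip: str,
             target_port_ip: str, lun: int, default: int = 0) -> int:
    """Return the priority of the first rule that matches the request."""
    for rule in rules:
        if rule.matches(initiator_ip, target_port_ip, lun):
            return rule.priority
    return default


def dscp_to_pcp(dscp: int) -> int:
    """Map a 6-bit DSCP value to a 3-bit 802.1Q priority (top three bits)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0 to 63")
    return dscp >> 3


rules = [
    QosRule(priority=7, initiator_ip="10.0.0.5", lun=3),  # database volume
    QosRule(priority=4, target_port_ip="10.0.1.1"),       # everything else on this port
]
print(classify(rules, "10.0.0.5", "10.0.1.1", 3))
print(classify(rules, "10.0.0.9", "10.0.1.1", 3))
print(dscp_to_pcp(46))
```

Rules are evaluated in order, so the administrator places the most specific policies first and lets broader ones act as fallbacks.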
With the data flows from multiple servers accessing the same array port, the existing TCP driver can no longer handle packets in the traditional fashion and still be efficient. Five network layers exist between Ethernet and SCSI, and data packets must be copied between each layer. Copying each incoming and outgoing packet between layers requires a high degree of CPU processing that is currently done by the central processor on the storage array. Although it is a small issue when just a few servers access the same storage port, this configuration becomes a significant bottleneck when the storage array port must service data streams from multiple servers.
Addressing this issue requires a “Zero Copy” feature in the storage array architecture. A Zero Copy feature eliminates the need for the array processor to be involved in moving data from the network interface to the array’s application buffer when it is received, or vice versa when it is sent. Early users of Zero Copy technology report up to 40% reductions in wait time, up to 20% reductions in CPU overhead, and in some cases nearly double the throughput.
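True Zero Copy depends on support in the array's network hardware and operating system, so it cannot be shown in a few lines of portable code. The underlying idea of skipping intermediate copies does have a loose user-space analogy, sketched below: the receiver asks the socket to write directly into a preallocated buffer instead of creating a new object for every read. The kernel still copies data into that buffer, so this is an illustration of the principle only; the function name and buffer size are assumptions.

```python
import socket


def receive_into_buffer(sock: socket.socket, buffer: bytearray) -> int:
    """Fill a preallocated buffer from a socket without per-read allocations.

    Returns the number of bytes received, which is less than len(buffer)
    only if the sender closes the connection early.
    """
    view = memoryview(buffer)  # slicing a memoryview does not copy data
    received = 0
    while received < len(buffer):
        count = sock.recv_into(view[received:])
        if count == 0:
            break  # the peer closed the connection
        received += count
    return received


# A connected socket pair stands in for the network link.
sender, receiver = socket.socketpair()
sender.sendall(b"x" * 4096)
sender.close()

app_buffer = bytearray(4096)
print(receive_into_buffer(receiver, app_buffer))
receiver.close()
```

The design point carries over to the array: the fewer times a payload is copied between layers on its way to the application buffer, the less processor time each I/O consumes.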
Providing guaranteed native throughput will address the final barrier to implementing iSCSI disk arrays at the enterprise level. This type of high-bandwidth throughput is required to maximize the performance of tape drives. Interruptions or inconsistent data flows interfere with the ability of tape drives to perform optimally. With the ability to connect 1Gbps, 2Gbps, and forthcoming 4Gbps tape drives to IP networks via inter-switch links, iSCSI storage arrays must be able to send and receive large amounts of data with minimal or no interruption to optimize tape drive performance. Failure to maintain high data rates during backups will result in “shoe shining,” a stopping, backing-up, and going-forward action that has a negative impact on the performance of tape drives and backup times. Similarly, restore times will be negatively impacted.
To date, iSCSI has gained some traction in SMB and department environments. Winning either mind- or market share at the enterprise level will require iSCSI array vendors to break some significant barriers.
Look for iSCSI arrays to enter enterprises indirectly, much the same way that modular array vendors entered large enterprises by targeting SMBs and departmental users. EMC’s recent addition of iSCSI capabilities to its low-end and mid-tier Clariion line of arrays, joining Network Appliance, Hitachi Data Systems (HDS), IBM, and others, provides evidence that this approach will be used to put iSCSI technology on a gradual journey to the enterprise level.
However, this trek will be a slow one. The continuing per-port price drops in Fibre Channel and the availability of 2Gbps and 4Gbps speeds give Fibre Channel a performance edge that iSCSI will not be able to match until 10Gbps Ethernet becomes widely and inexpensively available.
Only when Ethernet costs come in at a substantial discount (50% or more) in comparison to Fibre Channel, and performance is not a major concern, will iSCSI get the nod.
With 10Gbps Ethernet just around the corner and a few start-ups such as NeXT Storage working on eliminating some of the issues with TCP/IP and developing enabling technologies for 10Gbps iSCSI arrays (expected by year-end), the days of iSCSI in the enterprise are drawing closer.
Ultimately, the final barrier for iSCSI to break through into the enterprise will likely be psychological rather than technical. Overcoming it will require not just financial justification but also technically reliable, independently documented benchmarks and a growing list of successful installations. For organizations that are always looking to cut costs and capitalize on existing resources, the emergence of high-performance, low-cost iSCSI arrays and 10Gbps Ethernet infrastructure will be appealing. But don’t hold your breath, as these solutions are still on the horizon. In the meantime, evaluating and deploying current iSCSI solutions in the appropriate environments will give you a reference point for future decisions.
Arun Taneja is a consulting analyst and founder of The Taneja Group (www.tanejagroup.com).