Some of the claimed advantages of IP/Ethernet storage area networks may be overstated.
BY TOM HEIL
Today, a major debate is under way in the storage, networking, and computer industries over whether Fibre Channel or Gigabit Ethernet is destined to rule the storage area network (SAN) market. Fibre Channel is seemingly the only viable option today, but the SCSI-over-IP community claims Gigabit Ethernet will soon challenge Fibre Channel, and some go so far as to predict Fibre Channel's eventual demise.
The Gigabit Ethernet SAN value proposition sounds great. You can build and manage SANs with mature, interoperable, inexpensive, commodity LAN technology. You can leverage existing infrastructure, management techniques, and IT personnel.
These are impressive claims, but can Gigabit Ethernet deliver? All new technologies go through what Gartner Group calls the "technology hype cycle" (see Figure 1). Recall the early days of Fibre Channel when it was going to do everything (e.g., SAN, LAN, MAN, peripheral I/F, and audio/video). Fibre Channel has clearly found its "plateau of productivity" in storage, but like most technologies it fell short of "one-wire-does-everything" expectations.
As pervasive as Ethernet is in LANs, significant challenges must be overcome before it is ready to take on Fibre Channel in SANs. In an attempt to sort hype from reality, this article takes a close look at these hurdles and the Gigabit Ethernet SAN value proposition.
Ethernet SANs defined
IP storage, or SCSI-over-IP, implies block-level I/O over any IP network. This is distinct from network-attached storage (NAS), which exports a file rather than block I/O access. To complete the picture, you need a transport protocol above IP and a mapping of the SCSI command set to this protocol. Several companies have developed proprietary solutions, and the Internet Engineering Task Force (IETF) is developing a standard called iSCSI, which maps the SCSI command set to the TCP/IP protocol stack.
Figure 1: All new technologies, including Fibre Channel and Ethernet SANs, go through the "hype cycle."
In practice, SCSI-over-IP has two distinct applications, referred to in Figure 2 as "SAN-to-SAN connectivity" and "Native Ethernet SANs." SAN-to-SAN connectivity implies linking individual SANs over long distances for applications such as remote mirroring, data replication, and disaster recovery. The long-distance IP connection can be any MAN or WAN technology (public or private) that exists between sites (e.g., Ethernet, T1/T3, and dense wavelength-division multiplexing). Today, linked sites typically belong to one enterprise, but in the future, an enterprise may use an IP MAN to access a storage service provider (SSP). (Today, SSPs for the most part put their storage on the customer's premises.) The relevant attribute of SAN-to-SAN connectivity is that the core SAN technology within a given site is Fibre Channel, not Ethernet. The SAN-to-MAN/WAN bridge occurs at the edge of the core SAN.
SAN-to-SAN connectivity is an essential, critical business function, and there is little argument that IP networks will play an important role here. (Although Fibre Channel is steadily evolving, it's not clear when or even if it will become a "long-haul" technology.) Proprietary solutions from several vendors are already available, and standardized "Fibre Channel tunnel" solutions that transport Fibre Channel frames across TCP/IP distance connections should be on the market within the year. iSCSI distance solutions should follow, delivering a more flexible any-to-any "mesh" model, versus tunnels, which only support point-to-point connections.
Although SAN-to-SAN connectivity is a critical function, it is not the primary focus of this article, as there is little controversy over the important role IP is expected to play in long-distance applications. Rather, the focus here is on native Ethernet SANs, a much more controversial topic.
Figure 2: SCSI-over-IP, or IP storage, can be implemented in a SAN-WAN/MAN-SAN configuration or as a native Ethernet SAN.
In a native Ethernet SAN, Ethernet rather than Fibre Channel is the core SAN technology, consisting of Gigabit Ethernet host adapters, switches, and (block-level) storage systems (e.g., JBOD, RAID, and tape). Traditional LAN host adapters are often called network interface cards (NICs), so this article calls SAN-optimized Ethernet adapters "Storage NICs" or "sNICs." In its simplest form, a sNIC is just a LAN NIC with a SCSI "redirector" host driver that intercepts block I/O requests from the file system and diverts them to a drive on the other side of a TCP/IP Ethernet fabric.
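The redirector concept can be sketched in a few lines. The sketch below is purely illustrative: the 13-byte request header (opcode, logical block address, block count) is an invented wire format, not the actual iSCSI PDU layout, and a real redirector lives in a kernel driver rather than user-space Python. It only shows the essential idea of forwarding a block read over a TCP connection and collecting the returned data.

```python
import socket
import struct
import threading

READ_10 = 0x28  # SCSI READ(10) opcode, reused here as a request tag

def redirect_read(sock, lba, block_count, block_size=512):
    """Forward a block-read request over TCP and return the data."""
    # Invented header: opcode (1 byte), LBA (8 bytes), count (4 bytes).
    sock.sendall(struct.pack(">BQI", READ_10, lba, block_count))
    want = block_count * block_size
    chunks = []
    while want:
        chunk = sock.recv(want)
        if not chunk:
            raise IOError("target closed the connection")
        chunks.append(chunk)
        want -= len(chunk)
    return b"".join(chunks)

# Loopback demo: a stub "target" that answers one read request.
initiator, target = socket.socketpair()

def serve():
    op, lba, count = struct.unpack(">BQI", target.recv(13))
    target.sendall(bytes(count * 512))  # zero-filled blocks

threading.Thread(target=serve).start()
data = redirect_read(initiator, lba=100, block_count=2)
```

The point of the sketch is that nothing above the file system changes: the host still issues ordinary block I/O, and the redirector quietly moves it across the Ethernet fabric.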
Several such products have already been announced, including IBM's TotalStorage IP Storage 220i and 3ware's Network Storage Unit. The general consensus is that these early-market, software-only solutions do not have the performance or scalability needed to compete with Fibre Channel in the data center. The primary culprit is the TCP protocol, which, when executed in software, not only adds significant latency but also consumes tremendous host CPU horsepower, especially at gigabit-plus speeds. High CPU utilization limits scalability and makes the CPU unavailable for other applications.
The performance limitations of these initial software solutions are reflected in the way they are being positioned and priced. IBM specifically states that its product is not for data centers but for "sub-enterprise" workgroups and departments that need shared access to block-level storage over an existing LAN. (Fibre Channel is considered too costly and complex to play here.)
Some believe that if native Ethernet SANs can establish a foothold in these lower-end, potentially higher-volume segments, then it's only a matter of time before technology maturation and economies-of-scale enable them to grow up and compete with Fibre Channel in the enterprise. However, others question whether such a "sub-enterprise" SAN market even exists. They argue that the sub-enterprise is better served by the NAS architecture, which not only leverages the existing LAN but also facilitates data sharing and collaboration. If the latter view prevails, these early-market Ethernet SAN products may flounder and potentially sour some players on making the necessary investments to grow the technology into a true data-center contender.
Is TCP up to the challenge?
As discussed, host-based software TCP is a real performance challenge. There is a general consensus that hardware TCP (in which TCP processing is offloaded to adapter hardware) is needed in gigabit-plus backbone applications, be it a LAN NIC or SAN sNIC. (At the desktop, software TCP and 10/100 Ethernet will dominate for many years to come.)
However, there are some who argue that even hardware TCP will not be competitive against Fibre Channel in the data center. TCP, they claim, was designed for reliable communication across unreliable, congested networks, a task it does so well that it makes the Internet possible. However, this focus carries with it overhead unnecessary in robust, secure data-center storage networks. Even those who believe TCP is the right choice generally acknowledge it is a sub-optimal data-center protocol. They argue that other benefits of Ethernet SANs will more than compensate for lower performance, and that in the long run, Ethernet will outperform Fibre Channel due to a more aggressive road map to higher wire rates. (10Gbps Ethernet at 50% efficiency is still faster than 2Gbps Fibre Channel at 90% efficiency.) Note that this controversy only applies to TCP in the data center. There is agreement that TCP is appropriate for long-distance applications.
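The arithmetic behind that parenthetical is straightforward: effective throughput is wire rate times protocol efficiency, and the efficiency figures here are the article's illustrative numbers, not measured results.

```python
def effective_gbps(wire_rate_gbps, efficiency):
    """Effective throughput = raw wire rate x protocol efficiency."""
    return wire_rate_gbps * efficiency

# The article's illustrative comparison:
ethernet_10g = effective_gbps(10, 0.50)     # 10Gbps at 50% -> 5.0 Gbps
fibre_channel_2g = effective_gbps(2, 0.90)  # 2Gbps at 90% -> 1.8 Gbps
assert ethernet_10g > fibre_channel_2g
```

Even a protocol that wastes half its wire rate wins if that wire rate is five times higher, which is the essence of the "aggressive road map" argument.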
TCP also carries with it significant complexity, and it's not yet clear what hardware TCP will do to NIC/sNIC cost structure at gigabit-plus speeds. Can you hide the cost of incremental silicon gates? Probably. Can you hide the cost of incremental memory? Probably not. However, it is clear that hardware TCP adapters will not be commodity desktop LAN NICs.
It will be some time before the industry gets a complete picture of the true performance and cost of hardware TCP at gigabit-plus speeds. Today, the iSCSI standards momentum clearly favors TCP for both data-center and long-distance applications. Only time will tell if TCP is the best choice in the data center. If there is no performance gap, or any gap that does exist is trivial, this bodes well for Ethernet SAN adoption. A significant gap, however, will favor Fibre Channel and may force the SCSI-over-IP community to consider TCP alternatives in the data center. Nishan Systems, for example, proposes that UDP rather than TCP be used in the data center, claiming performance equal to Fibre Channel without the complexity of hardware TCP.
Hardware TCP may eventually become a standard feature of backbone LAN NICs, but so far Microsoft has been reluctant to relax its grip on the TCP/IP layers and define a driver model that supports hardware TCP. Until this happens, the sNIC market may have to shoulder the cost of hardware TCP on its own. (A sNIC doesn't have the same Microsoft dependency because TCP is buried inside the iSCSI driver.) Long term, it remains unclear whether NICs and sNICs will remain distinct pieces of hardware or whether they will converge into a single adapter that can be used for either storage or communications.
Will SANs go virtual?
Some in the SCSI-over-IP community argue that once LAN and SAN have converged on a common technology, you no longer need a separate physical network for storage traffic. You can do away with SANs in favor of a common fabric shared among public, server-to-server, and server-to-storage traffic. In this scenario, the SAN becomes a virtual rather than physical construct that can be implemented over the LAN, perhaps even over the Internet.
Some small, sub-enterprise operations may be able to work this way, but the data center cannot. Physically distinct SANs play a critical role and will exist regardless of whether the underlying technologies converge. The primary drivers of a separate SAN are performance and data security. A SAN offloads storage traffic from the public network and enables LAN-less and serverless backups. Just as important, in a SAN the front-end servers act as a firewall between the public network and corporate data. (Servers are expendable, but data is not.) SANs exploit the characteristics of a physically secure, trusted back-end network to conduct high-speed operations involving multiple servers and storage elements without the need for time-consuming functions like authentication and encryption every step of the way. These functions are handled at the edge of the SAN in a front-end server. For example, incoming encrypted data is decrypted upon receipt, before it is processed and/or stored.
A common misperception about SSPs is that they offer storage across the Internet. In actuality, (block-level) SSPs either locate their storage on customer premises or locate it "near your cage" in a collocation facility. In either case, the SSP's storage is a simple extension of the customer's Fibre Channel SAN. SSPs look to iSCSI to eventually overcome the "same building" constraint and offer service over emerging high-speed IP MANs, but even then the expectation is that the SSPs' storage is still an integral part of the customer's private SAN backbone network, via secure, dedicated (physical or virtual) channels that offer predictable performance. Few companies are likely to put the congested, unpredictable Internet between them and their data.
In addition, before you can consider doing block I/O over a public network, you need security features unnecessary in a back-end network. Since file systems today assume an intimate, trusted relationship with the block storage they manage, security features would have to be implemented in the iSCSI driver or sNIC. The iSCSI initiative is defining a security model, but it's not yet clear if it will find broad application.
Leveraging existing infrastructure
Can you really use existing infrastructure when you build an Ethernet SAN? The early-market software solutions targeting the sub-enterprise may be able to sit on an existing LAN, but this is not a data-center architecture. When you build a data-center SAN, you're building a physically distinct back-end network (new adapters, switches, storage, and cabling), regardless of whether it's Gigabit Ethernet or Fibre Channel. The claim that you can use existing wiring, for example, is misleading. It's true that 1000BaseT Ethernet allows LAN wiring running through your walls to be upgraded to the Gigabit Ethernet data rate, but this is irrelevant in a SAN, which exists in the data center. Leverage then comes not so much in using LAN infrastructure already in place, but in being able to draw on a common pool of components for both LAN and SAN. How common, then, is this pool of components?
It has been argued that, unlike Fibre Channel, which requires new technology top-to-bottom, with Gigabit Ethernet most everything you need is available and cheaper via volume economics. Let's test this for each element of a SAN.
Host adapters and drivers: Data center sNICs will bear little resemblance to today's Gigabit Ethernet LAN adapters. At a minimum they will feature hardware TCP. They may include an I/O processor capable of managing complex block I/O tasks start-to-finish without host CPU intervention (a standard attribute of modern Fibre Channel and SCSI adapters). Finally, a SCSI driver is needed to link the adapter to the host file system. (LAN drivers are irrelevant.) In other words, new and potentially complex adapters and drivers are needed. However, it's not clear whether long-term Ethernet sNICs and NICs will become the same piece of hardware.
Storage subsystems (RAID, tape, router): Before the Ethernet SAN market achieves critical mass, major storage system vendors will have to offer products with Ethernet front-ends. You will also need a variety of bridge products that connect the Ethernet SAN to existing Fibre Channel and SCSI devices.
Switches: It's generally believed that Gigabit Ethernet switches can be used in SANs without modification. If this proves true, there is significant leverage to be gained. However, some believe Gigabit Ethernet switches may need some modification to perform acceptably in SAN applications, but even then it's expected that the same higher-performing switches will address both SAN and LAN backbone markets. If it turns out that IP SAN switches evolve into a different class than standard IP switches, leverage is diminished.
Wiring: At the gigabit level, Ethernet and Fibre Channel have converged on a common physical (PHY) layer for fiber and coaxial copper. It's expected that a common PHY will also be the approach for 10-gigabit speeds. (InfiniBand joins in on this 10-gigabit PHY convergence as well.)
For fiber-optic networks, Gigabit Ethernet brings nothing that isn't already there. The one important exception is 1000BaseT, or Gigabit Ethernet over unshielded twisted-pair (UTP) cable. Fibre Channel has no equivalent.
1000BaseT makes obvious sense when you want to upgrade existing premises wiring to gigabit speeds. But will UTP find a role in data-center SANs? Today, Fibre Channel SANs are almost exclusively fiber-optic. If this trend continues with Gigabit Ethernet SANs, then 1000BaseT is irrelevant. If, on the other hand, Gigabit Ethernet SANs rely significantly on 1000BaseT (over Category 5E or Category 6 UTP) this gives Gigabit Ethernet a cost edge that Fibre Channel can't match. Not only is wiring cheaper, but 1000BaseT per-port switch costs are projected to decline faster than fiber-optic per-port switch costs due to unit volumes associated with other applications. One of the market variables likely to influence fiber-optic versus UTP is the road map to 10 Gigabit and beyond. Fiber-optic is more "futureproof" given the "Herculean" modulation schemes required to get high data rates onto UTP. The DSP horsepower and complexity of today's 1000BaseT transceivers are testament to the difficulties down this path. (1000BaseT transceiver costs may significantly moderate the pace at which copper switches and adapters can come down in price.)
SAN management software: SAN management is a complex, multi-faceted topic. For simplicity, this article divides SAN management into two categories: storage/data management and network management.
Storage/data management refers to a host of topics such as backup/restore, archival, data replication, database administration, capacity expansion, storage virtualization, file/volume management, and RAID. The important attribute of storage/data management for this discussion is that although it may care about storage architecture (DAS vs. NAS vs. SAN, for example), it has little to do with low-level details like SAN transport technology. In this context, Fibre Channel vs. Ethernet is irrelevant. From an Ethernet technology leverage standpoint, there is nothing uniquely Ethernet to leverage. On the contrary, storage management applications that are SAN-aware today assume Fibre Channel and will have to be ported to Ethernet as technologies like iSCSI ripen. (Such ports are not expected to be major undertakings.) Ultimately, storage/data management software vendors will likely take an "interface agnostic" market position.
Network management: In a SAN context, network management refers to managing the actual SAN fabric, dealing with issues like switch/router configuration, device discovery, zoning, paths, and detecting and routing around bottlenecks and failed links. There is significant-though not total-overlap between the problems that must be solved in LAN vs. SAN fabrics. It's clear the networking industry knows how to manage very large Ethernet fabrics, and it's likely this knowledge and software can be leveraged into SANs.
In summary, a native Ethernet SAN potentially leverages LAN fabric elements (switches and routers) and LAN fabric management software and knowledge. Everything else will have to be new.
Is native Ethernet SAN really cheaper?
One of the biggest complaints about Fibre Channel is its high cost. Why use an expensive new technology when Ethernet, by virtue of high-volume economies-of-scale, is so much cheaper? Obviously, IP networks have a significant cost advantage over dedicated facilities in long-distance inter-SAN links. As discussed, IP will clearly play a role here. But within a data center, is a native Ethernet SAN cheaper than a Fibre Channel SAN? Is the delta compelling?
This discussion assumes data-center caliber Ethernet, not the sub-enterprise software solutions discussed earlier. Also, this discussion looks at cost structure, not price. Fibre Channel pricing today reflects the fact that it is the only game in town for SANs. When competitive threats like Ethernet emerge, Fibre Channel vendors will respond with more aggressive pricing.
Let's divide SAN hardware into three categories: storage, host adapters, and fabric (switches and wiring). The first thing to point out is that in a SAN, the storage (e.g., RAID systems and tape) is by far the largest cost component, at 60% to 70%. Ethernet can't change this, and there's no reason to believe Ethernet storage will be any cheaper than Fibre Channel storage. The front-end interface plays a trivial role in storage system cost structure, and for storage there is no Ethernet economy-of-scale advantage.
As already discussed, data-center caliber sNICs bear little resemblance to today's Gigabit Ethernet LAN adapters. It's not clear there is any cost advantage, and in fact, there may actually be significant cost disadvantage. A data-center sNIC is new and complex and will not benefit from desktop adapter economies-of-scale. Unless and until sNICs and backbone LAN NICs converge (and it's not clear they will), there is no economy-of-scale advantage at all. For this discussion, adapter cost parity with Fibre Channel is assumed.
So if there's significant savings to be gained, it has to be in the fabric-more precisely, the ability to leverage backbone LAN switches in a SAN. When you examine price-per-port projections of Gigabit Ethernet vs. Fibre Channel over time, the thing that stands out is that there's some but not much difference comparing fiber-optic to fiber-optic. The only substantial difference is comparing Fibre Channel fiber-optic to Gigabit Ethernet copper (1000BaseT). This is where cheaper wiring and volume economies-of-scale kick in. Of course, the advantage of "all fiber-optic" SAN wiring is the flexibility to go with either Fibre Channel or Ethernet, so SAN implementers who use UTP restrict their options. Also, UTP viability beyond 1Gbps is uncertain.
A few back-of-envelope calculations using the above assumptions result in perhaps a 5% to 10% savings for fiber-optic Ethernet SANs and perhaps 15% to 20% savings for UTP copper Ethernet SANs. These savings aren't trivial but aren't really exciting either, especially in a data center where hardware costs pale in comparison to the cost of service interruption. If you look at a SAN's total cost of ownership (TCO) as 25% equipment and 75% management, then any Ethernet cost advantage at the hardware level is almost lost in the noise, perhaps 5% at best.
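The weighting in that estimate can be made explicit. Under the article's stated assumptions (equipment is 25% of TCO, management is 75%, and for now the Ethernet savings are confined to hardware), the math is a simple weighted sum:

```python
def tco_savings(hw_savings, mgmt_savings=0.0,
                hw_share=0.25, mgmt_share=0.75):
    """TCO delta as a weighted sum of hardware and management savings."""
    return hw_share * hw_savings + mgmt_share * mgmt_savings

# Hardware-only savings: even the best-case 20% (UTP copper fabric)
# moves total cost of ownership by only 25% x 20% = 5%.
best_case = tco_savings(0.20)
```

This is why the article calls the hardware advantage "almost lost in the noise": any fabric-level saving gets multiplied by the small equipment share of TCO.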
Cost then doesn't appear to be all that compelling a reason to choose Ethernet over Fibre Channel. If there is a reason, it has to be somewhere else.
Is a native Ethernet SAN easier/cheaper to manage?
This question is critically important but difficult to quantify. The SCSI-over-IP community argues (rightfully so) that managing today's data explosion in an environment of flat/declining IT budgets is a major corporate challenge. The question is: "Does moving to Ethernet help? How much?"
Once again, it's helpful to break the SAN management problem into storage/data management and network management. The storage/data management problem exists independent of SAN network technology. The software and IT staff needed to manage things like RAID configuration, data backup and recovery, and database administration do not go away with an Ethernet SAN.
To find savings (software, human resources, and training) you have to look in the realm of SAN network management (e.g., switch configuration, routing, and VPNs). It's not that you can't manage Fibre Channel switches via your Ethernet LAN and SNMP/IP management console. Fibre Channel switches typically provide and are managed via a dedicated Ethernet port. (However, this "out-of-band" management scheme implies that a separate Ethernet subnet is required to manage Fibre Channel switches.) Rather, it's that the protocols and "management semantics" of Fibre Channel and IP networks are quite different. Your IT staff already knows IP protocols and management strategies. To incorporate Fibre Channel you will have to broaden your IT expertise and resources to encompass Fibre Channel protocols, management strategies, and products.
Although difficult to quantify, this article assumes a "best guess" value based on discussion with industry participants on both sides of the issue. (These numbers are open to debate.) SAN network management (the part Ethernet/IP can help) probably constitutes somewhere between 15% (small SAN) and 30% (large, multi-site SAN) of the total SAN management problem. If you assume in the Ethernet SAN case this cost is hidden (just work your LAN IT staff harder) then this 15% to 30% can be thought of as the incremental cost (software and personnel) to manage a Fibre Channel SAN over an Ethernet SAN.
If you sum the previously calculated hardware savings and the management savings and apply the 25% equipment/75% management TCO assumption, the result is a TCO savings potential somewhere in the 10% to 30% range: a significant savings, but perhaps less than some might have gleaned from early-market hype.
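Working the two ends of that range explicitly, using the article's own figures (5% to 20% hardware savings, 15% to 30% management savings, weighted 25%/75%):

```python
HW_SHARE, MGMT_SHARE = 0.25, 0.75  # equipment vs. management share of TCO

def combined_savings(hw, mgmt):
    """Total TCO savings from hardware and management components."""
    return HW_SHARE * hw + MGMT_SHARE * mgmt

low = combined_savings(0.05, 0.15)   # small SAN, fiber-optic: ~12.5%
high = combined_savings(0.20, 0.30)  # large SAN, UTP copper: ~27.5%
```

Both endpoints land inside the article's 10% to 30% range, and the calculation also shows that most of the potential savings come from the management side, not the hardware.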
Is an Ethernet SAN more interoperable?
Another complaint about Fibre Channel is lack of interoperability. It's true that Fibre Channel still struggles with multi-vendor interoperability, particularly inter-switch. (Fibre Channel inter-switch interoperability will likely be resolved before Ethernet SANs emerge in the data center.) An interoperability "shake out" phase is inherent to any new technology. But does the claim that Ethernet SANs will have fewer interoperability issues hold up?
The simple answer is "no." Granted, elements of Ethernet LAN technology are leveraged into Ethernet SANs, and it is probably safe to assume that inter-switch interoperability won't be a big issue. But at the total solution level, an Ethernet SAN is a brand new entity and can be expected to have its fair share of early-market interoperability challenges. None of the "pre-iSCSI" solutions on the market now (native Ethernet SAN or SAN-to-SAN connectivity) will work with each other. They all use proprietary protocols. Once a standard protocol emerges, it will still take some time for the market to produce stable, standards-compliant sNICs, storage systems, and SAN applications from multiple vendors and to ensure these all interoperate.
It will be interesting to see if new issues creep up in the transition from host-based to hardware TCP/IP. Host-based stacks may be slow, but they have been around for years and are quite stable. The expertise behind them resides at the operating-system vendors. Several new (hardware or embedded firmware) TCP/IP stacks are about to enter the market. Will these new entrants be able to get it right the first time out?
It is quite possible that Ethernet's interoperability advantages have been seriously overstated.
How strong an incumbent will Fibre Channel be by the time Ethernet SANs achieve critical mass?
One of the biggest challenges Ethernet SANs face is time. Ethernet SANs aren't competing with today's Fibre Channel; they are competing with the Fibre Channel of three years out, which will be more mature, competitive, and pervasive. Figure 3 depicts the rate of SAN adoption over the next several years. If this holds up, more than 55% of storage will be SAN-attached by 2003. Since Ethernet SAN technology isn't ready, Fibre Channel will be the near exclusive beneficiary of this ramp.
Figure 3: SAN-attached storage is expected to account for more than 55% of the overall market by 2003.
Although there are benefits to be realized once Ethernet SAN technology is ready, they may be lost on anyone who has already deployed Fibre Channel. To end users who have already installed and mastered Fibre Channel, native Ethernet SANs represent an increase in cost and complexity. On top of existing LAN and SAN infrastructures, these users would have to incrementally embrace new Ethernet SAN adapters and storage systems and be faced with the challenge of building, validating, and managing a hybrid SAN. Apart from a "forklift" upgrade, these users' matrix of technologies, products, vendors, and interoperability validation would likely get bigger, not smaller.
This says that end users who have already embraced Fibre Channel should be inclined to continue buying Fibre Channel. The larger this customer base is, the greater the probability Fibre Channel will enjoy a long, useful life despite available alternatives. Conversely, the sooner Ethernet SAN technology is ready for prime time, the greater its chance to achieve critical mass, since it may depend on finding footholds in places Fibre Channel has yet to penetrate.
Nishan Systems' strategy is noteworthy in that it enables Ethernet SANs to be built using Fibre Channel adapters and storage. Strategies like this may make it easier for Fibre Channel sites to gradually incorporate Ethernet SAN technology into their infrastructure, where it may co-exist with Fibre Channel indefinitely.
It's clear that IP networks will play a critical role in SAN-to-SAN distance applications. But what about native Ethernet SANs? Are they a fantasy? Obviously not. Sub-enterprise products are emerging now, and there is significant industry investment attempting to grow the technology into a data-center contender.
But they are no immediate cure-all either. Many early-market claims of improved cost, manageability, leverage, and interoperability don't hold up-or at best are overstated.
Ethernet SANs are not a slam dunk. It will take time and significant investment before they are ready to challenge Fibre Channel. Until then, questions will linger regarding performance, complexity, cost, and what some might deem a marginal value proposition, especially where Fibre Channel already exists.
Meanwhile, if Fibre Channel continues its current exponential growth unabated, by the time Ethernet SANs are ready, Fibre Channel may prove surprisingly difficult to unseat.
Entrenched incumbents are notoriously difficult to dislodge. Even if native Ethernet SANs catch on, it's hard to imagine Fibre Channel won't continue to be a dominant SAN technology through the remainder of this decade.
Tom Heil is a senior systems architect at LSI Logic (www.lsilogic.com) in Milpitas, CA.
This article is adapted from a presentation to be given at the upcoming Network Storage 2001 conference (June 11-14, Monterey, CA). For more information, visit www.periconcepts.com.