Fibre Channel Enters the Mainstream
Here's how early adopters are taking advantage of the speed, capacity, and configuration flexibility of the emerging interface.
By John Haystead
Fibre Channel continues to roll along in new and more widespread applications. Although speed is still its greatest attraction, perhaps the best measure of the technology's success is the extent to which developers and users are looking for new and better ways to put it to work. While some users are just getting their feet wet with Fibre Channel drives, arrays, hubs, and switches, others are already pushing the envelope. Fibre-Channel-based storage area networks (SANs), for example, have dawned. Once considered to be a strictly storage-related technology, these SANs may actually have much wider implications for the overall design and implementation of networks.
Wentworth Printing Co. (West Columbia, SC) is a commercial printing firm doing work for advertising agencies and other corporate accounts. The company's production department uses 14 Macintosh workstations for electronic pre-press work. Initially, all of the workstations worked from local external drives, linked via a 10BaseT Ethernet network. However, extensive downtime waiting for files to transfer from one machine to another was costing them about 14 hours a day, says Jim Doar, Wentworth president.
"Clearly we needed a better solution. One reason to move on was simply the cost of buying external drives for every workstation," says Tom McGuire, Wentworth systems administrator. McGuire was fixed on Fibre Channel from the outset. "With the kind of job turnaround times we have, the pressure is for speed, and I wasn't going to work with anything over a 10- or 100BaseT network again."
Wentworth opted for Augment Systems' (Westford, MA) AFX 410E Storage Server, which holds up to 200GB of RAID-3 storage and connects up to 25 Mac or Windows NT clients. Wentworth currently has 45GB partitioned into three 15GB drives and expects to add capacity. "It will be a very straightforward process of just plugging in additional drives," says McGuire, who also plans to implement a Fibre Channel switch to tie in a separate graphics system network.
McGuire says he hasn't seen anything else that can move files as fast as Fibre Channel; in fact, he has spec'd transfer rates 300% faster than 10BaseT. As a real-world example, Doar points to the need to immediately download three days' worth of data from a failed drive. "Normally this would have required roughly eight hours to transfer, but the Augment system was able to do it in 30 minutes over Fibre Channel."
McGuire says he hasn't experienced any of the network lock-up problems associated with some FC-AL systems when a workstation goes down. "At most, an operator may experience a two-second delay on their next access, and we can reboot the down machine without impacting the rest of the users."
Wentworth could have bought something else for half the price, but "it just wouldn't have been as fast." The Augment Fibre Channel system has resulted in major productivity savings. According to Doar, "Total time lost to file transfers now adds up to at most an hour a day." The installation and conversion were completed over a weekend, with no software or hardware problems.
Drilling for Data
Spirit Energy 76, in Sugar Land, TX, is an oil and gas exploration/production firm. A business unit of Unocal Corp., Spirit operates 300 offshore drilling platforms and approximately 2,700 active wells. To manage and track its widespread operations, the company implemented Fibre Channel storage across a number of applications, including financial management and seismic data interpretation and processing.
Spirit Energy currently has seven Clariion FC5500 FC-AL RAID arrays with TriWay storage processors. Each FC5500 is in a tower together with multiple (typically 4 or 8) FC5000 disk array chassis. Each FC5500 rack holds up to ten 9GB Fibre Channel drives and can be attached to as many as 11 additional FC5000 disk arrays to potentially provide more than a terabyte of total capacity. The storage towers are linked to Sun Enterprise 10000, 4000, and Ultra 6000 UNIX servers. Spirit's total on-line storage capacity has already topped 4TB and is growing rapidly.
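The "more than a terabyte" figure can be checked with simple arithmetic. The sketch below is illustrative only: it assumes each FC5000 expansion chassis holds the same ten drives as the FC5500 (the article states the drive count only for the FC5500), it counts raw capacity before any RAID overhead, and the function and constant names are hypothetical.

```python
# Back-of-the-envelope raw capacity for one storage tower.
# Drive size and chassis limits come from the article; the assumption that
# every FC5000 chassis also holds ten drives is ours, not the article's.

DRIVE_GB = 9             # 9GB Fibre Channel drives
DRIVES_PER_CHASSIS = 10  # up to ten drives per chassis (assumed for FC5000 too)
MAX_EXPANSION = 11       # up to 11 additional FC5000 chassis per FC5500


def tower_capacity_gb(expansion_chassis, drive_gb=DRIVE_GB):
    """Raw capacity in GB of one FC5500 plus the given number of FC5000 chassis."""
    if not 0 <= expansion_chassis <= MAX_EXPANSION:
        raise ValueError("expansion chassis count out of range")
    chassis = 1 + expansion_chassis
    return chassis * DRIVES_PER_CHASSIS * drive_gb


if __name__ == "__main__":
    for n in (4, 8, MAX_EXPANSION):
        print(n, "expansion chassis:", tower_capacity_gb(n), "GB raw")
```

Under these assumptions a fully expanded tower comes to 12 chassis of ten 9GB drives, or 1,080GB raw, which is consistent with the article's claim.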
Clariion's FC5000 arrays are based on the company's Multidimensional Storage Architecture (MSA), an FC-AL framework for enterprise network storage. This allows system administrators to manage the entire storage pool from a single NT workstation using Clariion's Navisphere software tools. The entire storage architecture was configured by Andataco Inc. (San Diego, CA).
Although in effect a point-to-point configuration, two Gadzoox Networks (San Jose, CA) 9-port hubs are inserted between each tower and the host bus adapters. Clariion supplies the hubs; Jaycor (San Diego, CA), the S-bus adapters; and Emulex (Costa Mesa, CA), the LightPulse PCI Fibre Channel adapters.
Although the dual 9-port hubs currently provide redundancy, most of the benefits of the hub configuration will be realized in the future when they will be used to support clustering configurations, says Bill Miller, Andataco strategic accounts manager.
The FC-AL system was first beta tested by Spirit's seismic-processing group, whose database is now 100% converted to FC-AL. According to Steve Clark, UNIX system coordinator, they have achieved sustained access rates of over 21MBps (read) and access times under 13ms from a single logical unit.
Although the seismic data interpretation group was not the first to convert to Fibre Channel, it is now the largest user with over 3 of the 4TB of installed FC-AL storage. Spirit's Oracle financial databases are being converted to FC-AL as additional capacity is needed. So far, about 15% of the database and warehouse data has been converted to Fibre Channel.
"Fibre Channel may not be where you need to start, but for any shop running into the same constraints that we had, Fibre Channel provides an answer, and in fact may pay for itself in the end by extending the useful life of servers that would not be able to handle the capacity growth with SCSI attachment."
According to Clark, Spirit did not encounter any significant problems while installing Fibre Channel. "We were ahead of the curve in implementing the Clariion system, which resulted in more hardware and firmware releases than we would have liked, but nothing that we wouldn't have expected from a new technology, and none that caused us any cost in time or capital."
In particular, Clark is impressed with Clariion's NT-based Navisphere storage-management utility, pointing to its ease of use and broad configuration and performance-reporting capabilities. He does note, however, that the tool could benefit from a higher-level reporting capability (specifically, controller versus disk).
Although Spirit has not yet benchmarked throughput performance, Clark says the company has seen calculated bus rates in excess of 50MBps. "We're not yet sure where the bottleneck is, but our best guess is that it's at the HBA level, and not a restriction of Fibre Channel." But "for our applications and the amount of active data we have," adds Clark, "disk service times and attachable capacity are currently much more critical than peak bus throughput."
According to Clark, faster data access, increased capacity, and simplified cabling and connections all contributed to the company's decision to move to Fibre Channel. Improved bus-error handling also proved to be an advantage. "Fibre Channel is immensely more forgiving than SCSI," says Clark, "and the easier manageability of the fiber cable alone was worth the conversion effort." Clark points to the benefits achieved over earlier configurations of 150GB to 200GB SCSI arrays, each with its own host controller. "Now we have fully populated Fibre Channel towers with close to 700GB serviced by two fibre connections, and when we move to the 18GB Fibre Channel drives, this will be well over 1.3TB."
Spirit is looking ahead to a full SAN implementation, particularly for its seismic applications. "It makes sense when you consider the amount of data that users want to keep on-line versus the amount of data truly accessed by the same users. We're not just looking at disk storage, however; our plans extend to sequential access devices."
Fibre in Fashion
Burlington Coat Factory, in Burlington, NJ, maintains all of its merchandising data and its financial databases on Fibre Channel storage. The company processes as many as 200,000 invoices monthly, and its databases total about 1.2TB--with 3x mirroring, closer to 4TB.
Access speed and throughput requirements first led Burlington to Fibre Channel, though physical advantages (e.g., longer cable distances) were also a consideration. According to Michael Prince, Burlington CIO, "Fast data access benefits everyone from accounts payable to distribution, but our most critical requirement was for rapid analysis of merchandising data."
Burlington has three Sequent (Beaverton, OR) NUMA-Q 2000 UNIX servers running Oracle databases. The servers connect multiple Pentium Pro quad SMP systems via a shared memory interconnect. Each server hosts two Brocade 16-port Silkworm Fibre Channel switches which, in combination with Sequent's DYNIX/ptx operating system, support multipath I/O. Sequent's current implementation provides redundant dual-fabric resource domains, with each domain providing Fibre Channel attachment to up to 8TB of storage. Each 16-port switch operates at less than two-microsecond latency per frame and a 640MBps transfer rate.
NUMA-Q systems will eventually support up to four dual-fabric resource domains with eight Brocade switches and a total of 32TB of multi-path accessible storage. In total, the eight switches will provide up to 5.1GBps of I/O bandwidth. The switches can also be cascaded to attach hundreds of terabytes of storage to a single machine or cluster.
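The scaling figures quoted for the NUMA-Q configuration follow directly from the per-switch and per-domain numbers. The sketch below is a rough check, assuming two switches per dual-fabric resource domain (implied by eight switches across four domains) and decimal units (1GBps = 1,000MBps); the names are illustrative rather than anything from Sequent or Brocade.

```python
# Aggregate switch bandwidth and attachable storage for a NUMA-Q system.
# Per-switch rate and per-domain storage are the article's figures;
# two switches per domain is inferred from "four domains, eight switches".

SWITCH_MBPS = 640        # quoted transfer rate per 16-port switch
SWITCHES_PER_DOMAIN = 2  # dual-fabric resource domain (inferred)
TB_PER_DOMAIN = 8        # storage attachable per domain


def numaq_totals(domains):
    """Return switch count, aggregate bandwidth (GBps), and storage (TB)."""
    switches = domains * SWITCHES_PER_DOMAIN
    return {
        "switches": switches,
        "bandwidth_gbytes_per_s": switches * SWITCH_MBPS / 1000,
        "storage_tb": domains * TB_PER_DOMAIN,
    }


if __name__ == "__main__":
    print(numaq_totals(1))  # one domain, as in the current implementation
    print(numaq_totals(4))  # the planned maximum
```

For four domains this gives eight switches, 32TB, and 5.12GBps, matching the "up to 5.1GBps" quoted above.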
Each server is currently connected to its own assigned disk storage via dedicated fiber links. So far, Burlington "hasn't had a great deal of success with multi-pathing," acknowledges Prince. "We've configured it a couple of times, but because of stability problems, we've backed away from it for now." Down the road, Burlington plans to configure all of the storage as a single shared fabric. "This capability will become more significant as we start using shared file systems and/or if we decide to cluster our servers."
Burlington uses Sequent's P-Bay SCSI JBOD storage. Each NUMA quad has two PCI-to-Fibre Channel host adapters with one adapter interfaced to each of the two Silkworm switches for redundancy. Outside each Silkworm port, a Sequent Fibre-Channel-to-SCSI bridge fans out to 8 Fast/Wide SCSI ports. Because of overall SCSI performance limitations, the system was configured so that there are no more than six drives on each of the SCSI channels. All six drives are visible to both Fibre Channel switches. Sequent will support Fibre Channel drives later this year, which will attach to the Brocade switch via FC-AL ports.
The Sequent servers come with Veritas Software's (Mountain View, CA) Volume Manager software for disk management and mirroring functions. According to Prince, "The volume manager is used to oversee the database and to fine-tune the disk farm." The power of the switches couldn't be maximized without this middleware.
"Volume Manager enables data mirroring or striping across disks and enables users to group disks and dynamically move data volumes without shutting down the system and without the need for an additional intelligent disk subsystem," says Prince.
Burlington also plans to implement additional Fibre Channel features such as cascading switches and Fibre Channel disks as they become available. "But it will be a strategy for new disk subsystems, as opposed to swapping out our current hardware," comments Prince.
Prince recalls that the overall conversion to Fibre Channel was "more or less" transparent to the existing hardware and software. Prince's recommendation to anyone considering a move to Fibre Channel: "If you need the throughput, do it."
IBM Global Services (IGS) stores and manages mission critical data for a number of global companies such as Merrill Lynch, General Electric, and Sears. Located at Research Triangle Park, NC, IGS moves hundreds of terabytes of data on a daily basis from remote buildings within the RTP site to a large primary tape backup center two miles away.
To accomplish this task, IGS uses Ancor Communications' (Minnetonka, MN) 16-port GigWorks MKII Fibre Channel switches linked to IBM SP/2 and RS/6000 computers. Ron Howell, IGS network architect, says he's "achieving 94% of Fibre Channel's theoretical speed of 100MBps per loop." This compares to the 60% to 70% efficiency Howell says he has seen with ATM-based systems. Howell likes the switches' auto-sensing feature and support of Class-I Fibre Channel service.
Data is carried via single-mode fiber to the remote storage sites, with each path consisting of four fibers for a total of 4Gbps per path. During the backup process, different servers take turns on the network. Disk storage is primarily IBM's SSA and SCSI RAID-5 arrays, although IGS is migrating to Fibre Channel drives. (For more information on IGS's Fibre Channel implementation, see "IBM Global Services Relies on Fibre Channel," InfoStor, May 1998, p. 50.)
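To put the efficiency and link figures in perspective, the following sketch estimates how much data one four-fiber path could move in a day. It rests on several assumptions that the article does not state: that each fiber carries one link at the nominal 100MBps rate, that the 94% efficiency applies to every fiber, that a path runs continuously for 24 hours, and that 1TB is 1,000,000MB. All names are illustrative.

```python
# Rough daily volume for one four-fiber path at the quoted efficiency.
# The 100MBps nominal rate, 94% efficiency, and four fibers per path come
# from the article; continuous 24-hour use and decimal units are assumptions.

SECONDS_PER_DAY = 24 * 60 * 60
NOMINAL_MBPS = 100      # theoretical rate cited by Howell
EFFICIENCY_PCT = 94     # achieved share of the theoretical rate
FIBERS_PER_PATH = 4


def effective_mbps(nominal_mbps=NOMINAL_MBPS, efficiency_pct=EFFICIENCY_PCT):
    """Achieved rate per fiber in MBps."""
    return nominal_mbps * efficiency_pct / 100


def path_tb_per_day(fibers=FIBERS_PER_PATH):
    """Data volume in TB that one path could carry in 24 hours of continuous use."""
    mb_per_day = effective_mbps() * fibers * SECONDS_PER_DAY
    return mb_per_day / 1_000_000


if __name__ == "__main__":
    print("per fiber:", effective_mbps(), "MBps")
    print("per path per day:", round(path_tb_per_day(), 1), "TB")
```

Under these assumptions a single path tops out at roughly 32TB per day, so the daily totals the article describes would presumably be spread across several paths.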
Part of the University of Minnesota's Department of Astronomy, the Laboratory for Computational Science and Engineering (LCSE) conducts large-scale computational fluid dynamics simulations of various astrophysical phenomena.
Much of the simulation is done at the National Center for Supercomputing Applications (NCSA) at the University of Illinois and at Los Alamos National Laboratory. To illustrate the enormity of the number crunching involved, by year-end Los Alamos will have roughly 6,000 processors and 100TB of disk storage, with an individual simulation requiring up to 10TB of data.
While the scale of computation is enormous, "the real problem is not generating the data but how to store and view it," says Thomas Ruwart, LCSE Assistant Director. "To accurately visualize the data, we need both lots of storage and very fast access to it."
The LCSE has SGI servers and workstations with SGI and Prisa Networks (San Diego, CA) Fibre Channel adapters. The lab has a total of 132 Seagate 9GB Barracuda Fibre Channel drives configured in a set of MTI (Anaheim, CA) StorageWare Fibre Channel JBOD enclosures. Each StorageWare rackmount chassis holds 12 drives and up to six chassis can be mounted per cabinet. The lab also has four Ciprico (Plymouth, MN) RF7000 Fibre Channel RAID arrays, each with nine 9GB Barracudas.
The lab displays its complex animations on a high-resolution projection display called the "power wall." To play back the animation at 10 frames per second, the system must be able to read at 60MBps for each quadrant, or 240MBps for the entire display. According to Ruwart, they can easily sustain this data rate using the four Ciprico disk arrays with four separate Fibre Channel connections into an SGI Onyx machine.
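The power wall numbers can be worked through as follows. This is a minimal sketch using the article's frame rate and per-quadrant read rate; the 100MBps nominal link rate is the theoretical Fibre Channel figure quoted elsewhere in the article, and the one-array-per-quadrant mapping and the function names are assumptions made for illustration.

```python
# Bandwidth budget for the four-quadrant "power wall" display.
# Frame rate and per-quadrant read rate are the article's figures; mapping
# one Ciprico array and one Fibre Channel link to each quadrant is assumed.

FPS = 10             # playback frames per second
QUADRANTS = 4        # display quadrants
QUADRANT_MBPS = 60   # required read rate per quadrant
FC_LINK_MBPS = 100   # nominal Fibre Channel link rate


def total_rate_mbps(per_quadrant_mbps=QUADRANT_MBPS, quadrants=QUADRANTS):
    """Aggregate read rate for the whole display."""
    return per_quadrant_mbps * quadrants


def frame_size_mb(rate_mbps, fps=FPS):
    """Data per frame implied by a sustained rate and a frame rate."""
    return rate_mbps / fps


def link_utilization(required_mbps=QUADRANT_MBPS, link_mbps=FC_LINK_MBPS):
    """Share of one link's nominal rate consumed by one quadrant."""
    return required_mbps / link_mbps


if __name__ == "__main__":
    print("whole display:", total_rate_mbps(), "MBps")
    print("per-quadrant frame:", frame_size_mb(QUADRANT_MBPS), "MB")
    print("per-link utilization:", link_utilization())
```

With one link per quadrant, each link would run at about 60% of its nominal rate, which fits Ruwart's comment that the rate is easily sustained.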
Since the laboratory must rearrange and reconfigure its systems on nearly a daily basis, it is acutely aware of incompatibility issues between products and vendors. Ruwart says "it's very similar to the early days of SCSI." In addition to issues between adapters, switches, and hosts, the lab has encountered such off-beat problems as a PCI host bus adapter that would only work with one vendor's cable. Although the situation is improving steadily, Ruwart still says it's prudent to work with one vendor or integrator as much as possible.
To avoid the headaches of continually reconfiguring connections, the lab plans to link all of its systems on a single Fibre Channel switch. Currently, they are working with an Ancor switch because, according to Ruwart, "It's the one switch we have right now that supports arbitrated loop ports on the fabric."
But, although the Fibre Channel switch can provide access to any of the Fibre Channel drives from any of the computers on the host fabric, "making data available on an exclusive or shared basis is an entirely different problem."
For example, the LCSE is in the process of directly linking its Onyx systems to the Ciprico arrays, which will allow them to simultaneously run two different simulations from the same arrays. Although they would prefer to do this by connecting both computers to the array through a single Fibre Channel switch rather than duplicating all of the Fibre Channel hub/adapter connectivity, the SGI Fibre Channel adapters on the Onyx2 are not currently able to communicate with a Fibre Channel switch.
True shared-access capability will require the development of new file system software. As explained by Alex Elder, LCSE system architect, "Today, for multiple computers to simultaneously read and write files on the same disks without stepping on each other, all the disk accesses have to go through one host using the traditional NFS protocol. Thus, if another host needs access, it has to contact the serving host, creating a bottleneck."
Instead, Elder says, each host must be given direct read/write access to shared data. As a result, the lab is interested in the idea of sharing disk drives through a network-attached switch rather than connecting them to servers and having the servers talk to each other. Says Ruwart, "The problem with NFS is that, although it worked great for Ethernet, when you get into gigabit and multi-gigabit networks, it doesn't scale." The LCSE is developing shared file systems that scale in performance and capacity as well as connectivity.
The LCSE is also conducting Fibre Channel scalability experiments to determine actual performance limits in various topologies. According to Elder, one test fully populates a Fibre Channel loop network with the maximum number of devices possible and measures the effects on performance. Another test measures the maximum transaction rate under various conditions. The lab provided initial data from these experiments to the Fibre Channel community last month.
Pacific Ocean Post (Santa Monica, CA) is a large film-effects post-production house that has worked on such films as "Independence Day" and "Titanic." To play some movie files back with true 2K x 2K film resolution sometimes requires transfer rates as high as 500MBps.
POP's five major divisions are all attached via class 3 FC-AL, with between 60 and 100 attached nodes and individual network-attached storage clusters. The company has a large number of SGI workstations connected via Prisa Networks HBAs, Brocade 16-port Silkworm switches, and Fibre Channel arrays from Ciprico and Box Hill Systems (New York, NY). According to Brett Cox, POP systems architect, the company already has 4TB to 6TB of on-line storage capacity--and that number is growing daily.
POP's storage configuration is a 144GB Box Hill Fibre Box array running an internally developed driver. Each Fibre Box holds eight 18GB Seagate Cheetah drives and, according to Cox, by striping two of the arrays together, they can achieve sustained transfer rates of 400MBps. POP's software also incorporates a middle-layer product that allows users to easily set up and assign various RAID levels on a project basis.
Cox says Fibre Channel provides the only realistic solution to his throughput requirements. Prior to Fibre Channel, achieving this transfer rate would have required 12 to 14 SCSI controllers, an accompanying number of attached disk drives on each controller, all striped across one giant logical volume--"a configuration and management nightmare."
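Cox's comparison implies a rough per-controller throughput for the SCSI alternative. The sketch below assumes "this transfer rate" refers to the 400MBps sustained figure quoted for the striped Fibre Box arrays (the article does not say so explicitly) and that load would be spread evenly across controllers; the function name is hypothetical.

```python
# Per-controller rate implied by needing 12 to 14 SCSI controllers.
# The 400MBps target is assumed to be the rate Cox has in mind; an even
# split of load across controllers is also an assumption.

TARGET_MBPS = 400


def implied_per_controller_mbps(target_mbps, controllers):
    """Sustained rate each controller would need to deliver."""
    if controllers <= 0:
        raise ValueError("controller count must be positive")
    return target_mbps / controllers


if __name__ == "__main__":
    for n in (12, 14):
        rate = implied_per_controller_mbps(TARGET_MBPS, n)
        print(n, "controllers:", round(rate, 1), "MBps each")
```

On that reading, each SCSI controller would have had to sustain somewhere around 29MBps to 33MBps, with every controller's drives striped into the same logical volume.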
Although Cox is now a strong advocate of Fibre Channel, he's also cut some teeth along the way. Though POP was one of the earliest implementers of Fibre Channel, originally working with Ancor's quarter-speed equipment, Cox says, "It's only over the last seven or eight months that I reached the point where I was confident in using Fibre Channel as my primary network." POP is now a co-developer with Prisa and Seagate.
Cox is also director of POP's Technologies Division, which writes Fibre Channel application software for the post-production industry. One project resulted in user interface software that allows managers to query the entire network for availability of storage resources and anticipated bandwidth requirements. They can then assign individual access and priority to specific operators and machines--something that can't be done with SCSI since it's all host-based or network-attached storage.
POP Tech also developed the first host-software-based RAID controller system for SGI platforms making use of the onboard XOR RAID engine technology provided on Seagate's Fibre Channel drives. By implementing XOR RAID circuitry, software developers can reduce the number of commands necessary to invoke RAID operations and eliminate the need for an external RAID controller.
Although Box Hill was the first to deliver a product that enabled the XOR logic circuits, POP was not able to use their code since it was NT based and POP works exclusively with UNIX-based platforms. Box Hill plans to use POP Tech`s software in their port of XOR to UNIX.
To address the problem of simultaneously addressing the same drive from multiple UNIX servers, POP Tech is working with the University of Minnesota on its Global File System (GFS). Cox envisions his operation as one large computing facility comprised of a single cluster of systems attached to a switch, with each port attached to a massive RAID array or series of arrays. "If we're able to do this," he says, "we may actually achieve a diskless workstation environment and eliminate the headaches of locally attached storage."
Cox advises anyone considering switching over to Fibre Channel to be sure they have someone who is very knowledgeable about the network environment. "Then they should be prepared to unlearn a lot of it, because there are so many new ideas and concepts evolving around Fibre Channel that are completely different from traditional networking strategies. Other than that, go for it, the speed is awfully nice."
Clariion's Navisphere software provides centralized storage management of Fibre Channel SAN resources.
IBM Global Services (IGS) is implementing a new storage networking concept, called the SMARTcentre, which basically positions the SAN as a more fully integrated component of an overall networking backbone. Ron Howell, IGS's network architect, says that most common network configurations place host servers directly between the LAN and SAN, attached on one side via host bus adapters directly to an FC-AL loop or Fibre Channel switch, and to the LAN on the other. This means each server must handle all of the traffic to and from the SAN, effectively turning the servers into routers and bogging down the network. As pointed out by Howell, this not only leads to routing issues at the server and introduces more points of failure, but also makes the SAN just another network for system managers to deal with.
In contrast, the SMARTcentre approach offloads all of the SAN routing tasks to a high-speed central switch that sits directly on the LAN, controlling and directing all traffic to and from the hosts to the SAN. The SMARTcentre is connected to the Fibre Channel fabric via a dual Fibre Channel link. Two SMARTcentres are implemented to provide fault tolerance.
The SMARTcentre includes Computer Network Technology's (Minneapolis) UltraNet Storage Director, a high-speed switching platform with a dual Fibre Channel interface attached to the Ancor GigWorks MKII switch.
All storage traffic has a single IP address direct to the SMARTcentre, and all SAN traffic stays within the SAN, without having to cross the LAN. This also provides for increased data security since by removing the servers from the data path, the SMARTcentre is also able to serve as a firewall requiring specific authentication before providing access to storage resources.
John Haystead is a freelance writer in Hollis, NH, and a frequent contributor to InfoStor.