When does a storage area network make sense, and what are the critical deployment factors?
There is a lot of hype about storage area networks (SANs), but under what circumstances do they make sense? An ideal SAN environment is one in which a large amount of data is shared using fast transfers between multiple systems.
One common scenario is "sneakernet." If your current workflow includes the time-consuming process of copying data from one system to disk or tape and then physically moving the data to another system, a SAN may be the answer. The benefit can be estimated by weighing the cost of implementing and maintaining a SAN against the value of the time saved.
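That cost-versus-savings comparison can be sketched as a simple payback calculation. All of the dollar figures and hours below are hypothetical assumptions for illustration, not numbers from this article:

```python
# Rough sneakernet-vs-SAN payback sketch. Every figure used in the
# example call is a hypothetical assumption for illustration only.

def payback_months(san_cost, monthly_maintenance,
                   hours_saved_per_month, hourly_rate):
    """Months until cumulative labor savings exceed the SAN's cost.

    Returns None if monthly savings never exceed maintenance costs.
    """
    monthly_savings = hours_saved_per_month * hourly_rate - monthly_maintenance
    if monthly_savings <= 0:
        return None
    return san_cost / monthly_savings

# Example: a $50,000 SAN with $500/month upkeep that eliminates 40
# staff-hours of tape shuffling per month at $75/hour pays back in
# 20 months.
months = payback_months(50_000, 500, 40, 75)
```

If the payback period is shorter than the expected life of the equipment, the SAN is worth a closer look.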
Or, perhaps data needs to be continuously shared between multiple workstations. In a satellite feed, for example, data from a satellite is stored to disk at the same time another workstation wants to read and process that data. In a SAN, Fibre Channel's high bandwidth can be divided between the two workstations, providing 30MBps to 50MBps sustained performance rates to each workstation, compared to LAN rates of less than 12MBps.
A SAN can also greatly improve overall access rates in multiple-workstation environments. When the 12MBps bandwidth of a network interface such as Fast Ethernet is divided among 10 clients, the access rate for each client falls below 1MBps.
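The arithmetic behind that comparison is easy to check. A minimal sketch, using the article's own figures of 12MBps for Fast Ethernet and 100MBps for Fibre Channel:

```python
def per_client_mbps(link_mbps: float, clients: int) -> float:
    """Bandwidth available to each client when a shared link is
    divided evenly among simultaneously active clients."""
    return link_mbps / clients

# Fast Ethernet shared by 10 clients: 1.2 MBps each at best.
# Protocol overhead pushes the realized rate below 1MBps, as the
# article notes.
fast_ethernet = per_client_mbps(12, 10)

# A 100MBps Fibre Channel loop under the same 10-client load still
# delivers 10 MBps to each client.
fibre_loop = per_client_mbps(100, 10)
```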
A SAN also makes sense in situations where direct-attached storage transfer rates are needed. If you have a mission-critical application that requires the entire bandwidth of the local storage interface, a SAN can be configured between two workstations to provide redundant access to the storage in the event of a system failure. Assuming Fibre Channel is the storage interface, one workstation can access the storage at consistent, sustained rates approaching 100MBps. If the primary workstation or a storage path fails, the redundant workstation can be used to obtain the same direct-attached rate.
In sum, a Fibre Channel-based SAN:
Provides a higher overall bandwidth (100MBps) to divide among workstations.
Uses storage protocols, which are more efficient with large data accesses than network protocols. Using storage protocols, a 1MB transfer can be performed as one operation. Using network protocols, the same data transfer is broken into many small frames due to the limits of the data portion of an IP frame.
Can use switch technology to access underlying storage. In doing so, an entire 100MBps Fibre Channel connection can be made available to a single or small number of workstations for performance approaching that of direct-attached storage.
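The efficiency difference between storage and network protocols in the second point can be quantified. Assuming a standard 1,500-byte Ethernet MTU with roughly 1,460 bytes of payload per TCP/IP frame (typical figures, not from the article):

```python
import math

MTU_PAYLOAD = 1460           # usable bytes per TCP/IP frame (typical)
TRANSFER = 1 * 1024 * 1024   # the article's 1MB transfer

# Storage protocol: one SCSI-style read/write command moves the
# entire 1MB in a single operation.
storage_ops = 1

# Network protocol: the same transfer is fragmented into hundreds of
# frames, each with its own header and per-frame processing cost.
network_frames = math.ceil(TRANSFER / MTU_PAYLOAD)   # 719 frames
```

Each of those frames carries header and processing overhead, which is why storage protocols are more efficient for large transfers.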
Once you've determined that a SAN can benefit your workflow, the next step is putting the SAN together (i.e., SAN-sharing software, storage interface and infrastructure, and, in most cases, a network interface and infrastructure).
Then, of course, there is the issue of interoperability. To make sure all the components work together, first consider which platforms will be attached to the SAN. This will likely limit your choices for SAN software, since few SAN packages support a large number of operating systems. If your platforms allow a choice of SAN packages, you'll want to compare features (see InfoStor, August Special Report, "SAN software, phase 1: storage sharing," pp. 14-19).
Consider how the SAN software stores data on the storage device. Some packages use a standard file system such as NTFS; others use a custom file system. Both have benefits. For example, a standard file system allows storage devices to be accessed without SAN software, while a custom file system can provide better tuning for particular applications.
Other software considerations include:
Does the SAN package support redundant "metadata servers"? The metadata server handles locking and file placement functions for the SAN, and can be a single point of failure. If a non-redundant metadata server fails, it causes downtime for the entire SAN until another system is configured to be the metadata server.
Does the SAN software require any third-party software? For example, an NFS package may be required to share between Unix and NT systems.
Ask the SAN vendor if your application is qualified with their software. The vendor should have tested many applications, or have access to lab facilities to perform such tests.
Consider how the shared storage is attached to the SAN client. If the storage is shared as a network device, the SAN client cannot be a file server and share the device with other non-SAN clients.
Another consideration is the storage interface and infrastructure of your SAN. Software is typically independent of the storage interface, meaning that any storage interface could be used, such as parallel SCSI or Fibre Channel. In practice, Fibre Channel is the interface normally used for SANs.
When choosing the storage infrastructure, the following should be considered:
Performance. A hub-based Fibre Channel configuration limits the total bandwidth available to all workstations to a maximum of 100MBps. In contrast, a switch-based configuration provides up to 100MBps to each workstation, depending on the underlying storage device.
Fibre Channel adapters. Ideally, the same adapter should be used in all workstations. In practice this may not be possible, since the desired adapter may not have device driver support for the mix of operating systems in the SAN. Additionally, the adapter may support the operating system, but may not support the switch.
Storage devices. To select the proper storage device, the workflow for the SAN workstations must fit the storage configuration, or performance bottlenecks can occur. For example, if a switch is used to provide multiple 100MBps paths to each workstation, and a single Fibre Channel device is attached, the performance benefit of the switch is negated. Multiple Fibre Channel adapters and striped arrays may provide higher bandwidth to critical workstations on the SAN, while other workstations may access the striped devices using a single adapter.
Connections. The desired distance of the workstations and devices must match that of the components. Some adapters and devices have copper connectors, while others have optical. Copper connectors limit distances to a few meters, but the adapter or device may support an MIA (Media Interface Adapter), which will convert copper to optical. Optical interfaces may be multi-mode or single mode, allowing a connection distance of a few hundred meters to 10 kilometers.
Communication between SAN clients and the metadata server. The SAN software may require a network interface and infrastructure to communicate with the server. If it does, Ethernet or some other existing network interface can usually provide this functionality. If this is a new installation, stringing dual wires for the network and storage interface may be avoided by using an adapter that supports both network and storage protocols on the Fibre Channel interface.
Also, if the metadata server saves the SAN's metadata on a local disk, you may want to use mirrored disks or a RAID device, so a disk failure does not mean lost data. In most cases, the metadata server can also act as a client. If it does, the application load on the metadata server may affect overall SAN performance. Also, client-mode applications could crash a non-redundant metadata server, thereby affecting the entire SAN.
Interoperability of Fibre Channel components. Your SAN vendor should have tested the various components to determine if they work together. Changes to the firmware or driver of one component may result in problems with other pieces of the SAN.
Fibre Channel hub-based configurations with mixed adapters can be a problem when loop initialization protocols (LIPs) are not masked.
In this case, powering up one workstation may interrupt the SAN for a few seconds. This can be a potential problem for applications requiring real-time or sustained performance. A switch can avoid this problem through zoning: the switch is zoned so that each adapter is isolated from the others and can communicate only with its attached storage devices.
On the downside, with the adapters zoned off from one another, Fibre Channel can't also be used as the network interface between the SAN clients and the metadata server.
Additionally, when multiple workstations are powered up simultaneously, LIPs may interfere with the operating system's process while it is trying to discover the attached storage devices. If this happens, the workstation will need to be rebooted. Fibre Channel "hard addressing" can minimize these problems, or to avoid this situation altogether, the workstations can be powered up individually.
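The hub-versus-switch performance considerations above can be sketched numerically. The client counts and path counts below are illustrative assumptions; the 100MBps per-link rate comes from the article:

```python
def hub_per_client(loop_mbps, clients):
    """Arbitrated loop (hub): all clients share one 100MBps segment."""
    return loop_mbps / clients

def switch_per_client(port_mbps, storage_paths, clients):
    """Switched fabric: each port runs at full rate, but aggregate
    delivery is capped by the storage paths behind the switch."""
    return min(port_mbps, storage_paths * port_mbps / clients)

# Four clients on a hub share one loop: 25 MBps each.
hub = hub_per_client(100, 4)
# Four clients on a switch backed by a single storage path: the
# switch's benefit is negated, still 25 MBps each.
single_path = switch_per_client(100, 1, 4)
# Four clients on a switch backed by four striped paths: the full
# 100 MBps is available to each client.
striped = switch_per_client(100, 4, 4)
```

This is why the article warns that attaching a single Fibre Channel device behind a switch negates the switch's performance benefit.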
Price and performance are two other issues to consider when implementing a SAN. The sharing software and hardware components are expensive.
Software can range from a few hundred dollars to several thousand dollars per workstation, while Fibre Channel adapters, hubs, switches, etc., are at a price premium due to their high-performance capabilities. Thus, the cost of a SAN client can easily exceed that of a low-end workstation.
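A rough per-client cost tally makes the comparison concrete. Every price below is a hypothetical assumption for illustration, not a vendor quote (the article gives only the "few hundred to several thousand dollars" range for software):

```python
# Hypothetical per-client SAN cost tally; all prices are
# illustrative assumptions, not vendor quotes.
san_client_cost = {
    "SAN sharing software": 2_500,  # "few hundred to several thousand"
    "Fibre Channel adapter": 1_500,
    "Switch port share":     1_000,
    "Cabling and connectors":  300,
}
total = sum(san_client_cost.values())   # $5,300 per client
low_end_workstation = 3_000             # illustrative comparison point
exceeds_workstation = total > low_end_workstation
```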
As for performance, the use of Fibre Channel doesn't necessarily guarantee 100MBps throughput. For optimal performance, the SAN must be designed to meet the needs of the attached workstations. For example, if several SAN clients require an aggregate throughput of 100MBps and are simultaneously accessing a single Fibre Channel RAID-3 array, multiple requests will cause mechanical latencies, greatly reducing throughput.
In this SAN, a JBOD or RAID-5 array might be substituted to reduce mechanical latencies, or perhaps a striped array configuration with multiple Fibre Channel paths could increase available bandwidth.
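One way to see the contention effect is a toy model in which interleaving multiple sequential readers on one array forces a head seek between chunks. The chunk size, seek penalty, and device rate below are all assumed numbers, not measurements:

```python
def effective_mbps(streams, device_mbps, seek_ms_per_switch,
                   chunk_kb=256):
    """Toy contention model: interleaving `streams` sequential
    readers on one array inserts a seek between every chunk.
    All parameter values in the example calls are assumptions."""
    chunk_mb = chunk_kb / 1024
    transfer_s = chunk_mb / device_mbps
    seek_s = seek_ms_per_switch / 1000 if streams > 1 else 0
    return chunk_mb / (transfer_s + seek_s)

# One stream reads sequentially at the device's full rate (~90 MBps).
one_stream = effective_mbps(1, 90, 8)
# Four interleaved streams spend most of their time seeking, and the
# array's total delivered throughput collapses.
four_streams = effective_mbps(4, 90, 8)
```

Independent arrays or striped paths sidestep this by letting each stream keep its own spindles busy.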
Also, the system hardware should support SAN requirements. For example, sufficient system memory should be installed--preferably much more than the minimum required by the SAN software.
You should also ensure that the Fibre Channel adapter takes full advantage of the workstation's PCI bus. If your workstation has 64-bit PCI slots, you should use a 64-bit PCI adapter rather than a less expensive 32-bit PCI adapter. This will avoid bottlenecks at the workstation and will improve overall SAN performance.
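The bus arithmetic behind that recommendation, using standard PCI figures (bus widths and the 33MHz clock are standard PCI parameters, not from the article):

```python
def pci_peak_mbps(bus_width_bits: int, clock_mhz: int) -> float:
    """Theoretical peak PCI throughput in MBps: one transfer of
    `bus_width_bits` per clock cycle."""
    return bus_width_bits / 8 * clock_mhz

# 32-bit, 33MHz PCI: ~132 MBps peak, barely above one 100MBps Fibre
# Channel link, and real-world efficiency eats into that margin.
pci32 = pci_peak_mbps(32, 33)
# 64-bit, 33MHz PCI: ~264 MBps peak, comfortable headroom for
# sustained full-rate Fibre Channel transfers.
pci64 = pci_peak_mbps(64, 33)
```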
SANs can be very beneficial in the right environments, but can also be very complex to implement. You may want to get professional services help from a SAN vendor or a third-party integrator. In many cases, there are no black and white choices with respect to the right SAN software and hardware components. What is a good fit for one environment may not work in another.
[Figure: A hub-based SAN limits total bandwidth to 100MBps, while a switch-based configuration provides up to 100MBps to each client.]
[Figure: Additional data paths from the client systems allow parallel data transfers to SAN-attached storage for improved bandwidth.]
Ed Soltis is a principal engineer at Ciprico Inc. (www.ciprico.com), in Plymouth, MN.