Particularly in Fibre Channel environments, one solution is alternate path software, which provides multiple-path management and load balancing.
By Kevin Smith
Windows NT was designed for workgroup and departmental servers that typically host a single application and connect from 8 to 10 disk drives. The I/O architecture was based on the assumptions that servers would implement SCSI buses to connect disk storage, and SCSI disks would remain single-ported devices that connect to a single bus.
Figure 1: The Windows 2000 operating system sees a single LUN as different LUNs because each LUN has multiple Target ID (TID) and/or LUN identifier addresses.
However, as Windows 2000 moves into enterprises, the assumptions change. Fibre Channel is becoming the industry standard interconnect, with addresses for 127 devices on a single FC-AL loop and 16,000,000 addresses on a fabric with cascaded switches. Unlike SCSI disks, Fibre Channel disks are dual ported to enhance performance and fault tolerance and can be accessed over a pair of redundant links. Eventually, Windows 2000 will be enhanced to fully satisfy the requirements of enterprise computing, but until it is, software can be added to the I/O stack to better meet the continuous availability requirements of enterprise applications such as e-commerce.
Windows 2000 uses SCSI FCP (SCSI over Fibre Channel) to access disks across Fibre Channel topologies. Because Windows 2000 supports 128 SCSI targets, the channel can be mapped to emulate multiple SCSI buses. Each SCSI target is mapped to logical unit numbers (LUNs), and LUNs are mapped to physical disk drives.
LUNs are the logical units addressed by the operating system, which identifies them by their Target ID and LUN identifier. When Windows 2000 boots, it scans its I/O channels looking for LUNs, and assumes that each discovered LUN with a unique Target ID and/or LUN identifier is a unique logical disk. A disk object representing each logical disk is constructed and made available to the file system.
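The discovery behavior described above can be sketched in a few lines. This is a hypothetical illustration, not a Windows internal API: the OS walks each I/O channel and treats every (Target ID, LUN ID) address it finds as a distinct logical disk, with no cross-channel correlation.

```python
# Illustrative sketch of boot-time LUN discovery: each discovered
# (Target ID, LUN ID) address becomes its own disk object. All names
# here are assumptions for illustration, not real Windows structures.

def scan_channels(channels):
    """Return one 'disk object' per discovered (TID, LUN ID) address."""
    disk_objects = []
    for channel in channels:
        for tid, lun_id, serial in channel["luns"]:
            # No cross-channel correlation: unique address = "unique" disk.
            disk_objects.append({"tid": tid, "lun": lun_id, "serial": serial})
    return disk_objects

# The same physical LUN (serial "LSN-001") is visible on two channels:
channels = [
    {"luns": [(0, 0, "LSN-001")]},  # path via HBA 0
    {"luns": [(1, 0, "LSN-001")]},  # path via HBA 1
]
disks = scan_channels(channels)
print(len(disks))  # 2 -- one LUN presented as two "unique" logical disks
```

The duplicate count in the last line is exactly the problem described in the sections that follow: one physical LUN, reachable over two paths, shows up as two disks.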
At the enterprise level, RAID array controllers generally are configured in fault-tolerant, dual-active pairs to increase system reliability. Each controller has one or more Fibre Channel host ports for connecting to a loop or SAN fabric, and one or more pairs of back-end Fibre Channel storage ports for connecting a set of dual-ported Fibre Channel disk drives over redundant Fibre Channel loops. Each physical disk is mapped to a LUN, and each LUN can be configured with an affinity to each controller host port. In this way, all of the LUNs can be accessed across all of the controller's host ports to enhance data accessibility.
However, this creates problems for Windows 2000, because each LUN will have multiple Target ID (TID) and/or LUN identifier (LUN ID) addresses (see Figure 1).
Figure 3: Alternate path software filters file system access to the Disk Object Group by activating the Primary Disk Object (P) and making it accessible to the file system.
In other words, because Windows 2000 assumes that I/O channels are independent and connect different disks, it will see the same LUN with multiple addresses and assume that it has discovered multiple unique LUNs. It will make these duplicated LUNs accessible by the file system. Because the operating system presumes it is dealing with multiple unique LUNs, it will not implement access controls to prevent one application from stepping on another application's data, and data corruption will result.
Alternate path software
A simple solution is alternate path software (APS), which filters duplicate LUNs and provides a level of indirection between the operating system and the data path used to access LUNs. APS is available from a number of vendors.
APS provides multiple-path management by examining LUN Serial Numbers (LSN) to detect duplicate disk objects representing a single LUN. APS constructs a Disk Object Group for LUNs with the same serial numbers. One replicate is designated the Primary Disk Object, and the other the Secondary Disk Object (see Figure 2).
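The grouping step just described can be sketched as follows. This is a minimal illustration of the idea, not any vendor's actual implementation; the policy of making the first-seen replicate the Primary Disk Object is an assumption of the example.

```python
# Hedged sketch: group duplicate disk objects by LUN Serial Number (LSN)
# into Disk Object Groups, designating one Primary and the rest Secondary.

def build_disk_object_groups(disk_objects):
    """Group disk objects by serial number; first seen becomes Primary."""
    by_serial = {}
    for obj in disk_objects:
        by_serial.setdefault(obj["serial"], []).append(obj)
    groups = {}
    for serial, objs in by_serial.items():
        groups[serial] = {"primary": objs[0], "secondary": objs[1:]}
    return groups

disks = [
    {"tid": 0, "lun": 0, "serial": "LSN-001"},  # seen via HBA 0
    {"tid": 1, "lun": 0, "serial": "LSN-001"},  # same LUN via HBA 1
]
grouped = build_disk_object_groups(disks)
print(len(grouped))  # 1 -- one Disk Object Group per physical LUN
```

Keying on the serial number rather than the (TID, LUN ID) address is what lets APS recognize that two addresses are really one device.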
APS then filters file system access to the Disk Object Group by activating the Primary Disk Object and making it accessible to the file system (see Figure 3). This provides a data path to the LUN using HBA 0 and Controller 0, as well as any SAN networking elements, such as switches and hubs that implement the data path.
If an active disk object becomes unavailable due to a path failure, APS automatically switches to the Secondary Disk Object that represents an alternate data path through the SAN to the LUN in the storage array (see Figure 4).
Figure 4: If an active disk object becomes unavailable due to a path failure, APS automatically switches to the Secondary Disk Object (S).
APS uses the SCSI Test Unit Ready command as a "ping" to determine path availability. After a path fail-over, APS will ping the primary path at periodic intervals to determine if the path has been repaired. When the failed path has been restored to operational status, APS will automatically fail back to the original path. APS maintains the context with the file system during fail-over and fail-back operations, so path transitions are transparent and non-disruptive to the operating system.
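The fail-over and fail-back behavior amounts to a small state machine, sketched below. The `probe` callback stands in for the Test Unit Ready ping; it and all other names here are assumptions of this example, not a real driver interface.

```python
# Sketch of APS fail-over / fail-back logic: use the active path while it
# is healthy, fail over on failure, and fail back once the primary path
# has been repaired. probe(path) models the Test Unit Ready "ping".

class PathManager:
    def __init__(self, primary, secondary, probe):
        self.primary, self.secondary = primary, secondary
        self.active = primary
        self.probe = probe  # probe(path) -> True if path is healthy

    def check(self):
        """Called at periodic intervals by a monitoring loop."""
        if self.active == self.primary:
            if not self.probe(self.primary):   # path failure detected
                self.active = self.secondary   # fail-over
        elif self.probe(self.primary):         # primary path repaired
            self.active = self.primary         # automatic fail-back

health = {"P": True, "S": True}
pm = PathManager("P", "S", lambda path: health[path])
health["P"] = False
pm.check()
print(pm.active)  # S -- failed over to the secondary path
health["P"] = True
pm.check()
print(pm.active)  # P -- failed back once the primary was repaired
```

Because the switch happens below the file system, the disk object the file system holds never changes, which is what makes the transition transparent.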
Besides multiple-path management to improve system availability, APS enables I/O load balancing to more efficiently utilize I/O resources and enhance system performance. Most APS products provide a load-leveling mechanism so that each available data path is simultaneously processing I/O operations. LUN count leveling is an economical form of load balancing appropriate for x86-based servers. With LUN count leveling, APS automatically distributes the LUNs evenly across the available data paths. Automatic load balancing reduces system management complexity and lowers costs.
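LUN count leveling can be sketched as a simple round-robin assignment. Round-robin is an assumption here; the article says only that LUNs are distributed evenly by count, and vendor implementations may use other mechanisms.

```python
# Sketch of LUN count leveling: distribute LUNs evenly across available
# data paths by count alone (no weighting by I/O rate). Round-robin
# assignment is an assumption of this example.

def level_lun_counts(luns, paths):
    """Assign LUNs to paths so per-path counts differ by at most one."""
    assignment = {path: [] for path in paths}
    for i, lun in enumerate(luns):
        assignment[paths[i % len(paths)]].append(lun)
    return assignment

assignment = level_lun_counts(
    ["LUN0", "LUN1", "LUN2", "LUN3", "LUN4"],
    ["path-A", "path-B"],
)
print(len(assignment["path-A"]), len(assignment["path-B"]))  # 3 2
```

Counting LUNs rather than measuring I/O rates keeps the mechanism cheap, which is why the article calls it an economical fit for x86-based servers.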
APS products are generally integrated with the operating system, reporting errors and events, such as path transitions, to the operating system error logs. System managers can use the log information to monitor the status of I/O paths and take remedial action before SAN components fail. When paths fail, APS automatically alerts system administrators that the system is in a critical condition and vulnerable to losing access to the storage pool.
Kevin Smith is senior director of business management and marketing for external products at Mylex (www.mylex.com), Fremont, CA.