Fibre Channel Testing Challenges
Fibre Channel often requires advanced emulation and analysis testing.
By Steven Bucher
As momentum builds behind Fibre Channel storage networking solutions, testing requirements remain an often-overlooked aspect of developing and deploying devices and systems. The complexity of Fibre Channel makes it difficult to predict required testing capabilities, which in turn makes it difficult to select the appropriate test equipment. Choosing the wrong equipment early on in a project could increase expenses and delay product launches.
This article explores some testing scenarios that may not be obvious at the beginning of a project, but could become important later on. By exploring these scenarios, developers should be better able to make intelligent decisions about which test equipment best suits an application.
Much of Fibre Channel's success comes from the storage industry, so it is not surprising that there are strong parallels between Fibre Channel and SCSI test equipment. Indeed, most vendors of Fibre Channel test equipment also have SCSI product lines. Similarities between a vendor's SCSI and Fibre Channel equipment may ultimately reduce learning curves and increase productivity as test sequences are shared between the two interfaces.
Like SCSI test equipment, Fibre Channel test equipment falls into two categories: emulation systems and analysis systems. The two tools are complementary and are often used together to solve complex problems.
An emulation system is an active party; it communicates with the devices being tested. In a Fibre Channel Arbitrated Loop (FC-AL) environment, the emulator has a physical address (AL_PA) and participates in the loop initialization process. With an emulator, the engineer can interact directly with other loop devices, initiating Fibre Channel traffic and viewing each device's response.
In contrast, analysis is a passive activity. An analyzer is used to record and display a trace of the traffic generated by the various devices in a system. The analyzer consists of two channels that record Fibre Channel traffic. Each channel connection intercepts one of the loop connections of the particular device.
Basic vs. Advanced Testing
At its simplest level, emulation testing uses an emulation system to issue SCSI commands to target devices, to view responses, to control data patterns, and to automate some aspects of the testing procedure. An analysis system is used to capture and display two channels of traffic, to set up trigger conditions, and to search through a trace for specific activity.
However, these basic testing operations are rarely sufficient to ensure the functionality of the devices being tested. Fibre Channel is rich in capability and complexity, much of which must be tested to ensure that devices operate properly. For example, while it is a basic testing function to issue a SCSI command to a disk drive, doing so does not test the drive's ability to handle the exceptional or unusual circumstances to which Fibre Channel devices are often exposed. The remainder of this article explores a number of these advanced testing aspects.
Advanced Emulation Testing
Illegal frames--A frame contains a large number of parameter fields, some of which convey redundant information. For example, the command frame for a write command specifies an explicit data transfer length and also contains a SCSI CDB, which implies a data transfer length based on the drive's block size. In a correct and legal frame, these transfer length indications agree. If they do not agree, an illegal frame results. Though illegal frames are rare, it is desirable to test devices to see how they react when they do occur.
To generate frames with illegal contents, the emulation system must provide a mechanism for either overriding default parameters or for creating an entire frame from scratch. A simple emulation code sequence can be constructed to corrupt a portion of the frame payload.
For example, consider a test sequence of four write commands. The first write command is issued without overriding any frame parameters. Before the second write command, the set_rw_dl() function is called to override the read data and write data bit fields and the data length field. In this case, the read data bit is set so that when the write command is issued, the frame parameters are inconsistent and the frame is illegal.
Before the third write command, both the read data and write data bits are set, also resulting in an illegal frame. For the last command, the data length field is set to the wrong value, again resulting in an illegal frame.
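The four-command sequence can be sketched in C against a simplified model of the command frame. This is a minimal illustration under stated assumptions, not the emulator's actual interface: only the name set_rw_dl() comes from the text, while its signature, the emulator and cmd_frame structures, build_write10(), write_frame_is_legal(), and the 512-byte block size are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512u /* assumed drive block size in bytes */

/* Simplified model of the command-frame fields discussed above. */
struct cmd_frame {
    bool     rd_data;   /* read data bit */
    bool     wr_data;   /* write data bit */
    uint32_t data_len;  /* explicit data transfer length in bytes */
    uint8_t  cdb[10];   /* SCSI WRITE(10) command descriptor block */
};

/* Hypothetical emulator state: a one-shot override for the next command. */
struct emulator {
    bool     override_armed;
    bool     rd_data;
    bool     wr_data;
    uint32_t data_len;
};

/* Stand-in for set_rw_dl(): arm an override of the read/write bits and
 * the data length. The real function's signature is not given in the text. */
void set_rw_dl(struct emulator *em, bool rd, bool wr, uint32_t dl)
{
    assert(em != NULL);
    em->override_armed = true;
    em->rd_data = rd;
    em->wr_data = wr;
    em->data_len = dl;
}

/* Build a write command frame with consistent defaults, then apply any
 * armed override. The override is consumed by this one command. */
void build_write10(struct emulator *em, uint32_t lba, uint16_t blocks,
                   struct cmd_frame *f)
{
    assert(em != NULL && f != NULL);
    memset(f, 0, sizeof *f);
    f->cdb[0] = 0x2A;                 /* WRITE(10) opcode */
    f->cdb[2] = (uint8_t)(lba >> 24); /* logical block address, big-endian */
    f->cdb[3] = (uint8_t)(lba >> 16);
    f->cdb[4] = (uint8_t)(lba >> 8);
    f->cdb[5] = (uint8_t)lba;
    f->cdb[7] = (uint8_t)(blocks >> 8); /* transfer length in blocks */
    f->cdb[8] = (uint8_t)blocks;
    f->wr_data  = true;
    f->data_len = (uint32_t)blocks * BLOCK_SIZE;
    if (em->override_armed) {
        f->rd_data  = em->rd_data;
        f->wr_data  = em->wr_data;
        f->data_len = em->data_len;
        em->override_armed = false;
    }
}

/* A write frame is legal only if the write bit alone is set and the
 * explicit length agrees with the length implied by the CDB. */
bool write_frame_is_legal(const struct cmd_frame *f)
{
    uint32_t blocks;
    assert(f != NULL);
    blocks = ((uint32_t)f->cdb[7] << 8) | f->cdb[8];
    return f->wr_data && !f->rd_data && f->data_len == blocks * BLOCK_SIZE;
}
```

Calling build_write10() with no override yields a legal frame. Calling set_rw_dl() first with the read bit set, with both bits set, or with a mismatched length yields the three illegal variants described above.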
This basic routine can be easily expanded to include other frame types, other command types, errors in the frame header, and multiple error scenarios. For example, a test routine could be constructed that takes as its parameters the header or payload field to be corrupted and the data pattern to insert into the field. The routine could then be called repetitively by a test program using any number of field patterns to test each field.
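A parameterized routine of this kind might look like the following sketch, which treats a frame as a raw byte buffer. The field descriptor, the function name corrupt_field(), and the return convention are illustrative assumptions. The header offsets in the table follow the standard 24-byte Fibre Channel frame header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FC_HEADER_LEN 24u /* Fibre Channel frame header length in bytes */

/* Describes one field by its byte offset and length within the frame. */
struct field {
    const char *name;
    size_t      offset;
    size_t      length;
};

/* Header fields a test program could iterate over. */
static const struct field fc_header_fields[] = {
    { "R_CTL",     0, 1 }, { "D_ID",    1, 3 }, { "CS_CTL",  4, 1 },
    { "S_ID",      5, 3 }, { "TYPE",    8, 1 }, { "F_CTL",   9, 3 },
    { "SEQ_ID",   12, 1 }, { "DF_CTL", 13, 1 }, { "SEQ_CNT", 14, 2 },
    { "OX_ID",    16, 2 }, { "RX_ID",  18, 2 }, { "PARAM",   20, 4 },
};

/* Overwrite one field with a repeating byte pattern.
 * Returns 0 on success, -1 if the field does not fit in the frame. */
int corrupt_field(uint8_t *frame, size_t frame_len, const struct field *f,
                  const uint8_t *pattern, size_t pattern_len)
{
    size_t i;
    assert(frame != NULL && f != NULL && pattern != NULL && pattern_len > 0);
    if (f->offset > frame_len || f->length > frame_len - f->offset)
        return -1;
    for (i = 0; i < f->length; i++)
        frame[f->offset + i] = pattern[i % pattern_len];
    return 0;
}
```

A test program could loop over fc_header_fields and a list of patterns, calling corrupt_field() on a fresh copy of a known-good frame before each transmission, so that every run contains exactly one deliberate error.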
Out-of-order data--Unlike their parallel SCSI counterparts, Fibre Channel disk drives are theoretically capable of reassembling data received out of order. While no drives on the market currently have this capability and the current private loop and public loop specifications do not require it, a drive may at some time receive data frames out of their intended order.
Testing how the drive responds requires test functions that provide detailed control over command execution. The basic procedure is to issue a write command where the data length spans at least two frames, to send the data frames out of order with a different data pattern in each, and then to issue a read command for the same data locations and test the returned data.
If the data is in the correct order, the drive correctly reassembled the data. More likely, the write command will have been rejected by the drive, indicating that out-of-order data will not be accepted.
To perform this test routine, the emulation system must be able to execute commands by directly controlling frames and primitives. Since the data frames are sent out of order, the test sequence must initialize the payload content and the frame header with specific parameters. The payload content must be unique to each data frame, and each frame header must be initialized so that header parameters SEQ_CNT (sequence count) and RLTV_OFF (relative offset) properly specify the sequence order of the frames. Since the data frames must be directly controlled, the entire write command must be executed in a similar fashion, with the test routine directly controlling the primitives (OPN, CLS, etc.) and the other frames involved in the command execution (CMND, RSP). A C code listing for a sample routine that performs this test is available at ftp.i-tech.com/infostor/listing1.c.
One variation of this routine would be to initialize the header of the second data frame to match the first data frame to determine how the target reacts to receiving the same segment of data a second time during the same command. Knowing how a drive reacts to these situations is important when attempting to integrate drives into a system.
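The logic of the out-of-order test can be sketched against a simulated target. This sketch is not the listing referenced above and does not drive real hardware: the data_frame and sim_target structures, the three target behaviors, and the tiny 8-byte payload are assumptions chosen to keep the example short. Primitives and the command and response frames are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FRAME_PAYLOAD 8u /* assumed payload bytes per data frame */
#define NUM_FRAMES    2u
#define XFER_LEN      (FRAME_PAYLOAD * NUM_FRAMES)

/* Only the header fields that establish frame order are modeled. */
struct data_frame {
    uint16_t seq_cnt;                /* SEQ_CNT: position in the sequence */
    uint32_t rel_off;                /* relative offset of the payload */
    uint8_t  payload[FRAME_PAYLOAD];
};

/* Hypothetical target behaviors when frames arrive out of order. */
enum target_mode { TARGET_REJECTS, TARGET_REASSEMBLES, TARGET_APPENDS };

struct sim_target {
    enum target_mode mode;
    bool     rejected;
    uint16_t expected_seq_cnt;
    uint32_t received;
    uint8_t  media[XFER_LEN];
};

enum test_result { WRITE_REJECTED, DATA_REASSEMBLED, DATA_CORRUPTED };

/* Give each frame a unique pattern and a header that states its order. */
void make_frame(struct data_frame *f, uint16_t seq_cnt, uint8_t pattern)
{
    assert(f != NULL);
    f->seq_cnt = seq_cnt;
    f->rel_off = (uint32_t)seq_cnt * FRAME_PAYLOAD;
    memset(f->payload, pattern, FRAME_PAYLOAD);
}

void target_receive(struct sim_target *t, const struct data_frame *f)
{
    uint32_t offset;
    assert(t != NULL && f != NULL);
    if (t->rejected || t->received >= NUM_FRAMES)
        return;
    if (t->mode == TARGET_REJECTS && f->seq_cnt != t->expected_seq_cnt) {
        t->rejected = true;          /* refuse out-of-order data */
        return;
    }
    /* Either honor the relative offset or store in arrival order. */
    offset = (t->mode == TARGET_APPENDS) ? t->received * FRAME_PAYLOAD
                                         : f->rel_off;
    if (offset + FRAME_PAYLOAD > XFER_LEN) {
        t->rejected = true;
        return;
    }
    memcpy(t->media + offset, f->payload, FRAME_PAYLOAD);
    t->received++;
    t->expected_seq_cnt = (uint16_t)(f->seq_cnt + 1);
}

/* Send frame 1 before frame 0, then check what the target stored. */
enum test_result run_out_of_order_test(struct sim_target *t)
{
    struct data_frame f0, f1;
    uint8_t expected[XFER_LEN];
    assert(t != NULL);
    make_frame(&f0, 0, 0xAA);
    make_frame(&f1, 1, 0x55);
    target_receive(t, &f1);
    target_receive(t, &f0);
    if (t->rejected)
        return WRITE_REJECTED;
    memset(expected, 0xAA, FRAME_PAYLOAD);
    memset(expected + FRAME_PAYLOAD, 0x55, FRAME_PAYLOAD);
    return memcmp(t->media, expected, XFER_LEN) == 0 ? DATA_REASSEMBLED
                                                     : DATA_CORRUPTED;
}
```

Because each frame carries a distinct pattern, the read-back comparison distinguishes a target that reassembled the data from one that silently stored it in arrival order. The duplicate-frame variation would reuse make_frame() with the same SEQ_CNT for both frames.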
Advanced Analysis Testing
Device filtering--Protocol analysis in an arbitrated loop environment is complicated by the fact that multiple devices share the same wiring. Every device in the loop must pass traffic bound for other devices in the loop, which means an analyzer connected to record the traffic of a particular device also records all the activity passing through that device, not just the traffic in which the device is participating. All of this additional traffic is irrelevant, and with Fibre Channel's 1-Gbps bandwidth, it does not take long to fill even the deepest trace buffer with irrelevant information.
The typical solution to this problem is device filtering. By configuring the analyzer to record only traffic containing the device address of the device in question, the resulting trace contains only relevant information and none of the traffic that is being passed on to other devices.
Unfortunately, not all device-filtering schemes are effective in every situation. For example, device addresses can change dynamically--addresses are assigned to each device while the loop is being initialized. If the loop is initialized while the analyzer is running, the device may acquire an address that is different from its previous address. The resulting trace contains the traffic of one device before the loop is initialized and that of a different device after initialization.
While loops can be initialized at any time, the process often occurs after a new device is hot plugged into the loop. If a test engineer expects to do a significant amount of hot-plug testing of an arbitrated loop, it may be beneficial to use equipment with a device filtering scheme that can track address changes across a loop initialization.
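One possible way to track a device across a loop initialization is to key the filter on an identifier that does not change, and to refresh the device's AL_PA whenever a new address assignment is observed. The sketch below assumes the analyzer can associate each assigned AL_PA with a device's worldwide name. That assumption, the trace_rec record format, and the names dev_filter and filter_accept() are illustrative and do not describe any particular analyzer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified trace records: ordinary frames and address assignments
 * observed during loop initialization. */
enum rec_type { REC_FRAME, REC_ADDR_ASSIGN };

struct trace_rec {
    enum rec_type type;
    uint8_t  src;   /* REC_FRAME: source AL_PA */
    uint8_t  dst;   /* REC_FRAME: destination AL_PA */
    uint64_t wwn;   /* REC_ADDR_ASSIGN: device's worldwide name */
    uint8_t  al_pa; /* REC_ADDR_ASSIGN: address given to that device */
};

/* Filter state: the device is identified by name, not by address. */
struct dev_filter {
    uint64_t wwn;
    uint8_t  al_pa;
    bool     have_addr;
};

/* Returns true if the record belongs in the trace for the filtered
 * device. Address assignments update the filter and are not recorded. */
bool filter_accept(struct dev_filter *f, const struct trace_rec *r)
{
    assert(f != NULL && r != NULL);
    if (r->type == REC_ADDR_ASSIGN) {
        if (r->wwn == f->wwn) {
            f->al_pa = r->al_pa;   /* follow the device to its new address */
            f->have_addr = true;
        } else if (f->have_addr && r->al_pa == f->al_pa) {
            f->have_addr = false;  /* old address now belongs to another device */
        }
        return false;
    }
    return f->have_addr && (r->src == f->al_pa || r->dst == f->al_pa);
}
```

A filter keyed only on the AL_PA would keep recording the old address after re-initialization and capture another device's traffic. Keying on the name avoids that, at the cost of requiring the analyzer to decode the address assignments.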
Event monitoring--When testing devices or systems, it is often desirable to measure a number of performance parameters in real-time. These measurements can be used to verify proper operation, or even to discover problem areas. For example, a systems engineer may detect a problem by comparing an expected ratio of read operations to write operations to the actual ratio.
One way to perform this type of testing is to use an analyzer to trace the desired traffic and then examine the trace for the events of interest. By using a sufficiently long trace, a statistically valid sampling of the number of events over a period of time can be extracted. The same trace can be searched for any number of events, keeping a count of those that are interesting.
While this approach is effective, there are several drawbacks. First, it can be extremely time-consuming to manually scan the trace for specific events. Second, the resulting data is not displayed in real-time. Any transient anomalies in the data will only be seen if they happen to occur during the trace capture. Fortunately, most Fibre Channel analyzers have an event-monitoring mode, which counts a number of event types and provides some measure of their frequency in real-time (see figure).
In this example, the user has defined four events to be monitored, and the analyzer monitors those events and displays a bar graph of their frequency in real-time. The chart to the right of the bar graph lists the total number of counts, the number of counts per second, and the percentage of total events that are attributable to each defined event.
The result is a combination of real-time monitoring and numerical data, providing visual and statistical information about the nature of the Fibre Channel traffic.
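The three statistics described for each event (total count, counts per second, and share of all events) reduce to simple arithmetic over running counters. The following sketch shows one way to compute them. The event_monitor structure, the function names, the four example event types, and the caller-supplied elapsed time are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 4

/* Example event definitions; a user could monitor any four events. */
enum { EV_READ, EV_WRITE, EV_OPN, EV_CLS };

struct event_monitor {
    unsigned long counts[MAX_EVENTS]; /* occurrences of each event */
    double        elapsed_s;          /* monitoring time in seconds */
};

/* Count one occurrence of an event. */
void monitor_record(struct event_monitor *m, size_t event)
{
    assert(m != NULL && event < MAX_EVENTS);
    m->counts[event]++;
}

/* Total occurrences across all monitored events. */
unsigned long monitor_total(const struct event_monitor *m)
{
    unsigned long total = 0;
    size_t i;
    assert(m != NULL);
    for (i = 0; i < MAX_EVENTS; i++)
        total += m->counts[i];
    return total;
}

/* Occurrences per second; 0 if no time has elapsed. */
double monitor_rate(const struct event_monitor *m, size_t event)
{
    assert(m != NULL && event < MAX_EVENTS);
    return m->elapsed_s > 0.0 ? (double)m->counts[event] / m->elapsed_s : 0.0;
}

/* Share of all monitored events, in percent; 0 if nothing was counted. */
double monitor_percent(const struct event_monitor *m, size_t event)
{
    unsigned long total;
    assert(m != NULL && event < MAX_EVENTS);
    total = monitor_total(m);
    return total > 0 ? 100.0 * (double)m->counts[event] / (double)total : 0.0;
}
```

Comparing monitor_percent() for reads and writes gives the read-to-write ratio check mentioned earlier without capturing and searching a trace.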
Fibre Channel's complexity requires advanced testing capability. By selecting tools appropriate to the task, engineers can minimize the amount of effort required while still performing sophisticated test procedures. Careful assessment of testing requirements early in a project helps ensure that the selected equipment will suffice for the life of the project.
Analyzers with event monitoring count event types and display frequency in real-time.
Steven Bucher is president of I-Tech Corp., in Eden Prairie, MN.