By Mark Teter
Once storage area network (SAN) vendor evaluations and cost justifications are complete, IT departments must begin the process of providing support and maintenance activities for the new storage infrastructure. The IT group must completely understand how to provide SAN system administration and maintenance without causing business disruptions or outages.
For SAN assurance testing, IT managers must document, test, and validate the storage network configuration. This includes performing end-to-end performance evaluations, testing fail-over scenarios, and ensuring interoperability. The best way to begin this process is to develop test suites that fully "exercise" the SAN. Test suites are useful tools because they methodically certify that the SAN operates according to vendor promises and the organization's expectations. They can also be used to troubleshoot the environment when it is not performing as expected, helping to determine and diagnose where problems exist within the SAN infrastructure.
A test assurance plan is a document that describes the objectives, scope, and focus of the testing effort. The process of preparing a test plan is a useful way to think through the efforts needed to validate the acceptability of the SAN architecture. The completed document will help people outside the test group understand the 'why' and 'how' of product validation. It should be thorough enough to be useful, but not so thorough that no one outside the test group will read it.
The test planning document comprises test suites and individual test cases. Test suites outline the general areas of the testing effort. Test cases are more detailed scenarios that describe an action or event and the expected response, in order to determine whether the SAN is working correctly. Each test case should contain information such as test case identifier, test case name, objective, test conditions/setup, requirements, and results.
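As an illustration of the suite and case structure described above, the following Python sketch models a test case with the listed fields. The class names, method names, and sample values are assumptions made for this example, not part of any standard or tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SanTestCase:
    """One test case, carrying the fields listed in the text (names are illustrative)."""
    identifier: str
    name: str
    objective: str
    setup: str                       # test conditions/setup
    requirements: List[str] = field(default_factory=list)
    result: Optional[str] = None     # stays None until the test has been run

    def record_result(self, passed: bool, notes: str = "") -> None:
        """Store a PASS/FAIL status, with optional free-text notes."""
        status = "PASS" if passed else "FAIL"
        self.result = f"{status}: {notes}" if notes else status


@dataclass
class SanTestSuite:
    """A general testing area that groups related test cases."""
    name: str
    cases: List[SanTestCase] = field(default_factory=list)

    def pending(self) -> List[SanTestCase]:
        """Return the cases that have no recorded result yet."""
        return [c for c in self.cases if c.result is None]


# Hypothetical example values, for illustration only.
case = SanTestCase(
    identifier="FO-001",
    name="Single path fail-over",
    objective="Confirm I/O continues when one host path is removed",
    setup="Host with two paths to the array under steady I/O load",
    requirements=["Dual-path host", "I/O generator running"],
)
suite = SanTestSuite(name="Fail-over", cases=[case])
```

Keeping results in the same record as the objective and setup makes it easy to see which cases in a suite are still outstanding.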
Test suites should verify three important characteristics of the SAN: availability/reliability, scalability, and performance. The storage network should be highly available not only from the data storage perspective (e.g., RAID protection), but also from the overall data delivery schema. The SAN is scalable if it has the ability to grow without requiring sizable up-front investments.
The SAN should have adequate performance to minimize I/O bottlenecks. Performance-related metrics are often a good indication of the overall health of the SAN. Performance testing should answer questions such as: How many disks can be supported behind a fabric port? What is the performance impact of cascading additional switches? Is application load balancing needed? Does SAN segmentation improve backup performance?
Test design approach
IT managers also need honest answers to broader questions such as: Can my current system handle the load if capacity doubles? Will the proposed architecture handle our production environment? Proper SAN assurance testing helps answer these questions, ensuring the storage network will meet the business requirements.
Another way to deal with these questions is to buy SAN solutions through vendors or systems integrators that perform testing and compatibility checking for you. However, when planning the acquisition and growth of storage solutions, it is difficult to predict the performance and operational needs of your organization. The only recourse is to measure performance, interoperability, and fail-over behavior as it directly relates to your application requirements.
Obviously, the best place to validate these requirements is in a controlled test environment or SAN lab. After the SAN lab is installed and configured, the first step is to document all equipment configurations and software revisions used in the storage network. Table 1 illustrates the detail of documentation that should be recorded. Keeping configuration information is an important aspect of assurance testing, and should be managed through normal IT configuration management practices. This information represents the "recipe" of the SAN, and is an essential part of the maintenance documentation or "run book." It is very easy for organizations to lose control of managing component configurations and software revisions, because vendors are constantly updating their products. It is common to suddenly begin having problems within the SAN due to component configuration changes.
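One way to catch the configuration drift described above is to compare the current component inventory against the documented baseline. The sketch below assumes the "recipe" can be reduced to a mapping of component name to revision string; the function name and the sample component names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed representation: component name -> software/firmware revision string.
Inventory = Dict[str, str]


def diff_inventory(baseline: Inventory, current: Inventory) -> Dict[str, list]:
    """Report components that were added, removed, or changed revision."""
    added: List[str] = sorted(set(current) - set(baseline))
    removed: List[str] = sorted(set(baseline) - set(current))
    changed: List[Tuple[str, str, str]] = sorted(
        (name, baseline[name], current[name])
        for name in set(baseline) & set(current)
        if baseline[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical inventories, for illustration only.
baseline = {"switch-a": "2.1", "hba-1": "1.0"}
current = {"switch-a": "2.2", "array-1": "5.0"}
report = diff_inventory(baseline, current)
```

Running a comparison like this before and after any vendor update gives the run book a record of exactly what changed when problems appear.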
The next step is to develop a test plan. The test plan is used to construct individual test cases that generate performance and interoperability data. The test cases will provide an understanding of the advantages of the SAN architecture, as well as uncover any disadvantages of the design. It is important when developing the plan to use a consistent naming convention for the storage components in order to facilitate the testing process. Through identifying known points of failure within the infrastructure, test cases can be written to validate concerns such as fail-over behavior, device management, recovery processes, performance issues, and scalability limits. Table 2 shows several high-level test suites. An individual test case is illustrated in Table 3.
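A naming convention is easier to keep consistent when it can be checked mechanically. The sketch below assumes a purely hypothetical convention of site code, component type, and two-digit index (for example, "den-sw-01"); the pattern and type codes are invented for illustration and should be replaced with whatever convention the organization adopts.

```python
import re
from typing import Iterable, List

# Hypothetical convention: <3-letter site>-<component type>-<two-digit index>.
NAME_PATTERN = re.compile(r"[a-z]{3}-(sw|hba|arr|tape)-\d{2}")


def nonconforming(names: Iterable[str]) -> List[str]:
    """Return the component names that do not follow the assumed convention."""
    return [n for n in names if not NAME_PATTERN.fullmatch(n)]


# Hypothetical component names, for illustration only.
bad_names = nonconforming(["den-sw-01", "Switch1", "den-arr-7"])
```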
Assurance testing tools
I/O generators are used to write and read "data" to and from the SAN fabric. Popular I/O generators, or benchmarks, used for SAN testing include Iometer, Iozone, and Postmark. These tools create I/O loads that stress the storage network end to end and help validate operational limits and I/O performance characteristics. A good source for benchmarking tools is http://www.raid-storage.com/benchmarks.html.
The Iometer benchmark was written by Intel. It is an I/O subsystem measurement and characterization tool that includes correlation functionality that assists with performance analysis. Iometer measures the end-to-end performance of a SAN without cache hits. If write or read requests go to the cache on the controller (a cache hit) rather than to the disk subsystems, performance metrics will be artificially high. (For more information on Iometer, see "RAID benchmarking workload analysis" in this issue, or visit http://developer.intel.com/design/serversdevtools/iometer/index.htm.)
Iozone is a file system benchmark tool that generates and measures a variety of file operations. It has been ported to many systems and is useful for performing a broad range of file system tests and analysis.
Postmark was designed to create a large pool of continually changing files, measuring the transaction rates of a workload that resembles a large Internet mail server. Benchmarking tools offer a variety of capabilities and should be selected based on how closely they reproduce the I/O characteristics of your application environment.
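To make the idea of an I/O generator concrete, the following Python sketch writes and then reads back a scratch file and reports throughput for each phase. It is a toy illustration, not a substitute for Iometer, Iozone, or Postmark; the function name, block size, and block count are assumptions chosen for the example.

```python
import os
import tempfile
import time


def measure_throughput(directory: str, block_size: int = 64 * 1024,
                       blocks: int = 256) -> dict:
    """Sequentially write, then read, a scratch file and return MB/s per phase."""
    payload = os.urandom(block_size)
    total_mb = block_size * blocks / (1024 * 1024)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(payload)
            f.flush()
            os.fsync(f.fileno())    # ask the OS to flush its cache to the device
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        bytes_read = 0
        with open(path, "rb") as f:
            while chunk := f.read(block_size):
                bytes_read += len(chunk)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)             # always clean up the scratch file
    return {
        "write_mb_s": total_mb / max(write_s, 1e-9),
        "read_mb_s": total_mb / max(read_s, 1e-9),
        "bytes_read": bytes_read,
    }


if __name__ == "__main__":
    # Point `directory` at a file system that lives on the SAN under test.
    print(measure_throughput(tempfile.gettempdir()))
```

The read phase here will usually be served from the host's own cache because the file was just written, so the read figure is likely to be inflated. That is the same cache-hit distortion described above, and it is one reason purpose-built benchmarks are preferable for real measurements.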
SAN assurance testing explores performance and operability issues of the storage network, as well as the manageability of SAN components. The purpose of developing and performing test suites is to evaluate performance and operational limits in order to assess impacts from storage-related failures and errors. Ultimately, this information serves as a valuable troubleshooting guide for the IT organization.
Mark Teter is director of storage solutions at Advanced Systems Group (www.virtual.com), an enterprise computing and storage consulting firm in Denver.