SAN management user case study, part 2

The second installment in our four-part series on how a SAN manager solved his SAN management issues focuses on performance monitoring and cost justification.

Pay for performance: Justifying SAN performance tools

By David Floyer, Wikibon.org

Ryan Perkowski, a SAN manager at a major U.S. credit card firm, has 450TB installed. Half of it (225TB) is performance- and availability-critical. At today's prices for Tier-1 storage, that is a current storage value of less than $2 million on the floor. So why did Perkowski pay $450,000 for Virtual Instruments' NetWisdom SAN monitoring tools?

Perkowski convinces his customers by properly costing out the all-inclusive price per TB that the end user pays. The cost to his business customers for mission-critical Tier-1 storage is $60,000 per TB, ten times the purchase price. This includes the costs of backup and recovery, performance and availability assurance, additional copies, the storage network, compliance, storage staff and management tools. Sure, if the project will work with Tier-2 storage, go for it. But if those services turn out to be required after all, adding them later will cost the project team far more than taking the standard storage services up front. If a performance SAN is required, the cost of the monitoring software is a small proportion of the total cost.
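The proportion can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this article; spreading the whole $450,000 tool cost over the 225TB of critical storage is an assumption made here for illustration, not something Perkowski stated, and the function names are hypothetical.

```python
# Figures quoted in the article.
CRITICAL_TB = 225             # performance- and availability-critical capacity
ALL_IN_COST_PER_TB = 60_000   # dollars charged per TB of mission-critical Tier-1 storage
PURCHASE_MULTIPLE = 10        # all-inclusive cost is ten times the purchase price
TOOL_COST = 450_000           # dollars paid for the SAN monitoring tools


def purchase_price_per_tb(all_in_cost_per_tb: float, multiple: float) -> float:
    """Purchase price implied by the all-inclusive cost and its multiple."""
    return all_in_cost_per_tb / multiple


def tool_cost_share(tool_cost: float, capacity_tb: float,
                    all_in_cost_per_tb: float) -> float:
    """Tool cost as a fraction of the all-inclusive cost of the capacity.

    Assumption: the whole tool cost is set against `capacity_tb`.
    """
    return tool_cost / (capacity_tb * all_in_cost_per_tb)


price = purchase_price_per_tb(ALL_IN_COST_PER_TB, PURCHASE_MULTIPLE)
share = tool_cost_share(TOOL_COST, CRITICAL_TB, ALL_IN_COST_PER_TB)

print(f"Implied purchase price per TB: ${price:,.0f}")
print(f"All-inclusive cost of critical storage: ${CRITICAL_TB * ALL_IN_COST_PER_TB:,.0f}")
print(f"Monitoring tools per critical TB: ${TOOL_COST / CRITICAL_TB:,.0f}")
print(f"Monitoring tools as share of all-inclusive cost: {share:.1%}")
```

Under that assumption, the tools come to $2,000 per critical TB, or roughly 3% of the $60,000 all-inclusive figure.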

What are the benefits of SAN performance knowledge? There are four levels of justification:

1. Cost avoidance: Storage that would have been bought to solve a perceived I/O problem is not bought once the data shows that the problem is not actually a storage problem. In Perkowski's case, he avoided having to upgrade an EMC DMX3 to a DMX4, because the Virtual Instruments probe provided detailed and correlated historical information showing that reducing the block size at the database level would solve the performance problem. This more than paid for the whole installation. By knowing the end-to-end performance characteristics across the SAN, he avoided the cost of over-provisioned "just in case" storage.

2. Time to solution: When teams know that a problem is not in the SAN, and have access to a wealth of information to help identify the server-side problems, projects can be rolled out more quickly. In Perkowski's case, the SAN monitoring tools are being used by the database groups to help them implement better solutions and solve problems faster. The price of additional probes is included as part of the development cost.

3. Rationalization of storage software on the SAN components, particularly at the switch and array level: Each SAN component includes device-specific management tools, but end-to-end management should be done with heterogeneous tools that take data from all SAN components and correlate it historically. Perkowski is in the process of removing some device-specific management software and saving a bundle.

4. Informed planning: In Perkowski's case, the SAN monitoring tools show that data growth has gone from exponential to linear, but that access density is going through the roof because the data is being exploited much more heavily. Perkowski is in a position to know that he has to think about high-performance storage architectures that are capable of delivering much higher levels of IOPS/TB than the current architecture can provide. He has the data and charts to show it, and the confidence of senior IT management that there is value in exploiting the data and that there is a justified price to pay for improved storage IOPS performance.
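Access density here means IOPS per TB of capacity. As a minimal sketch of how the trend described in the fourth point could be tracked, the snippet below computes IOPS/TB for a series of samples. The sample values are invented purely for illustration (capacity growing linearly, I/O demand growing faster) and are not Perkowski's measurements; the function names are hypothetical.

```python
from typing import List, Tuple


def access_density(iops: float, capacity_tb: float) -> float:
    """Return access density in IOPS per TB."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return iops / capacity_tb


def density_trend(samples: List[Tuple[float, float]]) -> List[float]:
    """Access density for each (capacity_tb, iops) sample, in time order."""
    return [access_density(iops, tb) for tb, iops in samples]


# Hypothetical samples: (capacity in TB, peak IOPS), oldest first.
samples = [(100, 50_000), (110, 77_000), (120, 120_000)]

for (tb, iops), density in zip(samples, density_trend(samples)):
    print(f"{tb} TB, {iops} IOPS -> {density:.0f} IOPS/TB")
```

In this invented series, capacity grows by 20% while access density doubles, which is the kind of divergence that would point toward a higher-IOPS architecture rather than simply more capacity.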

So are end-to-end SAN performance monitoring tools the solution to all performance problems? They do have limitations. They provide a snapshot of the SCSI conversations between HBAs and disks. But if a component of the SAN does not provide detailed data, the correlations that depend on that data will be missing. In addition, SAN management tools do not provide an application-level view of performance, or show whether the storage system is meeting the SLAs for that application. And in a virtual server environment, they do not show the relation between I/Os and virtual machines.

This data is required in the open systems arena. But developing it requires a new management model and new standards. Companies such as EMC are attempting to introduce these models and tools in VMware, and the recent partnership between HP and Microsoft claims to be aiming to solve the same problem.

Currently, it requires an army of experts to solve a deep performance problem in a virtual machine environment. The creation of more proprietary stacks should eventually provide better end-to-end management tools and reduce the size of the army. Eventually, the tools may be automated and eliminate the army altogether.

Action Item: While we are waiting for nirvana, IT storage managers and senior IT management with high-performance SANs would do well to follow Perkowski's philosophy: Focus on a few best-of-breed third-party tools for end-to-end SAN management (in addition to Virtual Instruments' NetWisdom, Perkowski uses NetApp's SANscreen software), and use vendor-specific tools for component management. And get rid of everything else.

David Floyer is a member of Wikibon.org.

Related article:
SAN management user case study, part 1: Opening Pandora's box of SAN management


Turning SAN dark arts into light science

By David Floyer

High-performance SANs provide optimal performance for mission-critical workloads and need to be managed as a whole. The switches and storage arrays that comprise the SAN each have data collection capabilities but do not provide an end-to-end view.

The most logical and cost-effective place to tap into the key performance data of a SAN is at the storage ports. Wikibon recommends that high-performance SANs be enabled to collect end-to-end performance data by putting splitters between the storage array ports and the rest of the SAN. Splitters are passive devices that divert a percentage of the light from a fibre cable so that it can be fed to performance data analyzers. The cost is approximately $300 per storage port and should be built into the purchase and operating procedures for all high-performance SANs.
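For budgeting, the per-port figure scales linearly with the number of storage ports. A rough sketch, assuming the approximate $300 per port quoted above and a hypothetical port count (the article does not say how many ports any particular SAN has):

```python
SPLITTER_COST_PER_PORT = 300  # approximate dollars per storage port, from the article


def splitter_budget(storage_ports: int,
                    cost_per_port: float = SPLITTER_COST_PER_PORT) -> float:
    """Estimated cost of fitting one splitter to every storage port."""
    if storage_ports < 0:
        raise ValueError("storage_ports cannot be negative")
    return storage_ports * cost_per_port


example_ports = 64  # hypothetical port count, for illustration only
print(f"{example_ports} ports -> ${splitter_budget(example_ports):,.0f}")
```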

The steps to fitting splitters are:

1. Determine if multi-pathing has been correctly set up for each port, and ensure that any fail-over infrastructure that should have been there is actually there!

2. Retrofit splitters to all existing storage ports (non-disruptive if step 1 has been done).

3. Set in place purchase and installation procedures and training to ensure that splitters are installed with every new array.
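Step 1 is what makes step 2 non-disruptive: a storage port can only be unplugged to fit a splitter if every host that uses it has another path. The sketch below illustrates that check on a deliberately simplified model, a mapping from host name to the storage ports it can reach. The host and port names are hypothetical, and a real check would also need to consider fabrics, HBAs and path states, which this sketch ignores.

```python
from typing import Dict, Iterable, List


def single_path_hosts_by_port(
    paths_by_host: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Map each storage port to the hosts that depend on it alone.

    A host counts as single-pathed when all of its paths lead to the
    same storage port, so disconnecting that port would cut it off.
    """
    at_risk: Dict[str, List[str]] = {}
    for host, ports in sorted(paths_by_host.items()):
        distinct_ports = set(ports)
        if len(distinct_ports) == 1:
            (only_port,) = distinct_ports
            at_risk.setdefault(only_port, []).append(host)
    return at_risk


# Hypothetical inventory: host -> storage ports reachable by its paths.
inventory = {
    "db01": ["port-A", "port-B"],   # properly multi-pathed
    "db02": ["port-A"],             # single path
    "app01": ["port-B", "port-B"],  # two paths, but to the same port
}

for port, hosts in single_path_hosts_by_port(inventory).items():
    print(f"Fix multi-pathing before touching {port}: {', '.join(hosts)}")
```

Ports that do not appear in the result have no single-pathed hosts in this model and are candidates for retrofitting first.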

In a recent Wikibon Peer Incite meeting, Ryan Perkowski, a SAN manager at a major U.S. credit card firm, talked about the importance of choosing best-of-breed tools for monitoring the SAN as a whole. For the end-to-end performance analyzer, he chose Virtual Instruments' NetWisdom, which provides software probes that record and help analyze the data from all components of the SAN. For overall SAN management, he chose NetApp's SANscreen software, and for array management he chose tools from the array vendor (EMC).

This article was originally published on January 28, 2010