Building an effective storage environment is a balancing act between performance, reliability, and efficiency. For years, improving storage efficiency took a back seat to the other goals. Now, with infrastructure budgets being cut and data growth continuing unabated, it’s time to take a closer look at storage efficiency to see if there is room for improvement.
Most storage teams don’t have rigorous processes in place to measure key performance indicators (KPIs) of storage efficiency, and this hinders their ability to make improvements. Think of these KPIs as an essential health check: they tell you what the problem is and how severe it is, but solving the problem will require long-term rehabilitation and fundamental changes to daily routines. Every storage team should periodically evaluate its environment against these KPIs to determine where it sits in terms of efficiency. Some of the key metrics for storage efficiency include the following.
• Utilization measures how much of your storage capacity is actually in use and, by extension, how much is wasted. Utilization is probably the single most important metric of efficiency, and also the least consistently measured by storage teams. There are many ways to measure capacity, including raw, usable, allocated, and consumed. Some of these are easier to measure than others, and each sheds different light on utilization efficiency, but looking carefully at utilization is the first step in measuring and improving storage efficiency.
• Tier distribution measures cost effectiveness. A given tier of storage service is a combination of network interface, class of storage array, RAID level, drive type, and other features. Given these combinations, there are many ways to build a given quantity of usable storage capacity to meet the needs of business and application owners. Measuring your tier distribution is the way to know if you have too much or too little performance and availability built into the environment. Many storage teams rely on too much high-performance, highly resilient tier 1 storage for the sake of consistency, while many of their applications and file stores could be served by a less expensive tier of storage.
• Scale up per array measures the level of consolidation. The first terabyte (TB) you buy is always the most expensive in the array. The first TB requires an investment in the frame, cache, controllers, and network interface cards of the array itself. Once you’ve made this initial investment, adding capacity to the array is much less expensive. The more you can build out each array towards its maximum scalability point, the more efficient the environment will be in terms of acquisition cost and management cost. Many arrays today can scale to hundreds of TB, but most firms don’t take advantage of this scale, driving up their cost of operations.
• The number of vendors drives complexity. Firms have heterogeneous environments for a wide variety of reasons. Some have inherited gear from acquisitions. Others have allowed individual departments to make their own storage product selections. Some look for best-of-breed products for each individual area, and some want to increase their negotiation power by playing vendors against each other. While these may be valid reasons, the result of too much heterogeneity is high management complexity and reduced efficiency.
• Staffing levels can uncover inefficiencies. Measuring how much staff is required to run your environment can tell you a great deal about how efficient you are. Whether you are insourced or outsourced, and whether you use internal people or contractors, keeping a close eye on how many people it takes to deliver storage services can uncover inefficiencies. You could lack the appropriate storage resource management (SRM) tools for a large environment; heterogeneity could be driving up the management complexity; or it could even be that your storage administrators need more training. Whatever the reason, given that the fully burdened cost of an experienced storage administrator is typically $100,000 or more per year, it’s in your best interest to watch staffing levels.
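As a rough illustration, the metrics above can be computed from a simple array inventory. The sketch below is written in Python and is only one way to frame the arithmetic: the StorageArray fields, the function names, the vendor names, and every capacity figure are hypothetical, and the choice of ratios (for example, consumed capacity as a share of raw capacity) is an assumption rather than an industry standard. In practice the inputs would come from whatever SRM tools or array reports you have available.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class StorageArray:
    """One array in a hypothetical inventory; all capacities are in TB."""
    name: str
    vendor: str
    tier: int
    max_tb: float        # maximum raw capacity the array can scale to
    raw_tb: float        # raw capacity currently installed
    usable_tb: float     # capacity left after RAID and sparing overhead
    allocated_tb: float  # capacity provisioned to hosts
    consumed_tb: float   # capacity actually holding data


def utilization(arrays):
    """Ratios between successive capacity views across the whole inventory."""
    raw = sum(a.raw_tb for a in arrays)
    usable = sum(a.usable_tb for a in arrays)
    allocated = sum(a.allocated_tb for a in arrays)
    consumed = sum(a.consumed_tb for a in arrays)
    return {
        "usable_of_raw": usable / raw,
        "allocated_of_usable": allocated / usable,
        "consumed_of_allocated": consumed / allocated,
        "consumed_of_raw": consumed / raw,
    }


def tier_distribution(arrays):
    """Share of total usable capacity that sits in each tier."""
    total = sum(a.usable_tb for a in arrays)
    by_tier = defaultdict(float)
    for a in arrays:
        by_tier[a.tier] += a.usable_tb
    return {tier: tb / total for tier, tb in sorted(by_tier.items())}


def scale_up(arrays):
    """How far each array is built out toward its maximum capacity."""
    return {a.name: a.raw_tb / a.max_tb for a in arrays}


def vendor_count(arrays):
    """Number of distinct vendors in the inventory."""
    return len({a.vendor for a in arrays})


def usable_tb_per_admin(arrays, admins):
    """Usable capacity managed per storage administrator."""
    return sum(a.usable_tb for a in arrays) / admins


# Made-up sample inventory, for illustration only.
inventory = [
    StorageArray("array-1", "VendorA", 1, 400, 200, 150, 120, 60),
    StorageArray("array-2", "VendorB", 2, 200, 100, 50, 30, 15),
]

if __name__ == "__main__":
    print(utilization(inventory))
    print(tier_distribution(inventory))
    print(scale_up(inventory))
    print(vendor_count(inventory), usable_tb_per_admin(inventory, admins=4))
```

Keeping each capacity view as a separate field makes it possible to see where capacity is lost, whether to RAID overhead, to unallocated space, or to space that is allocated but never written.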
It’s not easy to get a handle on these KPIs, and in many cases there is no single SRM tool that can give you all the information you need to do it effectively. Benchmarks are hard to come by because many firms don’t measure these numbers consistently, and those that do are loath to share their results outside their walls.
However, think of this as an ongoing process: establish an internal baseline for where you are today so that you have a reference point against which to measure how your environment changes over time. To be successful, be consistent in measurement, evaluate the possible impact of environment changes before moving forward, and look at metrics both for the whole environment and for key subsets such as regions, storage tiers, and application stacks to identify trouble spots and areas of effectiveness.
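One way to picture the baseline approach is as a comparison between two snapshots of the same KPIs, taken for the whole environment and for each subset. The Python sketch below assumes a hypothetical layout in which each snapshot maps a subset name to its KPI values; the subset names, KPI names, and figures are invented for illustration.

```python
def compare_to_baseline(baseline, current):
    """Per-subset change in each KPI (current minus baseline).

    Subsets or KPIs with no baseline value are skipped, since there is
    nothing to compare them against yet.
    """
    changes = {}
    for subset, kpis in current.items():
        base = baseline.get(subset)
        if base is None:
            continue
        changes[subset] = {
            kpi: round(value - base[kpi], 4)
            for kpi, value in kpis.items()
            if kpi in base
        }
    return changes


def declining(changes, kpi):
    """Subsets where the given KPI fell relative to the baseline."""
    return sorted(s for s, d in changes.items() if d.get(kpi, 0) < 0)


# Hypothetical snapshots: whole environment, one tier, and two regions.
baseline = {
    "all": {"consumed_of_raw": 0.25, "tier1_share": 0.75},
    "tier1": {"consumed_of_raw": 0.40},
    "emea": {"consumed_of_raw": 0.30},
}
current = {
    "all": {"consumed_of_raw": 0.30, "tier1_share": 0.70},
    "tier1": {"consumed_of_raw": 0.35},
    "emea": {"consumed_of_raw": 0.30},
    "apac": {"consumed_of_raw": 0.20},  # new subset, no baseline yet
}

if __name__ == "__main__":
    changes = compare_to_baseline(baseline, current)
    print(changes)
    print(declining(changes, "consumed_of_raw"))
```

A comparison like this is only meaningful if both snapshots were measured the same way, which is why consistency in measurement matters more than the exact choice of metric.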
Andrew Reichman is a senior analyst at Forrester Research, where his research for infrastructure & operations professionals focuses on data storage systems, networking, and management software. Andrew will be speaking at Forrester’s IT Forum, May 26-28 in Las Vegas.