A user/consultant offers his views on how storage resource management (SRM) software can help you get a grip on spiraling management costs.
By David Czech
Shrinking budgets and staff, coupled with rapidly growing capacities and data-availability demands, are forcing storage administrators to manage ever-increasing volumes of storage with fewer resources. To cope, storage managers are turning to storage management software to automate repetitive, labor-intensive, error-prone tasks and make managing storage easier and more efficient.
Three of the most helpful categories of storage management software are storage resource management (SRM), storage area network (SAN) management, and storage virtualization. In a previous article (see "User view: Storage management software, part I," InfoStor, November 2003, p. 34), I defined and gave brief overviews of each of these software categories and the potential benefits for storage administrators. This article delves into the SRM part of the puzzle.
In many cases, SRM should be the first piece of storage management software to implement. The main reason for this is that it allows you to discover many details about your data and storage resources. It also lets you easily classify the pieces of data into categories such as mission-critical, essential, important, nice-to-have, archivable, or—possibly—harmful-to-have.
Classifying the data correctly will enable you and your staff to create better policies and to make better decisions about what to do with the data, where to store it, and whether to simply delete it. Mission-critical data should typically be stored on the most highly available, highest-performing storage resource available, while other categories of data should be stored on less-expensive disk arrays, network-attached storage (NAS) filers, tape, or in some cases, just deleted (e.g., .mp3 files). Sometimes there will be exceptions or subsets of the data classification categories, such as mission-critical data that is seldom accessed or read-only.
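To make the idea concrete, here is a minimal sketch of a rule-based classifier. It is not how any particular SRM product works; the `FileInfo` fields, the 180-day cutoff, and the extension list are all hypothetical values chosen for illustration, and the category names are borrowed from the list above.

```python
import os
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400

# Hypothetical policy values; tune them to your own environment.
CONTRABAND_EXTENSIONS = {".mp3"}
ARCHIVE_AFTER_DAYS = 180


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    last_access: float      # POSIX timestamp
    critical: bool = False  # assumed to be flagged by the data owner


def classify(info: FileInfo, now: float) -> str:
    """Return a coarse storage category for one file."""
    ext = os.path.splitext(info.path)[1].lower()
    if ext in CONTRABAND_EXTENSIONS:
        return "harmful-to-have"
    idle_days = (now - info.last_access) / SECONDS_PER_DAY
    if info.critical:
        # Seldom-accessed mission-critical data is treated as its own subset.
        if idle_days > ARCHIVE_AFTER_DAYS:
            return "mission-critical (seldom accessed)"
        return "mission-critical"
    if idle_days > ARCHIVE_AFTER_DAYS:
        return "archivable"
    return "important"
```

In practice the `critical` flag would have to come from the people who own the data; access times and file types are the parts a tool can determine on its own.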
In outline form, the key functions and benefits of SRM are:
- Data classification
- Classify data into appropriate categories depending on:
- Access patterns
- Data criticality
- Data ownership
- Age of data
- Capacity reduction
- Reduce current capacity
- Slow capacity growth rate
- Enable (or simplify) implementation of chargeback mechanisms
- Capacity planning
- Accurately predict growth of various classifications of data
- Growth reduction
- Continually monitor for unused data
- User behavior modification
- Help implement more-stringent storage policies
- Automated archiving
- Basic hierarchical storage management (HSM) functionality
A second reason that SRM tools should be high on your list is that they are often more easily cost justified than other storage management tools. The data collected through SRM tools should allow you to prove to project leaders and business unit managers that they really don't need to keep so much data online. Maybe some of their data should be archived because it is seldom accessed, or maybe it should just be moved to slower, cheaper storage devices. How important is instant access to a set of drawings that hasn't been accessed in six months or more? Should data in home directories belonging to former employees still be stored online? If certain documents are seldom accessed and never changed, it is probably more appropriate to store them on less-expensive (possibly read-only) media.
SRM software allows you to easily discover duplicated, obsolete, unnecessary, or temporary files so that these files can be migrated to cheaper storage devices, archived to tape, or deleted. Using SRM tools, most organizations can easily free up 25% to 50% (or more) of their online disk storage space by archiving old and seldom-accessed files and deleting "contraband" files such as games or .mp3 files. How long could you get by without having to purchase additional capacity if you could move, say, 35% of your data off primary storage arrays and decrease the future capacity growth rate by 35%?
Is 35% a reasonable estimate? This is largely dependent on past policies, work models, etc. You'll have to decide for yourself, but in most environments I have seen, 35% may actually be a very conservative estimate.
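One way to get a first feel for your own number is to look for duplicated files. The sketch below is an illustration only, not a description of any vendor's method: it groups files by size, then confirms duplicates with a SHA-256 hash of the contents. The function names are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: str) -> list[list[str]]:
    """Return groups of paths under `root` whose contents are identical."""
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups
```

Comparing sizes first avoids hashing most files, which matters when the scan covers a large file server.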
Justifying the cost
Cost justification for SRM tools is usually relatively simple. Let's take a typical example of a 35% reduction in current online storage requirements. The chart on p. 44 compares the anticipated online growth of storage capacity for 36 months, growing at a rate of 1/12 per month (100% per year), to the growth of an environment with a 35% reduction in current capacity requirements and growth rate.
Obviously, reducing your growth rate to 65% of its current rate would result in significantly lower total storage capacity requirements. In 36 months, this environment would grow to only about 5TB of online capacity, compared to about 15TB without the 35% reduction in capacity requirements.
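The arithmetic behind this comparison can be sketched as follows. The starting capacity is not stated in the text, so the 1.9TB figure below is an assumption, as is the choice to compound growth annually; other readings of "1/12 per month" would give different totals.

```python
def project_capacity(start_tb: float, annual_growth: float, months: int) -> float:
    """Capacity after `months`, assuming growth compounds at `annual_growth` per year."""
    return start_tb * (1.0 + annual_growth) ** (months / 12.0)


START_TB = 1.9    # assumed starting capacity; not given in the article
REDUCTION = 0.35  # 35% cut in both current capacity and growth rate

baseline = project_capacity(START_TB, 1.00, 36)
reduced = project_capacity(START_TB * (1 - REDUCTION), 1.00 * (1 - REDUCTION), 36)

print(f"baseline: {baseline:.1f} TB, with 35% reduction: {reduced:.1f} TB")
```

Under these assumptions the baseline comes out near 15TB and the reduced case between 5TB and 6TB, which is in line with the figures quoted above.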
In addition, reducing online capacity requirements will begin reducing your storage maintenance costs almost immediately.
The first area to realize cost savings will probably be backup operations. Less time will need to be spent repeatedly backing up unused or unnecessary data; less tape media will need to be purchased to contain the backups; less labor expense will be spent on monitoring and maintaining backups and changing out tapes; and fewer tapes will need to be stored in off-site facilities. More time can be spent backing up and maintaining critical data. Tape throughput and capacity demands will be decreased, allowing you to delay expansion of your tape backup infrastructure. Depending on your software-licensing model, you may also be able to realize licensing savings by backing up less data.
Further cost savings will be realized soon because you will be able to delay purchases of additional disk arrays and interconnect hardware (e.g., SAN switches), along with the management software and personnel required to maintain and operate these devices. If you can free up 35% of your space, it is at least as good as (and probably far better than) purchasing new capacity to meet storage demands. If your growth rate is also slowed by 35%, future savings can be substantial.
Of course, with a decrease in overall storage capacity comes a decrease in storage management and hardware maintenance costs, which can be significant as your storage infrastructure increases in complexity. Training existing staff on new technologies, while increasing salaries to retain them, can be very expensive, and most CFOs would love to see these expenses reduced.
How long would it take to justify the purchase of an SRM solution? To make it even easier, some vendors offer money-back guarantees. Most of these guarantees state that if you implement the vendor's SRM software you will be able to reduce online storage capacity by some percentage—usually somewhere between 25% and 50%. If you haven't already implemented some form of SRM, I would advise doing so as soon as possible.
How does SRM work?
SRM tools generally look similar and function in similar ways. They gather information on what is being stored, and where it is being stored, from various servers in your environment, most often through agents installed on each of the servers. They store that information in a central database or repository.
The types of information that can be gathered vary depending on which vendor's SRM tool you are using. Every SRM tool should make it a fairly quick and easy process to gather file-system capacity data. Most tools also collect important file-system metadata, including details such as file owner, last access time, last backup time, etc. Some SRM tools keep track of additional information, such as frequency of access, user ID of last access, etc.
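As a rough illustration of the kind of metadata collection described here, the sketch below walks a directory tree and records what the operating system exposes for each file. `FileRecord` and `scan` are names invented for this example. Last backup time is not included because the file system does not store it; a real tool would have to get it from the backup software. Access times can also be unreliable on file systems mounted with access-time updates disabled.

```python
import os
from typing import Iterator, NamedTuple


class FileRecord(NamedTuple):
    path: str
    owner_uid: int        # numeric owner ID as reported by the OS
    size_bytes: int
    last_access: float    # POSIX timestamps
    last_modified: float


def scan(root: str) -> Iterator[FileRecord]:
    """Yield one FileRecord per file found under `root`."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            yield FileRecord(path, st.st_uid, st.st_size, st.st_atime, st.st_mtime)
```

An agent built along these lines would run on each server and ship its records to the central repository rather than keeping them locally.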
Once this data is collected, you can run a variety of reports included in your SRM software or devise your own reports to slice and dice the data in whatever way is most useful to you. These reports can tell you who your biggest data storage users are, what types of files they are storing, and how often these files are accessed. They can also produce capacity and usage data to feed into chargeback systems (or other automation processes).
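A report of this kind is essentially an aggregation over the collected records. The following sketch, with a hypothetical `usage_report` function and made-up sample data, totals bytes by owner and by file extension; the per-owner totals are the sort of figures a chargeback system would consume.

```python
import os
from collections import Counter


def usage_report(records):
    """Summarize (owner, path, size_bytes) tuples by owner and by file type.

    Returns two lists of (key, total_bytes), each sorted largest first.
    """
    by_owner = Counter()
    by_type = Counter()
    for owner, path, size in records:
        by_owner[owner] += size
        ext = os.path.splitext(path)[1].lower() or "(none)"
        by_type[ext] += size
    return by_owner.most_common(), by_type.most_common()


# Made-up sample data for illustration only.
sample = [
    ("alice", "/home/alice/plan.dwg", 700),
    ("alice", "/home/alice/song.mp3", 300),
    ("bob", "/home/bob/notes.txt", 50),
]
owners, types = usage_report(sample)
print(owners)  # largest consumers first
print(types)
```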
The data collected by SRM software can be especially useful for determining if your storage is being used efficiently or if you should be migrating old data to less-expensive storage devices (which can be accomplished with some SRM tools).
Most SRM software includes features such as the ability to kick off scripts if certain thresholds are reached (for example, deleting all .mp3 files if file-system utilization reaches 85%).
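A threshold check of this sort might look like the sketch below. It is deliberately a dry run: it only lists the files that a cleanup script would target once utilization reaches the threshold, and leaves the deletion to a separate, reviewed step. The function names and the `current` override (used here to make the logic testable) are assumptions for this example.

```python
import os
import shutil


def utilization(path: str) -> float:
    """Fraction of the file system holding `path` that is in use (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def cleanup_candidates(root, extensions=(".mp3",), threshold=0.85, current=None):
    """List files under `root` with the given extensions, but only when
    utilization has reached `threshold`. Nothing is deleted here."""
    current = utilization(root) if current is None else current
    if current < threshold:
        return []
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.lower().endswith(tuple(extensions)):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Keeping the selection separate from the action makes it easy to review what a policy would remove before letting it run unattended.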
Some SRM tools provide much more automation than other packages. The more feature-rich SRM tools allow administrators to create policies that automatically take various actions when certain conditions are met. These policies and actions could be complex, such as initiating a backup to tape and subsequent deletion of all files that haven't been accessed in the last, say, 30 days.
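The archive-then-delete policy can be sketched in a few lines. This is an illustration under simplifying assumptions, not a vendor implementation: a compressed tar file stands in for the tape backup, `archive_stale` is a made-up name, and deletion is off by default so the archive can be verified first.

```python
import os
import tarfile
import time

SECONDS_PER_DAY = 86_400


def archive_stale(root, archive_path, max_idle_days=30, now=None, delete=False):
    """Archive files under `root` not accessed in `max_idle_days` days.

    Returns the list of archived paths. Files are removed only if
    `delete` is True, and only after the archive has been written.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY

    stale = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                if os.stat(path).st_atime < cutoff:
                    stale.append(path)
    if not stale:
        return []

    # A tar file stands in for the tape backup in this sketch.
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in stale:
            tar.add(path, arcname=os.path.relpath(path, root))

    if delete:
        for path in stale:
            os.remove(path)
    return stale
```

The archive should be written outside `root`, and the policy depends on access times being recorded accurately by the file system.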
Another automation feature that some of the SRM tools include is the ability to provision new storage to meet demand for growing file systems. Some of the SRM tools can interact with operating systems, file systems, and storage devices to dynamically grow capacity so that the file system will not run out of space.
Of course, you could implement some of this functionality manually. For example, you could set up manual or batch processes to collect most of the file metadata, then ask the owners of various projects and business units to tell you which pieces of their data fit into which categories. But do you think they really have a handle on that? Do they keep track of the data access patterns? More likely, they can only make generalizations and predictions of how much capacity they will need, and they are usually extremely reluctant to relinquish storage capacity once it has been allocated to them. Unlike business unit leaders or project owners, SRM tools do not have political or business aspirations. They report only on what they actually see, never padding the results just to make sure a "little" cushion of capacity exists for some specific project or another.
SRM software should be the starting point of any effort to bring order to the storage management chaos. Without the information gathered through SRM software, it will be difficult for storage managers to really understand, sort out, and prioritize all of the various pieces of data within their storage infrastructures. Without this information, it will be almost impossible to even come up with appropriate policies for data storage, justify them to management, and implement and maintain them.
David Czech is a senior storage engineer and principal with Distributed Office Environments, a storage management and disaster-recovery consulting firm based in Denver. He can be contacted at firstname.lastname@example.org.
Representative SRM vendors