Service level agreements are the key to information life-cycle management (ILM) and tiered storage.
By Mike Drapeau
Much continues to be written about the need to better understand, design, and implement information life-cycle management (ILM) and tiered storage solutions. Most of the major storage and system vendors provide evolving hardware, software, and services designed to enable users to better match infrastructure with business needs. Service level agreements (SLAs) are the glue that holds together a tiered storage environment and enables ILM.
At first glance, SLAs seem to be a straightforward concept. You establish some basic levels of service and give each a name (platinum, gold, silver, and bronze, for instance), with each level representing a bundled set of features. Organizations then evaluate their current storage hardware, software, and services to determine how they can be allocated among the different levels. There are different models to reflect this graduated approach to storage functionality, but most of them follow a pattern:
Platinum—This service level typically includes enterprise arrays with high-availability, performance, and scalability attributes (e.g., EMC's Symmetrix DMX, Hitachi's Lightning), remote replication features for a rapid recovery-time objective (e.g., IBM's PPRC, EMC's SRDF), and other value-added features such as internal disk hot splits (with products like IBM's FlashCopy or Hitachi's ShadowImage). Only the most mission-critical applications, such as ERP and other revenue-dependent applications, receive this costly storage service level.
Gold—This service level encompasses mid-tier or modular storage arrays (e.g., EMC's CLARiiON, HP's EVA, IBM's FAStT, and Hitachi's Thunder) that provide reasonable levels of performance, scalability, and connectivity. In terms of functionality, these arrays are less robust than enterprise arrays. Typical "mission-important" applications that merit Gold service include e-mail and data warehousing.
Silver—This service level can be met by arrays with ATA disk drives, which are less expensive and have higher capacities than the Fibre Channel drives found in most storage area networks (SANs). Typically, vendors offer the same software functionality for ATA-based arrays as they do for Fibre Channel arrays. Therefore, the key difference between the Silver and Gold service levels is that ATA disk is cheaper per megabyte but has lower availability and performance. Typical applications in the Silver storage level include disk-based backup, archiving, file systems with large objects, and other unstructured data.
Bronze—This can be a "whatever's left" category, where the remaining storage options on the data-center floor are grouped together: internal disk, JBOD (just a bunch of disks) arrays, and network-attached storage (NAS) appliances can all fit the bill. Users should have limited expectations for performance, availability, and other value-added storage functionality. Applications that fit this category include application development, testing environments, and file shares.
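One way to make the four tiers concrete is to record them as a small service catalog. The sketch below is only illustrative: the `ServiceLevel` class, the `CATALOG` layout, and the `tier_for` helper are hypothetical names, while the storage classes and typical applications are taken from the tier descriptions above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    storage: str                    # class of storage described for this tier
    applications: Tuple[str, ...]   # typical workloads named for this tier


# Hypothetical catalog layout; tier contents follow the descriptions above.
CATALOG = (
    ServiceLevel("Platinum", "enterprise arrays with remote replication",
                 ("ERP",)),
    ServiceLevel("Gold", "mid-tier or modular arrays",
                 ("e-mail", "data warehousing")),
    ServiceLevel("Silver", "ATA-based arrays",
                 ("disk-based backup", "archiving")),
    ServiceLevel("Bronze", "internal disk, JBOD, or NAS",
                 ("application development", "testing", "file shares")),
)


def tier_for(application: str) -> Optional[str]:
    """Return the tier whose typical applications include this one, if any."""
    for level in CATALOG:
        if application in level.applications:
            return level.name
    return None


print(tier_for("e-mail"))    # Gold
print(tier_for("payroll"))   # None: not listed, so it needs an explicit decision
```

An application that matches no tier returns `None` rather than defaulting to Bronze, on the assumption that placement should be a deliberate business decision.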
So far, we have discussed only how disk arrays support various service levels. In fact, the whole technology stack—networks, servers, databases, and backup policies—must be subjected to the same discipline. This is a daunting task for administrators trying to plan and roll out an SLA-based infrastructure. But the biggest challenge to implementing SLAs is not the complexity of the technology, but the non-technical obstacles that can defeat the most well-intentioned SLA approach.
Storage SLAs can get off-track for a number of reasons, including:
- SLAs with generic measurement criteria;
- SLA accomplishment measured by the same people who provide the service;
- SLAs that measure criteria of no value to the business;
- Inadequate tools to collect the data to compare to the SLA criteria; and
- Lack of internal operating procedures that reflect a service-level management approach.
But these are only the initial challenges. The remaining roadblocks to establishing storage SLAs can include the following:
- Lack of an asset management system (AMS) to handle storage chargebacks;
- Storage software licensing schemes that prevent tiered service-level implementations; and
- Inability of application "owners" to change service tiers and reap cost benefits.
Why an AMS is critical
At first glance, it seems reasonable to group infrastructure capabilities around different tiers and then to develop an internal "contract" that stipulates how IT can deliver services to match each level. When presented with the opportunity to receive more in the way of infrastructure services, but at a greater cost, line-of-business (LOB) managers sometimes request "Platinum" service but without recognition that it comes at a substantial premium. This reflects a relatively common practice of treating IT as a "cost center" only and, therefore, one that only shows up on the red side of an LOB profit and loss statement. Most LOB managers lack a financial vehicle to determine the internal cost of the actual infrastructure they are using. Some shops use the number of desktops as a tool to allocate costs—but that is more appropriate for networking than storage/server platforms.
The connection between infrastructure and P&L can be made via a chargeback function within an overall asset management system (AMS). An AMS can be a robust software system provided by vendors such as Computer Associates and Peregrine, or it can be a simple and blunt approach using spreadsheets and manual entry. The key is that infrastructure costs are allocated based on application usage and type—not number of desktops. Examples of a standardized process for implementing this can be found at www.assetmanagement-toolkit.com.
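The spreadsheet-style approach can be sketched in a few lines. The example below is a minimal illustration, not a description of any vendor's AMS: the tier weights, application capacities, and total cost are all hypothetical figures, and the `chargeback` function is an assumed name. It allocates a total infrastructure cost by capacity used and service tier rather than by desktop count.

```python
# Hypothetical relative cost weights per tier; real weights would come from
# an organization's own infrastructure costs.
TIER_WEIGHT = {"Platinum": 4.0, "Gold": 2.5, "Silver": 1.5, "Bronze": 1.0}


def chargeback(total_cost, applications):
    """Split total_cost across applications by capacity used, weighted by tier.

    applications: list of (application, tier, capacity_gb) tuples.
    Returns a dict mapping application name to its share of total_cost.
    """
    weighted = {name: gb * TIER_WEIGHT[tier] for name, tier, gb in applications}
    total_weight = sum(weighted.values())
    return {name: total_cost * w / total_weight for name, w in weighted.items()}


# Hypothetical applications and capacities, for illustration only.
apps = [
    ("ERP", "Platinum", 2000),
    ("e-mail", "Gold", 4000),
    ("disk backup", "Silver", 8000),
    ("dev/test", "Bronze", 10000),
]
shares = chargeback(100000, apps)
for name, cost in shares.items():
    print(f"{name}: {cost:.2f}")
```

Because the shares are proportional, they always sum to the total cost, so a small Platinum application can carry a charge comparable to that of a much larger Bronze one.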
Service levels vs. pricing
Another impediment to effective implementation of storage SLAs lies in the software licensing schemes offered by the major storage vendors. Currently, customers are typically charged for software based on the amount of capacity in the frame, even if the customer is using the software for only a fraction of the capacity in the array. For instance, a customer who wants to use third-mirror software (e.g., TimeFinder or BusinessCopy) for a 2TB application must pay a license fee based on the total capacity in the frame, which could be many times the size of the 2TB application.
This effectively precludes using a storage array for multiple tiers of service, and it makes implementing SLAs harder and less flexible in a shop that has only one or two arrays. Before purchasing an array, customers must negotiate an exception to this pricing scheme, or they will be unable to support multiple service levels on a single storage array.
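The arithmetic behind this pricing problem is simple. In the sketch below, the 2TB application comes from the example above, while the 20TB frame size and the per-terabyte fee are hypothetical numbers chosen only to show the effect; actual vendor pricing will differ.

```python
def license_fee(licensed_tb, fee_per_tb):
    """License cost when the fee is charged per terabyte licensed."""
    return licensed_tb * fee_per_tb


APP_TB = 2          # application size from the example in the text
FRAME_TB = 20       # hypothetical total capacity in the frame
FEE_PER_TB = 1000   # hypothetical license fee per terabyte

frame_based = license_fee(FRAME_TB, FEE_PER_TB)   # what the customer pays
usage_based = license_fee(APP_TB, FEE_PER_TB)     # what the usage would justify
premium = frame_based / usage_based

print(f"Frame-based fee: {frame_based}")
print(f"Usage-based fee: {usage_based}")
print(f"Premium factor:  {premium:.1f}x")
```

Under these assumed figures the customer pays ten times what a usage-based license would cost, which is the gap a negotiated pricing exception would need to close.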
Changing service levels
One of the key principles in tiered storage and SLAs is that LOB managers can decide to "promote" or "demote" their applications to new service levels and receive a different chargeback amount based on that change in service level. This common-sense approach, however, runs into implementation obstacles based on the lack of liquidity in infrastructure costs, as well as the difficulty in migrating applications from one storage tier to another. The problem is that infrastructure costs are largely fixed and amortized (in the form of a lease or depreciation) over at least three years.
If an LOB manager decides that an application no longer merits Platinum service, the costs are still sunk and must be allocated elsewhere. Even if it were possible to re-allocate these costs to another LOB manager, the process for moving an entire application from one storage array to another is not trivial—especially if advanced functionality has been scripted into the application. Many parameters, functions, and routines need to be changed. Invariably, there will be some disruption of service and possibly an outage. This work can be accomplished over time, but it is frequently difficult and there is no single tool to govern the process.
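A rough calculation shows why a demotion does not free up money. The sketch below assumes straight-line amortization over a three-year term; the purchase cost, the timing of the demotion, and the `unamortized_cost` function name are all hypothetical.

```python
def unamortized_cost(purchase_cost, term_months, months_elapsed):
    """Cost still to be recovered under straight-line amortization."""
    if not 0 <= months_elapsed <= term_months:
        raise ValueError("months_elapsed must be between 0 and term_months")
    return purchase_cost * (term_months - months_elapsed) / term_months


# Hypothetical figures: an array amortized over 36 months, with the
# application demoted from Platinum after the first year.
remaining = unamortized_cost(purchase_cost=360000, term_months=36, months_elapsed=12)
print(f"Cost still to be allocated after demotion: {remaining:.2f}")
```

With these assumed numbers, two-thirds of the original cost remains on the books and must be charged to someone, even though the application that justified the purchase has moved to a cheaper tier.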
We have identified some of the difficulties in implementing storage SLAs and a tiered-storage approach to infrastructure. We have raised these issues not to discourage the trend toward SLAs, but to ensure that IT managers are realistic about the challenges they face. By understanding these obstacles, IT staff can take steps to overcome them and establish conditions for successful storage SLAs.
Mike Drapeau is president of The Drapeau Group (www.drapeaugroup.com), an Atlanta-based consulting firm specializing in strategic development, platform architecture review, and issues such as regulatory compliance assessment.