A service level agreement (SLA) is the key to a successful storage service provider (SSP) relationship.
BY ROBIN CERVAR
"Prove it." That's the mandate storage service providers (SSPs) face when approaching an IT organization with a proposal to provide managed storage services. After all, continuous data availability and reliable data transfer are critical to any organization. Data inaccessibility affects online customers, investors, internal employees, and the IT team. Consider how many dollars are lost, or never earned, when data is not available.
The hackneyed phrase, "Data is an organization's most critical asset," is both true and irritating. Because data is critically important, some IT organizations are overcoming their reluctance to turn to an outside party to manage data storage. Since storage management is complex and time-consuming, and skilled IT professionals are a scarce resource, an SSP may be the solution.
Patrick Laughran, chief technology officer with TechTarget.com Inc., a network of 23 industry-specific IT Websites, says, "Complex RAID management and backup require a seriously geeky Unix administrator. We found labor to be a real issue and started looking at SSPs." TechTarget.com uses an SSP for multiple services, including storage area network (SAN)-based primary data storage and online ("hot") backup. "We use SAN technology to address different needs like database applications, Website, servers, and internal operations. It makes a huge difference not having to worry about the labor involved with managing storage, adding capacity, and backup administration," Laughran adds.
Deploying a robust storage infrastructure, running regular backups, designing a disaster-recovery plan, replicating data, ensuring high availability, optimizing network performance, and accommodating growth in capacity requirements all take experience, time, and people. Analysts predict IT organizations will increasingly turn to storage outsourcers once the SSP model is fully established and proven. Cambridge, MA-based Forrester Research expects that by 2004, the world's 100 largest companies will require data storage capacity in excess of 150TB. Managing an infrastructure of that capacity is a huge undertaking and could consume the entire effort of an organization's IT team. Consequently, more organizations will capitalize on SSPs and reserve IT staff for core business projects.
All of which leads to some questions: What will determine the success of one SSP over another? And how can IT organizations differentiate SSPs? In large part, the answers lie in the service level agreement (SLA) and monitoring software.
Once reserved for determining service metrics in the application service provider (ASP) market, SLAs are becoming increasingly critical in the emerging SSP market. The SLA enables an IT organization to measure the performance of an SSP and is the benchmark against which the SSP will pass or fail the "prove-it" challenge.
An SLA defines the ongoing level of service customers should expect. The process of building the SLA gives both the SSP and the customer the opportunity to determine whether those expectations are realistic and satisfactory. Once the metrics are analyzed and agreed on by both sides, the SLA seals the relationship.
In addition to primary data storage, some SSPs offer a variety of services such as backup and restore, data replication, and rich content media distribution. The SLA should account for the characteristics of each particular service but should also consist of a standard set of metrics.
Above all, the SLA should instill absolute confidence that the SSP will deliver the agreed-upon services and metrics, and that it will react immediately and provide remedies should services degrade unexpectedly.
The SLA should define the level of service specific to each managed storage service. Backup-and-restore services, for example, might include the schedule and frequency of backups, the timeframe and processes for initiating and completing restores, and the tape media retention policy. Clearly defined service levels render the SSP measurable and accountable.
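One way to keep such service levels measurable is to record them as structured data rather than prose alone. The sketch below is purely illustrative: the class name, the field names, and the sample values (nightly backups, a four-hour restore window, 90-day tape retention) are assumptions, not terms from any actual SSP contract.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupRestoreServiceLevel:
    """Hypothetical record of the backup-and-restore terms in an SLA."""
    backup_frequency: timedelta         # how often backups run
    restore_start_within: timedelta     # longest wait before a restore begins
    restore_complete_within: timedelta  # longest time allowed to finish a restore
    tape_retention: timedelta           # how long tape media is kept

    def restore_met(self, started_after: timedelta, finished_after: timedelta) -> bool:
        """True if a restore began and finished inside the agreed windows.

        Both arguments are measured from the moment the customer requested the restore.
        """
        return (started_after <= self.restore_start_within
                and finished_after <= self.restore_complete_within)


# Sample values are assumptions chosen only for illustration.
example_terms = BackupRestoreServiceLevel(
    backup_frequency=timedelta(days=1),
    restore_start_within=timedelta(minutes=30),
    restore_complete_within=timedelta(hours=4),
    tape_retention=timedelta(days=90),
)
```

With terms written down this way, each restore can be checked against the agreement, for example `example_terms.restore_met(timedelta(minutes=10), timedelta(hours=3))`.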
Negotiating the SLA
"We wanted our SLA to be bound to availability and redundancy," says Laughran. "We also wanted to be certain the remedies were severe enough that the SSP would feel the pain if our services weren't delivered as agreed."
Give-and-take will occur on both sides of the negotiating table. The SSP must be up-front about the kinds of services it can guarantee with confidence. The customer, in turn, must be prepared to accept that 100% uptime cannot realistically be promised, because of any number of unusual events that neither the SSP, the data center, nor the customer can influence. If the customer requests a deviation from the standard SLA, the SSP should have an established process to ascertain whether the request can be delivered and guaranteed, regardless of the time needed to find the answer.
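It can help both parties to translate an availability percentage into the downtime it actually permits. The short calculation below is a sketch; the function name and the 30-day period are assumptions, and the targets shown are common examples rather than figures from any particular SLA.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime permitted in a period at a given availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


# Example targets are illustrative only.
for target in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% over 30 days allows {minutes:.1f} minutes of downtime")
```

At 99.9%, for instance, a 30-day month allows roughly 43 minutes of downtime, which gives the negotiation a concrete number to discuss.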
Be wary of an SSP that comes to the table with an unhesitating "yes" to every request. Realistic expectations have to be set, and an SSP should not make promises without first consulting the engineers and architects who will design and deploy the storage infrastructure. Those are the people who know what will or will not ultimately work.
Once the SLA is drafted, both the SSP and the customer should put it through rigorous review: the customer to verify that the services and processes are ideal, and the SSP to verify that the guarantees are deliverable and realistic. The process might seem detailed, but it's much less daunting than researching, purchasing, and piecing together a storage infrastructure and dealing with multiple vendors.
Monitoring the SLA
Assuming the services have been decided and the SLA is in place, how can users monitor the services? How can users verify the SSP continuously delivers everything specified in the SLA? In addition to regular reporting, the answer is through storage services management (SSM) software.
Three factors will help differentiate an SSP from the others: unique software, a proven track record, and the SLA. SSM software enables SSP customers to monitor their storage environment with an end-to-end, unified view of the systems, network, storage devices, and topology. But the software should also provide users with a portal to monitor and manage the SSP itself and to verify that the level of service guaranteed in the SLA is being delivered.
To be both storage and services management software, the SSM utility should include functionality that enables users to:
- Interactively request service changes and upgrades;
- Review current and historical service information;
- Communicate provision requests;
- Receive backup-and-restore reporting and notification;
- Receive service status audit logs;
- Modify backup schedules; and
- View and modify the SLA.
SSM software can ease an organization's concern about handing over critical data to an SSP for management because it keeps the user connected and in control. The words contained in an SLA are guaranteed, but software that allows users to manage and monitor those words can go a long way toward gaining the confidence of the IT organization.
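To make the idea of verifying delivered service concrete, the sketch below computes measured availability from a log of outages and compares it with an SLA target. It is not drawn from any SSM product: the function name, the outage log format, the dates, and the 99.9% target are all assumptions.

```python
from datetime import datetime


def measured_availability(outages, period_start, period_end):
    """Percentage of the period during which the service was available.

    `outages` is a list of (start, end) datetime pairs. They are assumed
    not to overlap and are clipped to the reporting period.
    """
    period = (period_end - period_start).total_seconds()
    if period <= 0:
        raise ValueError("period_end must be after period_start")
    down = 0.0
    for start, end in outages:
        clipped_start = max(start, period_start)
        clipped_end = min(end, period_end)
        if clipped_end > clipped_start:
            down += (clipped_end - clipped_start).total_seconds()
    return 100.0 * (period - down) / period


# Hypothetical 30-day period with one 90-minute outage and an assumed 99.9% target.
period_start = datetime(2001, 3, 1)
period_end = datetime(2001, 3, 31)
outages = [(datetime(2001, 3, 12, 2, 0), datetime(2001, 3, 12, 3, 30))]
availability = measured_availability(outages, period_start, period_end)
target = 99.9
print(f"measured {availability:.3f}%, target {target}%, met: {availability >= target}")
```

In this example the single 90-minute outage puts the period at roughly 99.79%, below the assumed target, which is the kind of result a customer would want surfaced in regular reporting.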
Exponential data growth, scarcity of IT professionals, and growing complexity of networked storage technology are making the case for SSPs. SLAs and the utilities to monitor and manage them will be key elements in a successful strategy. The SLA is the key to holding the SSP accountable for the services it provides, and SSM software is the way to measure that accountability.
For more information on storage service providers, see the Special Report in this issue (p. 20).
Robin Cervar is a technical writer for StorageNetworks Inc. (www.storagenetworks.com) in Waltham, MA.
IT checklist: Key SLA questions
- Does the SLA detail the customer's computing environment? The SLA should take into consideration the complex mosaic of today's distributed computing environments.
- Does the SLA provide an acceptable level of change management? The SLA should specify the timeframe and procedures necessary for the SSP or the customer to modify the existing service. If the SSP needs to make a change to the environment, there must be a defined process to coordinate timing and procedures with the customer. Typical changes might include the addition of servers, new operating systems, and new applications.
- Will the change management procedures outlined in the SLA work with the present environment? Modifications to the service could impact availability, server allocation, and application performance. The customer and the SSP should consider change management procedures with respect to the current environment.
- Does the SLA include procedures for changing the SLA? There should be exact procedures and timelines for the customer or the SSP to initiate changes to the SLA.
- Does the SLA provide remedies? Exactly how the SSP will make up for any degradation in service (e.g., uptime and capacity) must be spelled out in the SLA. Degradation in service must also be clearly defined to eliminate ambiguity.
- Is there a process for event notification? This section clarifies the timeframe and method for the SSP to contact the customer during an event that could lead to an unscheduled outage. It should also specify guidelines if the customer becomes aware of a similar event. Status reports concerning the event should continue until resolution.
- Are escalation procedures and customer support clearly defined? The SLA should denote the roles of various departments and personnel in delivering and managing the services. This section should also include the procedures for escalation in the event of a problem.
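As an illustration of the remedies question in the checklist above, a remedy can be spelled out as an unambiguous schedule of service credits. The sketch below assumes a tiered schedule tied to monthly availability; the tiers, percentages, and names are hypothetical and would be set during negotiation.

```python
# Hypothetical credit schedule: (minimum availability %, credit as % of monthly fee).
# Tiers are checked from best to worst; the first tier the month reaches applies.
CREDIT_TIERS = [
    (99.9, 0),   # target met: no credit
    (99.0, 10),
    (95.0, 25),
]
WORST_CASE_CREDIT = 50  # percent of the fee, applied below the lowest tier


def service_credit(availability_percent: float, monthly_fee: float) -> float:
    """Credit owed to the customer for one month under the assumed schedule."""
    for floor, credit_percent in CREDIT_TIERS:
        if availability_percent >= floor:
            return monthly_fee * credit_percent / 100
    return monthly_fee * WORST_CASE_CREDIT / 100
```

Under these assumed tiers, a month at 99.5% availability on a 10,000 monthly fee would yield `service_credit(99.5, 10000)`, a credit of 1,000.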