The emphasis is shifting from backup to recovery, and technologies such as CDP can help users meet RTO/RPO goals.
By Heidi Biggar
It may have taken years, but the data-protection market is finally in a state of transition. Although end users are still looking for ways to improve backup performance (the backup window remains the number-one storage-related issue among IT organizations today), they are focusing increasingly on recovery.
Faced with growing data volumes, a seemingly endless list of regulatory compliance and corporate governance mandates, and stringent service-level agreements (SLAs), users are looking for better, faster, and more granular ways to recover data. The failure to do so can have potentially costly implications for organizations in the event of a disaster, or some other type of outage, or in e-discovery situations.
The Enterprise Strategy Group (ESG) believes pent-up demand for recovery-focused products, such as continuous data protection (CDP), replication, and snapshots, will make 2006 a pivotal year for the data-protection industry, as vendors jockey to position recovery technologies into cohesive data-protection strategies and look for ways to integrate these products into traditional IT infrastructures (e.g., backup or replication platforms).
In particular, users will see an influx of CDP and "near-CDP" products, as well as a wave of new data classification products. CDP helps users rapidly re-create and restore data (theoretically to any point in time), while data classification makes both backup and restore more efficient by enabling users to characterize data according to type, age, structure, etc., and then assign backup-and-recovery resources accordingly.
ESG also expects topics like security, single-instancing (or de-duplication), direct restore, and compression to resonate among end users this year.
What's driving the shift?
The focus on recovery is the result of several converging forces: 1) an overall change in mindset among end users about data protection, 2) a wider availability of recovery-focused options, 3) market validation of recovery trends by big-name vendors such as Microsoft, 4) the maturity of the disk-based backup market, and 5) increasing data volumes.
Change in attitude: End-user interest in recovery-focused technologies directly correlates to increasing levels of federal and corporate scrutiny. As a proof point, more than half of enterprise-class organizations in a recent ESG survey say they have been involved in a legal proceeding or regulatory inquiry that necessitated the search for and retrieval of electronic records. (For more information, see the Digital Archiving: End-User Survey & Market Forecast: 2006-2010 report at www.enterprisestrategygroup.com.)
As regulatory compliance and corporate governance demands increase, so too do organizations' data-recovery requirements. Lengthy restoration processes can have costly ramifications. In fact, in another ESG user survey, nearly a third of the organizations said they would experience significant revenue loss or other business impact within one hour of application downtime.
Product availability: More products mean more options for users, allowing them to choose products that meet their specific requirements and price points. A wider choice also allows users to create tiers of data protection. Similar to information lifecycle management (ILM), data-protection lifecycle management (DPLM) matches backup technologies to the value of the data during its lifecycle.
Market validation: A year ago, the only vendors making noise in the CDP space were a handful of start-ups (e.g., Mendocino, Revivio, and TimeSpring). Now, virtually all of the leading vendors have CDP or "near-CDP" products.
Disk-based backup maturity: Disk-based backup (e.g., nearline disk, disk appliances, and virtual tape libraries) has reached a level of maturity where its role in the data center is no longer questioned. Disk-based backup vendors will continue to enhance products with an emphasis on data reduction (or de-duplication), compression, and cost reduction. Serial ATA (SATA) drives made disk-to-disk backup possible (providing a low-cost alternative to tape-based backup), and SATA-based tape may drive the overall cost of entry for a complete disk/tape backup/recovery solution even lower.
Increasing data volumes: Data volumes continue to rise, making it almost impossible—especially for large IT shops—to perform backups in a reasonable timeframe. Incomplete backups spell potential disaster from both a legal and business standpoint.
CDP up close
Clearly, CDP means different things to different vendors. To some vendors, it means the ability to make frequent snapshots; to others, it is the ability to have an up-to-the-minute copy, or snapshot, of data at very granular levels for restore purposes.
ESG defines CDP as "a software- or appliance-based solution designed to capture every write to primary storage and make a time-stamped mirror on a secondary device. The objective is to be able to rapidly re-create and restore data as it existed at any previous point in time."
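To make that definition concrete, here is a minimal sketch of the idea: every write is journaled with a timestamp, and a restore replays the journal up to the requested moment. The WriteJournal class, its method names, and the block-number model are illustrative assumptions, not any vendor's design, and a real product would persist the journal on a secondary device rather than in memory.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class WriteJournal:
    """Toy CDP journal: every write is recorded with its timestamp."""
    # Parallel lists kept in timestamp order.
    times: list = field(default_factory=list)
    writes: list = field(default_factory=list)

    def record(self, timestamp: float, block: int, data: bytes) -> None:
        # Assumption: writes arrive in non-decreasing timestamp order.
        if self.times and timestamp < self.times[-1]:
            raise ValueError("writes must be recorded in time order")
        self.times.append(timestamp)
        self.writes.append((block, data))

    def restore(self, as_of: float) -> dict:
        """Rebuild the block map as it existed at time `as_of`."""
        cutoff = bisect_right(self.times, as_of)  # writes with time <= as_of
        image = {}
        for block, data in self.writes[:cutoff]:
            image[block] = data  # later writes overwrite earlier ones
        return image


journal = WriteJournal()
journal.record(100.0, 0, b"v1")
journal.record(105.0, 1, b"a")
journal.record(110.0, 0, b"v2")

# State as of time 107.0, before block 0 was overwritten at 110.0.
print(journal.restore(107.0))
```

Because nothing is discarded between writes, any timestamp is a valid restore target, which is the property that separates this approach from scheduled copies.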
Although ESG believes CDP in its purest form is defined by its ability to restore or re-create data, at very granular levels, to virtually any point in time, we also believe "near-CDP," or "snapshot repository," products will play a key role in tiered data-protection environments. "Near-CDP" products provide greater recovery granularity than traditional backup methods (they allow users to take frequent snapshots of data), but they do not provide recovery to any point in time like "true" CDP products do.
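The granularity limit of a snapshot-based approach can be shown with a short sketch. It assumes a hypothetical hourly snapshot schedule, with times expressed in minutes; the function name and the figures are illustrative only.

```python
from bisect import bisect_right


def nearest_snapshot(snapshot_times, wanted):
    """Return the latest snapshot time at or before `wanted`, or None.

    `snapshot_times` must be sorted in ascending order.
    """
    i = bisect_right(snapshot_times, wanted)
    return snapshot_times[i - 1] if i else None


# Hypothetical schedule: one snapshot every 60 minutes.
snapshots = [0, 60, 120, 180]
wanted = 170  # the moment just before a corruption event

restored_to = nearest_snapshot(snapshots, wanted)
lost_minutes = wanted - restored_to
print(restored_to, lost_minutes)  # 120 50
```

The restore lands on the 120-minute snapshot, so the 50 minutes of changes made after it are not recoverable from snapshots alone. Shortening the snapshot interval narrows that gap but never closes it entirely.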
Both CDP and "near-CDP" provide significantly better data protection (in particular, restore) than traditional backup-and-recovery methods.
For example, while a traditional backup-and-recovery implementation may have an average recovery point objective (RPO) of 12 hours or more and a recovery time objective (RTO) of 4 to 24 hours, depending on the size of the recovery, CDP and "near-CDP" recovery methods have the potential to reduce RPOs to minutes or hours and RTOs to less than an hour. RPO is the amount of data loss a company can tolerate, expressed as the age of the most recent recoverable copy (e.g., a 12-hour-old copy of data versus a 2-minute-old copy), and RTO is the time a company can tolerate to get back to that recovery point (i.e., how fast it needs to find lost data and restore applications).
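As a rough way to reason about these two objectives, the sketch below checks a protection method against RPO and RTO targets. It assumes worst-case data loss equals one full interval between recoverable copies. The Method class, the targets, and the near-CDP figures are hypothetical; the nightly-backup figures are taken loosely from the ranges quoted above.

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    copy_interval_min: float  # time between recoverable copies
    restore_time_min: float   # estimated time to complete a restore


def meets_goals(m: Method, rpo_min: float, rto_min: float) -> bool:
    # Worst-case data loss is one full interval between copies.
    return m.copy_interval_min <= rpo_min and m.restore_time_min <= rto_min


# Illustrative figures only.
nightly = Method("nightly backup", 12 * 60, 4 * 60)
near_cdp = Method("near-CDP snapshots", 60, 30)

# Hypothetical targets: lose at most 2 hours of data, recover within 1 hour.
for m in (nightly, near_cdp):
    print(m.name, meets_goals(m, rpo_min=120, rto_min=60))
```

With these assumed numbers, the nightly backup misses both targets while the hourly snapshot method meets them.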
CDP and "near-CDP" provide users with another tier, or level, of data protection. As an example, a user could keep 24 hours of real-time data on tier-1 recovery disk that is protected by CDP; 2 to 10 days' worth of data on tier-2 recovery disk that is protected by "near-CDP" technology; 10 to 60 days' worth of data on tier-3 recovery disk; and 61+ days' worth of data on recovery tape (see table below). The assumption here is that a significant number of recoveries are performed within 24 hours of data creation. ESG research finds that more than one-third of recovered data is less than 24 hours old and more than half is less than two days old.
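The tiering in this example amounts to a lookup on the age of the data, sketched below. The function name and tier labels are illustrative, and the handling of the boundaries (exactly 1, 10, or 60 days) is an assumption, since the example does not say which tier owns the edges.

```python
def recovery_tier(age_days: float) -> str:
    """Map the age of data to the recovery tier from the example above."""
    if age_days < 0:
        raise ValueError("age cannot be negative")
    if age_days <= 1:
        return "tier-1 disk (CDP)"
    if age_days <= 10:
        return "tier-2 disk (near-CDP)"
    if age_days <= 60:
        return "tier-3 disk"
    return "tape"


for age in (0.5, 3, 45, 90):
    print(age, "->", recovery_tier(age))
```

In practice the thresholds would be set per application, based on how old the data is when it is typically recovered.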
ESG does not expect end users to "rip and replace" existing data-protection infrastructure but, rather, to implement CDP/near-CDP in phases on an application basis to supplement existing backup-and-recovery solutions to better meet data-protection SLAs.
CDP and "near-CDP" can be executed in a variety of ways: as a discrete device (e.g., EMC, Revivio, and StorageTek), as a component/feature of backup (e.g., Asempra, Atempo, Avamar, CommVault, LiveVault, and Veritas), in replication software (e.g., EMC, InMage, Kashya, and XOsoft), or as stand-alone software (e.g., Asempra, FalconStor, FilesX, IBM, Mendocino, Storactive [now part of Atempo], and TimeSpring).
The bottom line
For years, we've known that "a backup is only as good as the recovery," yet the industry has done little to improve the recovery piece of the data-protection equation. The thinking (erroneous as it was) was that if data were backed up regularly—say, nightly—there would be a backup copy from which to recover if it were necessary. Of course, it might take several days to locate the copy of the file you needed, but it would be doable.
Today's litigious climate, regulatory compliance, and corporate governance place new demands on users in terms of the sheer volume of data that must be protected and ultimately made recoverable. In particular, users face much more stringent RTOs/RPOs than ever before.
New recovery techniques, such as CDP and "near-CDP," and granular search/indexing capabilities, along with existing tape- and disk-based backup technologies, will go a long way toward ensuring data is recoverable when it needs to be, and at a price point that makes sense.
ESG refers to this type of data-protection process, one that aligns backup-and-recovery technologies (e.g., tape, disk, and CDP) with data recovery requirements, as data-protection lifecycle management, or DPLM.
Heidi Biggar is an analyst with the Enterprise Strategy Group research and consulting firm (www.enterprisestrategygroup.com).