User survey: Data protection, CDP, snapshots

Adoption of continuous data protection is rising rapidly, and CDP is not used just for data-protection applications.

By Farid Neema

Data protection has been the weakest link in the high-availability chain that defines true business continuity. A major difficulty lies in protecting against, and recovering easily from, viruses, accidental file deletion, and software corruption: failures that may not be detected immediately. The challenge is determining the last instant before data loss and then restoring to that point.

For the fourth consecutive year, a majority of respondents in a survey conducted by Peripheral Concepts rank data protection highest among their storage management challenges, and data recovery stands out as the major data-protection-related problem. In addition, recovery time objective (RTO) and recovery point objective (RPO) are becoming increasingly important to end users at companies of all sizes.

Snapshot techniques and continuous data protection (CDP) can resolve many of these problems. Snapshots are point-in-time (PIT) images of an active live volume, which can be created nearly instantaneously. Snapshot volumes appear as regular volumes to the host and can be written to while still preserving the original copy.
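One common way to get a near-instantaneous point-in-time image is copy-on-write, sketched below in Python. The article does not say which mechanism any surveyed product uses, so treat this as an illustrative assumption: the `Volume` class and its methods are hypothetical, and for brevity the snapshot here is read-only rather than writable.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data
        self.snapshots = []          # one dict of preserved blocks per snapshot

    def snapshot(self):
        # No data is copied at creation time; the snapshot starts as an
        # empty map, which is why creating it is nearly instantaneous.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting a block, preserve its old contents in every
        # snapshot that has not already saved that block.
        for snap in self.snapshots:
            snap.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap, index):
        # Preserved blocks come from the snapshot; blocks that were never
        # overwritten are still read from the live volume.
        return snap.get(index, self.blocks[index])


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
snapshot_view = [vol.read_snapshot(snap, i) for i in range(3)]
```

After the write, the live volume holds the new block while `snapshot_view` still reflects the volume as it was when the snapshot was taken. Only the one overwritten block consumes extra space.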

What differentiates CDP from traditional backup and snapshot recovery techniques is its ability to provide near-instantaneous restore to virtually any point in time (APIT) without having to physically move or copy data. Like PIT snapshots, CDP requires significantly less disk space and less server overhead than full copies. Both techniques also allow a rollback to valid data in the event of a failure.

The difference between PIT and APIT is that a PIT recovery only gets close to the desired recovery point. How close depends on how frequently snapshots are taken. This may be fine for many applications, but for critical applications, getting close is not good enough: It may be difficult, if not impossible, to know what transactions were missed or what file changes have not been captured in the recovery.
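The gap left by periodic snapshots can be made concrete with a small calculation. The Python sketch below uses a hypothetical hourly schedule and a made-up corruption time; the function name `nearest_snapshot` and all numbers are assumptions for illustration, not survey data.

```python
from bisect import bisect_right


def nearest_snapshot(snapshot_times, target):
    """Return the latest snapshot time at or before target, or None.

    snapshot_times must be sorted in ascending order.
    """
    i = bisect_right(snapshot_times, target)
    return snapshot_times[i - 1] if i else None


# Hypothetical schedule: hourly snapshots, in minutes since midnight.
snapshots = [0, 60, 120, 180]

corruption_at = 157                  # corruption occurs at minute 157
last_good = corruption_at - 1        # desired recovery point
restore_point = nearest_snapshot(snapshots, last_good)

# Changes made between the restore point and the last good instant
# are not captured by a PIT recovery.
unrecovered_minutes = last_good - restore_point
```

With these numbers, a PIT restore falls back to the snapshot at minute 120 and leaves 36 minutes of changes unrecovered. An APIT restore would target minute 156 directly.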

The Storage Networking Industry Association (SNIA) defines CDP as “a methodology that continuously captures or tracks data modifications and stores changes independent of the primary data, enabling recovery from any point in the past. CDP systems may be block-, file-, or application-based and can provide fine granularities of restorable objects to infinitely variable recovery points.”

Block-based CDP approaches work at the logical volume level. File-based CDP solutions work at the file-system level, keeping track of any changes to the file system, such as creation, deletion, or modification. Many IT managers in our survey expressed a preference for file-based CDP.
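The file-level tracking described above can be pictured as a change journal kept apart from the primary data and replayed up to any chosen instant. The following Python sketch is a minimal illustration under that assumption. `ChangeJournal` and its methods are hypothetical names, not the interface of any CDP product, and a deletion is represented here by a `None` value.

```python
class ChangeJournal:
    """Toy file-level change journal, stored apart from the primary data."""

    def __init__(self):
        # (timestamp, path, contents) tuples; timestamps are assumed to
        # arrive in non-decreasing order.
        self.entries = []

    def record(self, timestamp, path, contents):
        # contents=None marks a deletion; anything else is a create/modify.
        self.entries.append((timestamp, path, contents))

    def state_at(self, timestamp):
        # Replay every change up to and including the requested instant.
        state = {}
        for ts, path, contents in self.entries:
            if ts > timestamp:
                break
            if contents is None:
                state.pop(path, None)
            else:
                state[path] = contents
        return state


journal = ChangeJournal()
journal.record(1, "report.doc", "v1")   # creation
journal.record(5, "report.doc", "v2")   # modification
journal.record(9, "report.doc", None)   # accidental deletion

before_edit = journal.state_at(4)
before_delete = journal.state_at(8)
after_delete = journal.state_at(9)
```

Because every change is journaled, the recovery point can be any timestamp, not just the moments when a snapshot happened to be taken. Here the file can be recovered as it stood at time 8, just before the deletion.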

Snapshot techniques and CDP are gaining popularity. One-third of the “random” survey population (which consisted of more than 4,000 respondents) still do not really understand what CDP is (see figure on p. 30). Among those who do, however, the percentage of sites that have implemented some form of CDP has grown from 34% last year to 45% in this year’s survey (see figure on p. 30). About 42% of the surveyed sites have no plans to implement CDP.

At sites that have implemented the technology, CDP is used for 30% of the company’s data, on average. Only a small percentage of the survey population uses CDP for more than 80% of their data.

Is the “zero data loss” associated with “true” CDP essential to all users? Not really. Many users still prefer snapshots over CDP, but more users rank CDP “very” or “extremely” important (see figure on p. 30).

Users cite demonstrated recovery capability and ease of use as the most important criteria in selecting a CDP product, and liability and compliance issues are the major incentives behind the purchase of CDP. This is true across all site capacity tiers.

Users also say it’s essential that CDP integrate with existing software and hardware, and remote replication is at the top of the list of services that users want to see combined with CDP.

End users’ data-protection priorities have remained fairly constant throughout the four years of our surveys, with one notable exception this year: security, which was ranked much higher than in previous surveys (see figure on p. 30).

The backup window has been a nightmare for administrators for some time, and the shift to continuous availability has made backup nearly impossible for some companies. Despite the popularity of snapshots and CDP, the backup window is not a thing of the past. In fact, the median backup window of four to six hours does not differ significantly from last year’s findings (see figure on p. 34), and more than half of the survey population says this issue needs improvement.

CDP can significantly simplify the data-protection process. It removes the system overhead of batch-based backups, economizes on disk storage by only storing changes rather than a series of full or incremental backups, eliminates backup windows, and allows more-granular recovery points that take only seconds to minutes for individual files as well as entire systems.

CDP also eliminates the risk of doing major restores, especially as a result of data corruption, and enables effective and easy disaster-recovery testing. The CDP engine can create an image that can be used as the source for existing backup processes, and the backup process can be started without requiring that the application be brought down.

Enterprise resource planning (ERP) is the most common CDP application, followed by e-mail and Web services (see figure, below).

Interestingly, CDP usage extends beyond data protection. Most CDP implementations are also used to create copies of production data for reporting, audits, and other compliance-related requirements (see figure, right).

The survey reveals that users want CDP to be easy to install, integrate with existing systems software, and not affect overall performance. It must also scale so it can back up the entire population of vulnerable desktops and laptops in an enterprise. Users also say CDP must offer integration with standard backup and archiving tools, and it should provide continuous protection from disk to tape, eliminating any protection gap.

CDP allows administrators to deploy multiple policies for local and remote data protection. The system centrally manages defined policies that are tied to business objectives and service-level agreements.

With users today more likely to be connected to a network through high-bandwidth wireless connections, laptops can be protected continuously, and files that are corrupted or accidentally deleted can be restored to any point in time. Users no longer have to back up data through scheduled backup sessions.

CDP can be an essential tool for business continuity, minimizing revenue and/or productivity loss. With tight budgets, total cost of ownership and return on investment are important. The extra costs associated with CDP have to be weighed against benefits from backup/restore software license savings, savings from tape storage consolidation, and reduced administrative burdens. Enterprises also have to account for the risk and cost of not being able to reach mission-critical information in a timely manner. Companies dedicate on average 20% to 30% of their IT budget to data protection, and a majority of the survey respondents estimate that CDP spending pays for itself in less than two years.

Adopting CDP software does not mean that IT organizations have to replace their total data-protection infrastructure. Where traditional tape and replication technologies currently meet application availability requirements, business continues as usual. CDP is best-suited for protecting mission-critical applications where restore time must be measured in seconds or minutes.

CDP is also well-suited to data that is updated frequently. Less-critical or infrequently updated applications continue to be protected by other backup/restore techniques. Where there is a need to balance an improved level of availability with familiar practices and procedures, a virtual tape library (VTL) may complement existing data-protection procedures.

Farid Neema is president of Peripheral Concepts (www.periconcepts.com).

About the survey

The primary objective of the Peripheral Concepts survey was to determine the acceptance of continuous data protection (CDP) and snapshots, and assess IT managers’ selection criteria and requirements. The survey focused on IT managers having storage management or data-protection responsibilities for IT operations that store 1TB to 1PB of raw disk storage.

More than 4,000 qualified managers answered the initial screening survey of fewer than 10 questions. We then selected 117 respondents who filled out a more-detailed survey consisting of 60 questions.

The survey report provides statistics on budgets, ranks issues and needs, examines storage trends and acquisition plans, and analyzes responses by disk capacity range and by industry.

This article was originally published on April 01, 2007