In our annual reader surveys on backup and recovery over the past few years, end users have been surprisingly consistent in identifying their “pain points.” What has changed over the last couple of years is that users seem to be making significant headway in eliminating, or at least diminishing, those pain points thanks to new technologies, most notably disk-based backup/recovery and, to a lesser degree, related technologies such as continuous data protection (CDP).
In addition to the perennial complaints (high costs and an inability to back up within allotted windows), end users identified a number of problems with existing backup/recovery operations. For example, almost 44% of the respondents in our 2005 backup/recovery survey admitted that their backup operations were inefficient, and 38.6% admitted to an inability to validate the backup process (see Figure 1a).
The inability to accurately monitor backup operations is why backup monitoring/reporting software tools are popping up on the radar screens of many storage administrators.
The Taneja Group consulting firm refers to this relatively new product category as data protection management (DPM). Representative vendors of DPM tools include Aptare, Bocada, Crosswalk, and WysDM (formerly SysDM). As evidence of how important DPM might be in 2006, EMC inked a reseller deal with WysDM this year.
DPM tools have evolved well beyond the basic backup monitoring and reporting functions that characterized the products a year or two ago. Today, these tools include (or will include in 2006) functions such as data collection, reporting (including backup success/failure reporting), trending and analytics, service levels and policy validation, process optimization, application recovery management, and data-protection workflow automation.
For detailed information on DPM software and how these tools can help you to accurately monitor and report on backup jobs, see the other Special Report in this issue, “Why you need data-protection management tools,” by the Taneja Group’s Brad O’Neill (p. 30).
Another key pain point identified by our readers is the inability to meet recovery time objectives (RTOs) and recovery point objectives (RPOs). End users are attacking the RTO issue with high-speed disk-based backup/recovery, and they’re beginning to address the RPO issue with CDP and “near-CDP” technologies (both of which are covered later in this article), which enable recovery to virtually any point in time.
About 64% of our readers have RTO and RPO metrics in place for their most critical applications, while 22% do not have RTO/RPO metrics in place. (The rest of the respondents are either not familiar with RTO/RPO or measure recovery time but not recovery points.)
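To make the two metrics concrete, here is a minimal sketch of how a site might check a single incident against its targets. It is an illustration only: the `RecoveryTargets` class, the function names, and the four-hour and one-hour targets are hypothetical and do not come from the survey.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryTargets:
    """Hypothetical targets for one application (not survey data)."""
    rto: timedelta  # longest acceptable time to restore service
    rpo: timedelta  # largest acceptable window of lost data


def meets_rpo(last_recovery_point, failure_time, targets):
    # Data written after the last recovery point is lost, so the gap
    # between that point and the failure must fit within the RPO.
    return failure_time - last_recovery_point <= targets.rpo


def meets_rto(failure_time, restored_time, targets):
    # The time from failure to restored service must fit within the RTO.
    return restored_time - failure_time <= targets.rto


targets = RecoveryTargets(rto=timedelta(hours=4), rpo=timedelta(hours=1))
failure = datetime(2005, 12, 1, 14, 0)

# A nightly backup taken at 2:00 a.m. leaves a 12-hour gap.
nightly_point = datetime(2005, 12, 1, 2, 0)
# A recovery point captured five minutes before the failure.
recent_point = datetime(2005, 12, 1, 13, 55)
# Service restored 90 minutes after the failure.
restored = datetime(2005, 12, 1, 15, 30)

nightly_ok = meets_rpo(nightly_point, failure, targets)
recent_ok = meets_rpo(recent_point, failure, targets)
restore_ok = meets_rto(failure, restored, targets)
```

In this example the nightly backup misses the one-hour RPO while the recent recovery point meets it, which is the gap that more frequent recovery points are meant to close. The 90-minute restore fits within the four-hour RTO.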
Not surprisingly, some of the respondents in the InfoStor survey reported a lack of confidence in their backup/recovery processes, although the situation is not as bad as the vendor community has suggested. About half of the survey respondents expressed confidence in their backup procedures. A third admitted that their recovery processes may be flawed, or are at least untested, but only 3.5% were certain that their backup/recovery processes were flawed (see Figure 1b).
About a third (31%) of our readers test their recovery processes once a quarter, while 24% test every six months, 23% test once a year, and 22% test their recovery processes less than once a year.
When asked why they don’t test recovery processes more frequently, 35% cited “other competing priorities,” followed by “impact on production systems” (26%), “lack of staff” (16%), “takes too long” (13%), and “cost” (10%).
Despite vendors’ claims that backup failure rates can be as high as 40% to 50%, 14% of our readers report no failures in their weekly backup jobs and three-fourths report backup failure rates of less than 10% (see Figure 1c). On the other hand, 11% of the survey respondents report failure rates in excess of 10%.
The primary causes of backup failures, in decreasing order, are media failure, user error, and failures attributable to software, the network, and hardware (see Figure 1d).
Despite the clear and rapid trend toward disk-to-disk (D2D) backup/recovery, most sites still rely primarily on tape-based backup. And, surprisingly, 35% of our survey respondents plan to purchase more tape over the next 12 months while only 12% plan to decrease the amount of tape-related purchases (see Figure 2a).
Less than 6% of our readers plan to outsource their backup/recovery operations to online services providers, but more than 20% plan to purchase backup monitoring and reporting tools, again reflecting increased interest in DPM tools.
Another clear trend is increased use of snapshots, mirroring, and replication, although users’ views on where to run these applications were surprising. A number of third-party surveys over the past year or two have indicated users’ preference for running storage services, or applications, in the SAN fabric, a trend sometimes referred to as “fabric-based applications.” In this scenario, the applications run on dedicated appliances or on “intelligent” switches such as those from Brocade, Cisco, and MaXXan. (For more information, see “Intelligent switches turn the corner,” p. 10.)
However, in the InfoStor survey almost half of the respondents said that they have no preference as to where they run storage applications such as replication. Only about 11% expressed a preference for fabric-based applications, while 41.1% prefer the “old-school” methods of host-based or disk array-based services such as replication (see Figure 2c).
Disk-based backup/recovery
The most dominant backup trend over the last couple years is disk-based backup and recovery. In fact, 63.5% of the readers in our backup/recovery survey have already implemented some form of disk-based backup.
Another 13% will implement disk-to-disk backup/recovery in 2006, while almost a fourth of the respondents do not have plans to implement this technology (see Figure 3a). Of course, disk-based backup comes in many forms. To date, the majority of the implementations are based on standard disk array targets, often with traditional backup applications from leading software vendors such as Computer Associates, EMC/Legato, Hewlett-Packard, IBM/Tivoli, and Symantec/Veritas. About one-third of our readers have deployed, or plan to deploy, virtual tape libraries (VTLs), and other surveys have shown that VTLs are the fastest-growing type of disk-based backup.
And 15.7% of our readers rely on NAS devices for backup, as exemplified by Network Appliance’s NearStore product line (see Figure 3b).
CDP
In our 2005 backup/recovery survey, 21.5% of the respondents said that they have either already implemented, or will implement within the next six months, “continuous data capture” technology, which in most users’ opinions equates to continuous data protection, or CDP (see Figure 3c).
In a separate reader survey, when we specifically asked about users’ “continuous data protection” plans, 15.8% said they had already implemented the technology, 27.1% plan to deploy it in 2006, 41.4% have no plans for CDP, and 15.7% “do not know what CDP is” despite copious press coverage of CDP.
These statistics suggest that there is still some confusion among end users about what constitutes CDP. The confusion is compounded by the fact that some vendors’ implementations are “near CDP” while others are “pure CDP” (“periodic” vs. “constant”), and that some are block-based while others are file-based.
To clear up the confusion surrounding CDP, the Storage Networking Industry Association (SNIA) has dedicated a working group to defining and promoting CDP. For more information, visit www.snia.org, or see “CDP: What it is, and why you need it” (InfoStor, September 2005, p. 42), which was written by members of SNIA’s CDP Special Interest Group.
According to Dianne McAdam, senior analyst and partner at the Data Mobility Group consulting firm, “CDP solutions continuously record any changes to applications, in either block or file format. When data is corrupted or accidentally deleted, CDP allows the volume or file system to be restored to the latest update or to any previous update point. This choice of many different recovery points is a significant benefit over traditional data-protection methods.” (For more information on the Data Mobility Group’s views on CDP, see “Evaluating continuous data protection,” InfoStor, October 2005, p. 32.)
David Freund, a senior analyst with the Illuminata consulting firm, says that “CDP uses disk technology to continuously capture updates to data in real-time or near real-time. The primary result is that backup windows become irrelevant, because backup is occurring all the time.
“The secondary result is that files are available at disk speeds.” (For more information on Illuminata’s view of CDP and its benefits, see “Continuous data protection: Is it about time?” InfoStor, October 2005, p. 28.)
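The behavior the analysts describe (every change is recorded, and a volume can be rolled back to any earlier update point) can be sketched with a simple change journal. This is a conceptual illustration under stated assumptions, not any vendor’s implementation: the `ChangeJournal` class and its method names are hypothetical, timestamps are plain integers, and the “near-CDP” case is modeled simply as recovery points that exist only at fixed intervals.

```python
import bisect


class ChangeJournal:
    """Hypothetical block-level journal illustrating the CDP idea."""

    def __init__(self):
        self._times = []   # timestamp of each recorded write, in order
        self._writes = []  # (block_id, data) for each recorded write

    def record(self, timestamp, block_id, data):
        # "Pure" CDP: every write is captured as it happens.
        if self._times and timestamp < self._times[-1]:
            raise ValueError("timestamps must be non-decreasing")
        self._times.append(timestamp)
        self._writes.append((block_id, data))

    def restore(self, point_in_time):
        # Replay every write recorded at or before the requested time.
        end = bisect.bisect_right(self._times, point_in_time)
        volume = {}
        for block_id, data in self._writes[:end]:
            volume[block_id] = data
        return volume

    def restore_periodic(self, point_in_time, interval):
        # "Near-CDP" modeled as periodic recovery points: the volume can
        # only be restored to the latest interval boundary at or before
        # the requested time.
        boundary = (point_in_time // interval) * interval
        return self.restore(boundary)


journal = ChangeJournal()
journal.record(1, "block-a", "v1")
journal.record(5, "block-a", "v2")
journal.record(7, "block-b", "x1")
journal.record(12, "block-a", "corrupted")

# Roll back to just before the bad write at time 12.
before_corruption = journal.restore(11)
# With recovery points only every 5 time units, a request for time 9
# falls back to the state at time 5, losing the write made at time 7.
periodic_view = journal.restore_periodic(9, 5)
```

Restoring to time 11 recovers both blocks in their last good state, while the periodic variant loses the write at time 7 because no recovery point exists between times 5 and 10. That difference in recovery point granularity is the practical distinction between the “constant” and “periodic” approaches.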
Representative CDP and “near-CDP” vendors include Asempra, CommVault, EMC (via an OEM deal with Mendocino Software), FalconStor, FilesX, Hewlett-Packard (another Mendocino reseller), IBM Tivoli, InMage, Kashya, Lasso Logic, LiveVault, Mendocino, Microsoft, Mimosa, Revivio, StoneFly, Storactive, Symantec, TimeSpring, and XOsoft.
A recent survey conducted by Toigo Partners’ Data Management Institute, commissioned by Topio, sheds further light on end users’ attitudes toward, and plans for, purchasing CDP products (see “Survey points to CDP confusion,” p. 26).
Mixing tape and disk
Although the trend is clearly toward disk-based backup, typically with disk sub-systems based on Serial ATA (SATA) drives, that doesn’t necessarily mean that users are migrating away from tape. Although only 17.2% of the users in InfoStor’s backup/recovery survey rely solely on tape, almost half (46.6%) use tape as their primary backup media mixed with some disk-based backup. About 22% of the respondents have migrated to a disk-only backup/recovery model, and 23.3% back up primarily to disk with archived data on tape (see Figure 4a).
Although media removability is often cited (along with low cost) as the primary advantage of tape over disk, only 10.1% of our survey respondents cited removability as a key factor influencing future purchases of backup equipment (see Figure 4b).
We also queried users about who their primary data-protection vendors are. EMC/Legato and Symantec/Veritas tied at the top of the list, followed closely (in decreasing order) by Hewlett-Packard, IBM/Tivoli, Quantum, Computer Associates, StorageTek, Network Appliance, and ADIC.
Survey points to CDP confusion
Interest in (although not adoption of) continuous data protection (CDP) technology picked up steam in the fourth quarter of this year, primarily due to the introduction of CDP and “near-CDP” products from heavyweights such as Microsoft (see InfoStor, August 2005, p. 1), IBM/Tivoli and Symantec/Veritas (October 2005, p. 1), and EMC (November 2005, p. 1). EMC entered the market via an OEM deal with start-up Mendocino Software, as did Hewlett-Packard.
However, despite those announcements and plenty of press coverage, CDP remains a mystery to many end users.
A recent survey, Data Protection and Recovery Snapshot Survey, conducted by Toigo Partners’ Data Management Institute, and sponsored by Topio, reveals that many end users don’t know what their company’s plans are for CDP and/or are not familiar with CDP (see figures). The Data Management Institute survey polled more than 200 readers of InfoStor.
For example, 25% of the survey respondents do not know what their company’s position is on CDP, although a high percentage of the companies are in various stages of considering adoption of CDP.
Only 7% of the surveyed companies have already implemented some level of CDP.
Similarly, 29% of the companies surveyed could not identify the key problems that CDP is supposed to solve, and of those that could identify CDP’s benefits the answers varied widely.
The survey also indicates that the majority of users don’t need the level of recovery point objective (RPO) granularity that some CDP solutions provide (e.g., any-point-in-time recovery), and almost half (45%) of the respondents didn’t know how granular their recovery points need to be.