The Enterprise Storage Group reviews the key trends and technologies in tape/disk backup and recovery, including replication techniques and new alternatives.
By Steve Kenniston
The Enterprise Storage Group (ESG) predicts that the worldwide market for all backup-and-recovery products will grow at a 16% CAGR through 2006. ESG also expects that the market for replication products (one of the fastest growing segments of the data-protection market) will grow at a brisk 23% CAGR over the same period.
In 2002, IT storage managers were focused primarily on backup. In 2003, focus shifted to recovery. This year, IT managers are concentrating on the more all-encompassing field of "data protection." When done correctly, data protection ensures business continuity and disaster recovery (BC/DR).
Companies that have solid backup/recovery implementations in place can now move on to bigger and better things. These next steps include:
- Decreasing recovery times;
- Providing BC/DR solutions for core data centers;
- Developing service level agreements (SLAs) for application recovery;
- Ensuring regulatory compliance; and
- Protecting data in distributed applications, remote offices, and desktop/mobile systems.
Implementing complete BC/DR solutions will increase the need for new data-protection and replication technologies, and a growing number of storage vendors are addressing these issues. Even vendors that sell hardware subsystems provide some level of data-protection software, such as snapshot capabilities.
The pain associated with data protection is a major reason why this market segment is growing. Organizations are less concerned with vendor stability than they are with finding products to solve their data-protection issues. For example, in a recent ESG end-user survey, vendor stability was rated seventh among the 10 most important reasons for deploying new disk-based data-protection technology (see bar chart). In another interesting finding, 52% of the respondents said that their current backup/recovery solutions leave their data somewhat exposed (see pie chart). As such, it's no surprise that IT managers are looking for new technologies to solve those problems.
Compliance regulations and corporate governance are also driving the need for new data-protection technologies. Better data protection and enhanced data availability are key components of compliance. For example, a number of vendors have developed object-based storage appliances that address compliance and data availability issues.
Despite the trend toward disk-based backup/recovery, tape will continue to be the dominant target in data-protection implementations. Tape remains the best medium for transporting data and storing it at remote locations. Compliance issues are forcing vendors to ensure that in the event of a disaster there are multiple copies of data available in multiple locations, accessible via multiple recovery methods. Additionally, some tape formats provide write-once read-many (WORM) capabilities that satisfy certain compliance requirements.
Security will also become a focal point over the next couple of years in two areas: as it pertains to data moving from one location to another for backups and replication, and as it applies to data "at rest" on disk/tape target devices.
On another front, an unexpected turnaround in the storage services provider (SSP) space is taking place. Outsourcing data-protection services is making a comeback, due in part to the lack of IT resources and expertise in certain areas. For example, many companies are outsourcing backup of desktops and mobile devices. IT managers know they need to protect these systems, but do not have the time or people to implement the proper solutions. In addition, some companies in regulated industries that do not have the requisite expertise to deploy compliance solutions are turning to SSPs. This is particularly true in the small to medium-sized enterprise (SME) market.
A wide range of technologies and solutions are being developed to better protect data. Some of these solutions are aimed at pure data protection, some are for BC/DR, and others play a larger role in information (or data) life-cycle management (ILM/DLM).
In the remainder of this article we will look at trends in various segments of the data-protection market. One thing is for sure: Disk-based backup and recovery is the dominant trend in data protection. An ESG survey shows that 58% of the responding companies have already deployed some form of disk-based backup, and another 25% will deploy it within the next year (see figure below).
Backup and recovery is a very large segment that ESG breaks into various categories.
Traditional backup/recovery software—These software products run on host servers and provide traditional backup and recovery functionality. Vendors such as Veritas, IBM-Tivoli, EMC-Legato, and Computer Associates have the lion's share of this market, although dozens of other vendors are players in various segments.
Disk targets—There are a number of vendors that provide inexpensive ATA/SATA arrays that act as targets for backup software. The theory here is that because disk is faster than tape, backups (and restores) can be completed in less time. A rapidly growing number of vendors now support writing to disk as a staging device when performing a backup. Vendors that have had some success in this space include LSI and Nexsan. Any array can be used as a target device for backup, provided the backup software supports the device. The downside of using disk targets as staging devices for backup is that a secondary backup may still be required to move the data to tape for archiving.
Virtual tape libraries (VTLs)—The VTL market is also becoming crowded, and for good reason. VTLs typically comprise an ATA/SATA disk array with some type of processing head (usually based on a Linux platform) that has software to make the array emulate tape. A VTL acts just like a tape library, providing an easy way to move from tape to disk-based backup without any process changes. Additionally, there is not a lot to learn: Most VTL systems allow administrators to configure the array so it looks like whatever library they have already deployed, again playing into the theme of minimal change.
The benefits of VTL software are found in its file-system capabilities. VTL solutions have a programmable file system that allows users to lay out volumes on disk in a way that emulates tape. These products also provide an easier and faster means of getting data from the VTL to the tape devices, as the data is typically already laid out on disk in the proper format for the tapes. Cloning or vaulting the data is faster than performing a second backup to tape. One drawback to VTLs (and to similar disk-based targets) is that to do a recovery, a restore operation from the backup product still needs to be performed.
Vendors in the VTL space include ADIC, Alacritus, Diligent, FalconStor (which partners with vendors such as Brocade, Copan, EMC, MaXXan Systems, and others), Quantum, Sepaton, and Spectra Logic.
Another interesting company is Data-Domain. Its product acts as a traditional disk target, but has a unique file system that performs both compression and "coalescence," allowing it to achieve a 20-to-1 file compression ratio, according to company claims. This significantly increases the amount of information kept available for recovery on disk and reduces the amount of storage capacity required for both disk and tape.
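To put a claimed ratio like this in perspective, the arithmetic can be sketched in a few lines of Python. The function names and the sample capacities are hypothetical and chosen only for illustration; the 20-to-1 figure is the vendor's claim cited above, not a measured result.

```python
def stored_size_tb(logical_tb: float, ratio: float) -> float:
    """Physical capacity consumed by `logical_tb` of backup data at a given reduction ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return logical_tb / ratio


def logical_capacity_tb(physical_tb: float, ratio: float) -> float:
    """Amount of backup data that fits on `physical_tb` of disk at a given reduction ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return physical_tb * ratio


# Hypothetical example: 10 TB of backup data at the claimed 20-to-1 ratio.
print(stored_size_tb(10, 20))       # physical TB consumed
print(logical_capacity_tb(2, 20))   # backup TB that a 2 TB target could hold
```

Actual ratios depend heavily on how much the data changes between backups, so any such estimate should be treated as a planning figure only.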
Snapshots and incremental capture (next-generation backup)—A wide variety of vendors offer some type of snapshot capability. The most common method is a copy-on-write technique. A snapshot starts out as an essentially empty copy of a volume that holds only pointers to the existing files. When one of those files is about to change, the snapshot volume saves a copy of the original file just before the new version is written to disk on the original volume. As such, IT administrators have a second copy of data saved to disk that they can use for instantaneous recovery or as an offline copy for backups.
Software vendors with volume management capabilities, such as Microsoft and Veritas, also provide snapshot functionality.
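The copy-on-write technique described above can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not any vendor's implementation: the `Volume` and `Snapshot` classes are hypothetical, the volume lives in memory, and the unit of copying is a block rather than a file.

```python
class Snapshot:
    """A point-in-time view that starts empty and fills in only as the volume changes."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> original contents, copied on first overwrite

    def preserve(self, index, old_data):
        # Keep only the first saved copy: that is the data as of snapshot time.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        # Unchanged blocks are read through to the live volume (the "pointer").
        if index in self.saved:
            return self.saved[index]
        return self.volume.blocks[index]


class Volume:
    """A toy volume: a list of blocks held in memory (illustrative assumption)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy-on-write: save the old contents for every snapshot, then overwrite.
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


vol = Volume(["a", "b", "c"])
snap = vol.snapshot()
vol.write(1, "B")
print([snap.read(i) for i in range(3)])  # the snapshot's view of the volume
print(vol.blocks)                        # the live volume
```

Because the snapshot stores only blocks that were overwritten after it was taken, creating it is nearly instantaneous and its space cost grows with the rate of change rather than with the size of the volume.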
Incremental capture, another category of data protection, falls into what ESG calls "next-generation backup." Vendors in this category, such as FilesX, have the capability to either replace existing backup technologies or co-exist with them. Incremental capture solutions can take snapshots at the block, file, or volume level. This provides users with more granularity when capturing data and offers unique integration capabilities with applications because these products typically write at the block level.
Another form of backup involves capturing only changed blocks and moving them in real time to a secondary disk target for protection. These technologies can replace existing backup products. Avamar is an example of a start-up in this space.
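As a rough illustration of changed-block capture, the following Python sketch compares two versions of a data set block by block and ships only the blocks that differ to a secondary copy. It is a sketch under assumptions: the function names and the tiny block size are hypothetical, and detecting changes by comparing SHA-256 hashes is one possible approach, not a description of how any product named here works.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, so the example is easy to follow


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last block may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def changed_blocks(previous: bytes, current: bytes, size: int = BLOCK_SIZE):
    """Return {block_index: new_block} for every block that is new or differs."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(previous, size)]
    changes = {}
    for i, block in enumerate(split_blocks(current, size)):
        if i >= len(old_hashes) or hashlib.sha256(block).digest() != old_hashes[i]:
            changes[i] = block
    return changes


def apply_blocks(target: bytearray, changes, size: int = BLOCK_SIZE):
    """Write the changed blocks into the secondary copy (does not handle shrinking data)."""
    for i, block in changes.items():
        target[i * size:i * size + len(block)] = block


primary_before = b"aaaabbbbcccc"
primary_after = b"aaaaXbbbcccc"
secondary = bytearray(primary_before)

delta = changed_blocks(primary_before, primary_after)
apply_blocks(secondary, delta)
print(delta)
print(bytes(secondary) == primary_after)
```

Only one of the three blocks crosses the wire in this example, which is the appeal of the approach when a small fraction of the data changes between protection cycles.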
Continuous capture—This segment of the data-protection market includes software or appliances designed to capture every write made to primary storage and make a time-stamped copy on a secondary device. The main objective is to have the ability to re-create a data set as it existed at any point in time with the goal of being able to rapidly restore applications. Representative vendors include Alacritus, Mendocino Software (via acquired assets from Vyant Software), Revivio, and StorageTek. Even though these products are new, a surprisingly large percentage of ESG survey respondents (63%) said they were familiar with this type of technology. Although it will be some time before these technologies become mainstream, they are already helping end users who need instantaneous recoverability for their applications.
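The idea of time-stamping every write and re-creating the data set as of any moment can be sketched in Python as follows. The `WriteJournal` class, its method names, and the in-memory dictionary standing in for a secondary device are all hypothetical; the sketch also assumes that writes arrive in timestamp order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRecord:
    timestamp: float  # when the write hit primary storage
    block: int        # which block was written
    data: bytes       # the new contents


class WriteJournal:
    """Toy continuous-capture journal held in memory (illustrative assumption)."""

    def __init__(self, base_image):
        self.base_image = dict(base_image)  # block -> contents before journaling began
        self.records = []

    def capture(self, timestamp, block, data):
        # Assumes writes are captured in non-decreasing timestamp order.
        self.records.append(WriteRecord(timestamp, block, data))

    def image_at(self, point_in_time):
        """Rebuild the data set as it existed at `point_in_time` by replaying writes."""
        image = dict(self.base_image)
        for record in self.records:
            if record.timestamp > point_in_time:
                break
            image[record.block] = record.data
        return image


journal = WriteJournal({0: b"old"})
journal.capture(10.0, 0, b"new")
journal.capture(20.0, 1, b"extra")

print(journal.image_at(5.0))   # state before any captured write
print(journal.image_at(15.0))  # state after the first write only
print(journal.image_at(25.0))  # state after both writes
```

A production system would also need to bound the journal's size and coordinate with applications so that the chosen recovery point is consistent, neither of which this sketch attempts.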
Data from ESG's end-user survey reflects adoption patterns for various types of emerging data-protection technologies (see figure on p. 28).
Replication—Having your data backed up to tape or disk is one thing, but if the data still resides on-site or is on a tape that may be unrecoverable, problems can occur. Replicating data to a remote site gives companies a significant uptime advantage when it comes to data recovery. Replication products have become much more reliable, stable, and affordable, enabling even small companies to deploy replication for BC/DR implementations.
Array-based replication—These products have been around for a long time and have traditionally come from large disk-array vendors such as EMC, Hitachi Data Systems, and IBM. These products run on high-end arrays and are very robust (and expensive). They usually come in two flavors: synchronous or asynchronous. In the past, these replication technologies only worked between homogeneous arrays from the same vendor, requiring two expensive arrays with two expensive software licenses for each replication pair. As host-based replication became more robust, the array-based replication vendors began to add more flexibility in their solutions. For example, the requirement to replicate from one high-end array to another no longer exists, allowing companies to deploy lower-cost arrays at remote sites. Additionally, prices have come down, and new vendors are getting into the game. For example, vendors such as EqualLogic, Exagrid, and Intransa provide replication with their disk arrays at relatively low prices.
Although array-based replication will continue to be a viable alternative, ESG predicts that over the next few years the trend will be toward host-based and fabric-based replication.
Host-based replication—Host-based replication software runs on servers. As writes are made to one array, they are also written to a second array. Vendors in this category have eliminated many of the complexities in their products, making them easier to deploy and manage. Representative vendors of host-based replication software include EMC-Legato, DataCore Software, NSI, Softek, Sun, Topio, and Veritas.
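A minimal Python sketch can show what writing each update to a second array looks like from the host's point of view. It borrows the synchronous and asynchronous modes mentioned earlier for array-based products and applies them to a host-side writer, which is an assumption made for illustration. The `ReplicatedVolume` class is hypothetical, and the two dictionaries stand in for the primary and secondary arrays.

```python
from collections import deque


class ReplicatedVolume:
    """Toy host-side write mirroring between two in-memory 'arrays'."""

    def __init__(self, synchronous=True):
        self.primary = {}
        self.secondary = {}
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet shipped (asynchronous mode only)

    def write(self, block, data):
        self.primary[block] = data
        if self.synchronous:
            # Synchronous: the write is not complete until the second copy is updated.
            self.secondary[block] = data
        else:
            # Asynchronous: acknowledge now, ship to the second array later.
            self.pending.append((block, data))

    def drain(self):
        """Ship queued writes to the secondary in their original order."""
        shipped = 0
        while self.pending:
            block, data = self.pending.popleft()
            self.secondary[block] = data
            shipped += 1
        return shipped


sync_vol = ReplicatedVolume(synchronous=True)
sync_vol.write(0, b"payroll")
print(sync_vol.secondary == sync_vol.primary)

async_vol = ReplicatedVolume(synchronous=False)
async_vol.write(0, b"payroll")
print(async_vol.secondary)   # still empty: the write is queued
async_vol.drain()
print(async_vol.secondary == async_vol.primary)
```

The trade-off the sketch exposes is the usual one: synchronous mirroring keeps the second copy current at the cost of write latency, while asynchronous mirroring keeps writes fast but leaves a window of queued data that could be lost if the primary site fails.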
Fabric-based replication—The new debate raging in the storage industry revolves around the following question: "Where should storage services, or applications, reside: on hosts, arrays, or in the fabric on switches or appliances?" Fabric-based applications are relatively new, but ESG expects a strong trend toward fabric-based intelligence over the next couple of years due to a number of potential advantages. For example, the sooner an I/O is captured, the sooner it can be sent to a secondary device, thus enabling better performance. Examples of vendors with solutions in this space include Brocade, Candera, Cisco, CNT, FalconStor, IBM, Kashya, Maranti Networks, McDATA, and Troika. A variety of traditional switch vendors are putting intelligent blades into their core products, and third-party developers are porting their applications to the blades.
Tape libraries—Despite the interest in disk-based backup and recovery, the tape library segment of the data-protection market will remain healthy for the foreseeable future. For example, SMEs are taking data protection more seriously than ever and are moving toward libraries instead of stand-alone tape drives directly attached to servers. While large enterprises are now making decisions regarding the tradeoffs between disk and tape, smaller companies are buying more libraries and tape than ever before. Mainstay tape library players include vendors such as ADIC, Overland Data, Qualstar, Quantum, Sony, Spectra Logic, and StorageTek.
Another interesting trend among major tape library vendors is the development of disk/tape combo systems. Most of the tape library vendors now have disk-based solutions that can be integrated with their tape libraries, enabling disk-to-disk-to-tape configurations.
The bottom line
The pace at which data-protection technology is changing makes it difficult for IT managers to draw a line in the sand and choose a solution that will carry them through the next three to five years. The general advice is to identify some type of data life-cycle management process that, when integrated with a disk- or tape-based solution, best meets the data-protection and data-recovery needs of your business.
Steve Kenniston is a technology analyst at the Enterprise Storage Group consulting firm (www.enterprisestoragegroup.com) in Milford, MA.
This article was excerpted and adapted from a larger report by the Enterprise Storage Group, called "What's in Store for 2004—Data Protection Market Drivers and Business Trends." You can view full ESG reports at www.enterprisestoragegroup.com.