Fueled by new software and ATA-based disk arrays, the trend toward D2D backup/restore is in full swing. But there are almost as many approaches as there are vendors in this emerging market.
By Dianne McAdam
Tape has been a fixture in corporate data centers over the last fifty years as a data backup-and-restore medium. However, backup windows are shrinking, or disappearing, and enterprises are demanding that restores for critical applications complete within minutes.
Tape has always been—and will continue to be—a sequential access device. As a result, data restores from tape can require long sequential searches as the appropriate data is located and then read back to disk. For a growing number of storage administrators, restores from tape no longer meet the demands of applications that need to be available 24x7.
The continuing decline in the cost of disk, accelerated by the emergence of Serial ATA technology, has created new opportunities for disk in data-protection and business continuance strategies. Backing up directly to disk not only speeds up the backup process, but it also significantly reduces the time required to restore data.
However, storage vendors are responding to this new demand by implementing disk-to-disk (D2D) backup/restore solutions in very different ways. With some D2D implementations, disk emulates tape and appears to the operating system as if it were a physical tape drive. With other implementations, disk does not masquerade as tape, but responds to the operating system as disk.
D2D backup can be integrated with existing backup solutions to enhance and improve the overall process. Nevertheless, storage administrators must be aware of the implementation differences to determine the best solution for their environment.
One caveat: While disk has received a great deal of attention lately as a backup-and-restore medium, it should not be viewed as an all-out replacement for tape. Rather, it should be viewed as an addition to existing backup processes. In fact, D2D should be considered as part of a larger backup strategy of primary disk to secondary disk to tape, or D2D2T. While disk can be used to store the most current versions of backups, tape continues to maintain its place in the data center for retention of long-term backups and archival storage.
Disk as tape, or "virtual tape"
The idea of using disk as an enhancement to the tape backup-and-restore process is not new. The first appearance of using disk in the guise of a tape drive dates back to 1997 when IBM introduced the concept of "virtual tape" for mainframes with its Virtual Tape Server (VTS). StorageTek's version of the VTS, the Virtual Storage Manager (VSM), started shipping shortly after in 1998. In these implementations, disk subsystems were placed in the backup stream between tape libraries and mainframes.
To speed up the backup process, a disk buffer appeared to the operating system as a tape drive and responded to the operating system's standard tape commands (hence the term "virtual tape"). VTS and VSM were not stand-alone disk systems; they were tightly coupled disk caches in front of tape libraries. Data was written to disk first, then to tape later as the disk cache filled up.
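The write-to-disk-first, copy-to-tape-later behavior described above can be modeled in a few lines of Python. This is a toy sketch, not a description of how VTS or VSM is actually built: the class name `VirtualTapeCache`, the 80% fill threshold, and the oldest-first destage policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class VirtualTapeCache:
    """Toy model of a disk cache in front of a tape library (hypothetical)."""

    def __init__(self, capacity_mb, high_water=0.8):
        self.capacity_mb = capacity_mb
        self.high_water = high_water   # assumed destage threshold (fraction of capacity)
        self.disk = OrderedDict()      # virtual volume -> size in MB, oldest first
        self.tape = {}                 # volumes already copied to physical tape

    def used_mb(self):
        return sum(self.disk.values())

    def write_volume(self, name, size_mb):
        """Accept a backup on disk, then destage if the cache is filling up."""
        self.disk[name] = size_mb
        self._destage()

    def _destage(self):
        # Move the oldest volumes to tape until usage is at or below the threshold.
        while self.disk and self.used_mb() > self.capacity_mb * self.high_water:
            name, size_mb = self.disk.popitem(last=False)
            self.tape[name] = size_mb

    def locate(self, name):
        """Report where a restore would read the volume from."""
        if name in self.disk:
            return "disk"
        if name in self.tape:
            return "tape"
        return None


cache = VirtualTapeCache(capacity_mb=1000)
for night, size in enumerate([400, 300, 300], start=1):
    cache.write_volume(f"VOL{night:03d}", size)

print(cache.locate("VOL001"))  # oldest volume has been destaged to tape
print(cache.locate("VOL003"))  # newest volume is still on disk
```

In this model the most recent backups stay on disk, where restores are fastest, while older volumes migrate to tape only when space is needed.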
The virtual tape concept has been recently applied to the open systems market by several vendors. Unlike its early predecessors, however, this new generation of virtual tape is not tightly coupled to tape libraries. In fact, some vendors regard tape drives/libraries as optional.
In these implementations, the target device for backups is usually an ATA-based disk array that responds to software commands just as if it were a tape drive. Some vendors use proprietary software that works only with their own disk subsystems (such as Quantum's DX30), while other vendors use host-based software that works with any vendor's disk array (such as Diligent Technologies' Virtual Tape Facility, or VTF).
Implementing virtual tape generally requires few changes to an existing backup infrastructure. Usually, the disk array and software can be easily plugged into existing backup products such as Computer Associates' ArcServe, Legato's NetWorker, and Veritas' NetBackup. The normal process of directing the backup stream to tape is simply redirected to disk.
Some virtual tape products, such as those from Diligent and NearTek, are software-only solutions that allow storage administrators to be agnostic with regard to the vendor of the disk array. Bus-Tech (Mainframe Appliance for Storage) and Ultera (Mirage Virtual Tape Controller, or VTC) are examples of vendors that supply virtual tape controllers that use existing disk to create a virtual tape library. Other vendors such as ADIC, Alacritus, Breece Hill, Quantum, and SANgate supply integrated hardware/software solutions.
The virtual tape approach to disk-based backup/restore can be appealing because it's easy to install and can immediately improve backup-and-restore times. However, virtual tape systems store backups as tape images, not disk images. Therefore, during a restore process the tape images must be restored back to disk before they can be used, just as they would be if tape were the primary backup-and-restore medium. Storage administrators who want to eliminate this restriction have other options for D2D backup, most notably disk systems that respond as disk, not tape.
Another segment of the D2D market, "disk-as-disk," includes disk systems that do not emulate tape. Compatibility with existing backup-and-restore software depends on whether the software supports disk as a backup-and-restore medium.
Disk-as-disk backup/restore can be implemented in several ways. One approach uses traditional snapshot or copy technology to send a copy of data to a secondary disk array. The snapshot process is invoked again to restore data. Other approaches provide integrated hardware and software appliances that can be targets for backup within a storage area network (SAN) or over a LAN.
A third approach to D2D backup and restore makes use of specialized appliances that address specific problems that have plagued backup operations for years.
A logical copy technique known as snapshot copy is used by Network Appliance, for example, to provide D2D backup-and-restore functions in network-attached storage (NAS) environments. NetApp's NearStore allows files stored on a primary NetApp filer to be copied to a secondary NetApp filer consisting of ATA disks. Then, using NetApp's SnapRestore software, a storage administrator can restore files from NearStore back to the primary NetApp filer.
D2D backup over SANs/LANs
While Network Appliance's NearStore requires two NetApp filers—one for primary storage and one as a backup target—other vendors have delivered ATA-based disk products that can serve as backup targets for any other vendor's primary disk storage. For example, Overland Storage's REO series uses iSCSI and Serial ATA disks to back up files over an Ethernet network. StorageTek's BladeStore uses 2Gbps Fibre Channel connections to provide D2D backup over SANs.
Nexsan Technologies' InfiniSAN ATAboy series uses ATA disk arrays plus proprietary software to back up files in their native format, eliminating the need to convert backup data from its stored format to its original format.
Software-only solutions, such as Avail Solutions' Integrity, work in DAS, NAS, or SAN environments.
Within the disk-as-disk segment are products that not only provide disks as targets for backup, but also incorporate specialized software to solve problems that have traditionally plagued storage administrators:
File-system consistency problems—The basic intent of backup software is to copy data from its original, primary location to a secondary location. However, if the file system that is currently being backed up is corrupted, the backup will simply mirror the same corrupted state. If the corruption goes undetected for days, then several days of backups can be rendered useless. Data Domain's DD200 appliance, which serves as a backup target, runs consistency checks against the file systems to flush out integrity problems when the backup process completes. If a problem is discovered, the DD200 appliance will attempt a fix. If it cannot fix the problem, then the operator will be notified that a backup error has occurred so the problem can be fixed before the problem surfaces during a restore.
Precise point-in-time restore—Backups are normally scheduled at certain times during the week or day (often, once every night). If a storage administrator is running nightly backups, a problem detected at 4 PM one day would require restoring data from a backup taken at 8 PM the previous day, resulting in a loss of 20 hours of processing. While database logs can be used to recover the lost 20 hours of processing, the procedure can take hours to complete.
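The 20-hour figure is simply the gap between the last backup and the moment the problem is detected. A minimal check in Python, using arbitrary example dates (only the times of day come from the scenario above):

```python
from datetime import datetime

# Example dates are placeholders; the 8 PM and 4 PM times come from the scenario.
last_backup = datetime(2024, 1, 1, 20, 0)   # 8 PM, previous day
detected = datetime(2024, 1, 2, 16, 0)      # 4 PM, following day

lost_hours = (detected - last_backup).total_seconds() / 3600
print(lost_hours)  # 20.0
```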
StorageTek's EchoView is an ATA disk and software solution that creates one full backup of a volume that resides on the EchoView appliance. After the initial full backup is created, all subsequent writes to that volume are also sent to the EchoView appliance. When a restore is required, EchoView has the ability to present a backup image of the volume—at any point-in-time—as if a full volume backup had been made.
In contrast to EchoView, Revivio has developed a software-plus-appliance approach that restores volumes from any point-in-time as well, but is not tethered to a specific vendor's disk subsystem (as is StorageTek's). Another vendor in this category is Vyant Technologies.
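The general idea behind any-point-in-time restore (one full copy plus a record of every later write) can be sketched as follows. This is an illustrative model only, not the design of EchoView or Revivio's product: the class `JournaledVolume`, its block-level granularity, and the integer timestamps are assumptions.

```python
class JournaledVolume:
    """Hypothetical model: one full baseline copy plus a time-ordered write journal."""

    def __init__(self, baseline):
        self.baseline = dict(baseline)   # block number -> contents at full-backup time
        self.journal = []                # (timestamp, block number, contents)

    def record_write(self, timestamp, block, data):
        # Assumes writes arrive in timestamp order, as they would from a live volume.
        self.journal.append((timestamp, block, data))

    def image_at(self, timestamp):
        """Rebuild the volume as it looked at the given time."""
        image = dict(self.baseline)
        for ts, block, data in self.journal:
            if ts > timestamp:
                break                    # later writes are ignored
            image[block] = data
        return image


vol = JournaledVolume({0: "boot", 1: "orders-v1"})
vol.record_write(9, 1, "orders-v2")      # timestamps are hours of the day, for illustration
vol.record_write(15, 1, "corrupted")     # bad write at 3 PM
vol.record_write(16, 2, "new-table")

before_corruption = vol.image_at(14)
print(before_corruption)  # {0: 'boot', 1: 'orders-v2'}
```

Because the image is rebuilt from the baseline and the journal, the administrator can pick a restore point just before the bad write rather than falling back to the previous night's backup.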
Reducing storage capacity requirements—In situations where D2D capacity is a primary concern, Avamar's Axion backup software and disk subsystem stores data as objects to reduce the space required to store secondary data copies for both backup and archival purposes. Axion converts the backup stream to a collection of stored objects and then uses a commonality filtering technology to find and eliminate redundant sequences of data.
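The article does not describe how Axion's commonality filtering works internally. As a stand-in, the sketch below uses a common, simpler technique: split the backup stream into fixed-size chunks, identify each chunk by its SHA-256 hash, and store each unique chunk only once. The class `ObjectStore` and the 4-byte chunk size are hypothetical and chosen only to keep the example small.

```python
import hashlib


class ObjectStore:
    """Hypothetical content-addressed store: each unique chunk is kept once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size   # tiny fixed-size chunks, for illustration only
        self.chunks = {}               # SHA-256 digest -> chunk bytes
        self.backups = {}              # backup name -> ordered list of digests

    def store(self, name, data):
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)   # redundant chunks are not stored again
            recipe.append(digest)
        self.backups[name] = recipe

    def restore(self, name):
        return b"".join(self.chunks[d] for d in self.backups[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ObjectStore()
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"   # one chunk changed since Monday
store.store("monday", monday)
store.store("tuesday", tuesday)

print(len(monday) + len(tuesday))  # 32 bytes backed up
print(store.stored_bytes())        # 20 bytes actually stored
```

Both backups restore byte-for-byte, yet the second one adds only the single chunk that changed.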
Data encryption and regulatory compliance—EVault's InfoStage is a software-only solution that uses existing disk arrays. Agents are installed on the servers requiring backup. The software manages the backup process and encrypts the data throughout the backup, transfer, and restore activity. Backup data is compressed to reduce storage requirements. InfoStage has received certification for SAS70 and HIPAA compliance.
Choosing the best solution
Choosing the right backup-and-restore solution was easy when tape was the only affordable option. Today, however, cost-effective tape and disk options abound.
Virtual tape or disk-as-disk?
A simple way to sort through the many D2D implementations is to determine the requirements for the secondary disk.
- If data protection is the primary requirement, then both the virtual tape and disk-as-disk solutions will provide backup protection of data as well as restoration capabilities. Be sure that the secondary disk itself is adequately protected by RAID so that a disk drive failure will not cause the loss of the backup image.
- If compressing the backup window is the primary requirement, then again, both the virtual tape and disk-as-disk approaches will speed up the backup process. Be sure that the solution has the throughput required to ensure this objective can be met.
- If reducing restore time is the primary concern, disk-as-disk approaches may be better. Products that can restore to any point-in-time add another dimension to rapid restore requirements.
- If reducing storage capacity requirements is a primary concern, then solutions that provide compression and/or treat files as objects that are stored once only should be investigated.
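When the backup window is the deciding factor, the throughput check mentioned above reduces to simple arithmetic. The sketch below assumes a sustained transfer rate and uses made-up figures (2,000 GB of data, a six-hour window, a 60 MB/sec device); substitute measured numbers for a real evaluation.

```python
def required_throughput_mb_s(data_gb, window_hours):
    """Sustained rate needed to move data_gb within window_hours (1 GB = 1,024 MB)."""
    return data_gb * 1024 / (window_hours * 3600)


def backup_hours(data_gb, throughput_mb_s):
    """Time to back up data_gb at a sustained throughput_mb_s."""
    return data_gb * 1024 / throughput_mb_s / 3600


# Hypothetical figures, for illustration only.
needed = required_throughput_mb_s(data_gb=2000, window_hours=6)
print(round(needed, 1))                       # 94.8 (MB/sec)

actual = backup_hours(data_gb=2000, throughput_mb_s=60)
print(actual <= 6)                            # False: 60 MB/sec misses the window
```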
If virtual tape, which one?
To determine which virtual tape approach is the best fit with an existing environment, storage administrators should ask vendors the following questions:
- Which tape libraries and drives (and how many) does the software emulate? Are these drives similar to what is already installed?
- What backup software is supported? Is that software currently installed, or will you be required to bring in additional software—and at what cost?
- For integrated hardware/software solutions, what type of disks are used? What level of RAID protection is available? What is the maximum capacity of the disks? What server connections are supported—SCSI, Fibre Channel, ESCON, FICON, iSCSI?
- What is the throughput of the system?
- How long does it take to install the product?
- How well does it integrate with existing tape drives and libraries?
- How many different backups can be stored on virtual tape? How many versions of the same backup can be stored?
- Should the product be used for all backups or only selective backups?
- How is the product managed? Is there a remote management facility? Is the management software included in the price of the product? How well does the management product interface with other backup products already installed?
- How easily are backups migrated from virtual tape to physical tape? Does this require manual intervention, or can it be easily automated?
If disk-as-disk, which one?
Choosing the most appropriate disk-as-disk solution is a more difficult task than choosing the best virtual tape solution because there are many more options available. Users should ask vendors the following questions to help sort through the various products:
- Which operating systems are supported?
- Is data backed up on a volume or file level, or both?
- What databases are supported, if any? Are specific agents available for applications such as Microsoft Exchange? What is the cost of these agents?
- What is the minimum and maximum capacity of the D2D system? Can you easily upgrade to larger capacities? What RAID levels are supported? What server connections are supported—SCSI, Fibre Channel, ESCON, FICON, iSCSI? What are the minimum and maximum number of connections supported?
- How long does it take to install the product?
- What is the throughput of the system?
- Does the solution use its own backup software or does it support existing backup software?
- How many backups, and how many versions of backups, can the configuration support?
- Will the product be used for all backups or just selected backups?
- When restore is required, is data restored to the original primary disk or can it be accessed on the D2D system? How granular is the restore process? Can data be restored to a specific point-in-time, or is a full backup required?
- Is compression supported to reduce capacity requirements?
- What are the projected restore times?
- How is it managed? Can it be remotely managed?
- How easily can disk backups be migrated to tape? Is this a manual process or can it be automated?
The continuing decline in the cost of disk, accelerated by the emergence of Serial ATA technology, has created new data-protection and business continuance opportunities for end users. However, D2D should be viewed as an enhancement to, and not necessarily a replacement for, existing backup-and-recovery strategies. Despite the many D2D solutions, recent surveys indicate that storage administrators will be buying more tape next year, not less.
D2D solutions are effective for reducing, and perhaps even eliminating, the backup window through the use of non-disruptive disk copy functions. They are also a key enabler of business continuity plans that call for rapid application recovery in the event of a failure.
Nevertheless, D2D solutions are prone to the same vulnerabilities as any disk subsystem. As such, they should be RAID-protected and backed up to tape as well.
Dianne McAdam is a senior analyst and partner with the Data Mobility Group consulting firm (www.datamobilitygroup.com) in Nashua, NH.