Understanding disk-to-disk backup

A look at four types of disk-based backup approaches and their advantages and disadvantages.

By Steve Kenniston

End users are increasingly looking for alternatives to traditional backup-and-restore methods, and disk-to-disk backup is looking more and more promising as a way to combat shrinking backup windows. The goal of this article is to help users and integrators understand the differences between the following disk-based backup alternatives:

  • Disk-to-disk backup using existing backup software;
  • Disk-based backup that takes advantage of data movers. Vendor examples include EMC (Centera), Nexsan, Network Appliance (NearStore), and Okapi Software;
  • Disk-to-disk backup using "virtual tape libraries" (VTLs) from vendors such as Alacritus, StorageTek, and Quantum; and
  • Disk-to-disk recovery technologies from vendors such as Revivio and Vyant, which are focused primarily on the database market.

In addition to backing up to disk, most of these approaches involve subsequent data transfer to tape.

Use existing software

Most of the popular backup software packages allow a backup target device to be a disk or tape device. For example, say an IT administrator uses Legato's NetWorker to back up data from primary disk storage to a tape library with 40GB cartridges. To improve backup (and restore) performance, the administrator implements a new disk-based subsystem (that uses 120GB ATA drives) as a target device that stages data before it is migrated to tape.

NetWorker follows this process to move the data from primary storage to disk and then to tape:

  • Converts the data to tape archive format;
  • Moves the data in the tape archive format to disk;
  • Creates a media management schema that follows how the data is laid out on the 120GB disks; and
  • Clones the backup image from disk to tape.

(If the backup software does not support cloning, a second backup from disk to tape is required, creating a new media management schema for the 40GB tapes.)
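The difference between the two media layouts can be illustrated with a little arithmetic. The following is a minimal sketch, not how any particular backup product tracks media: the `layout` function and the 300GB image size are assumptions made for illustration, while the 120GB and 40GB volume sizes come from the example above.

```python
from typing import List, Tuple


def layout(image_gb: float, volume_gb: float) -> List[Tuple[int, float, float]]:
    """Split a backup image into (volume_index, start_gb, end_gb) extents.

    Hypothetical helper: a real media management schema records far more
    (labels, save sets, file marks); this only shows where the image is cut.
    """
    extents = []
    start = 0.0
    index = 0
    while start < image_gb:
        end = min(start + volume_gb, image_gb)
        extents.append((index, start, end))
        start = end
        index += 1
    return extents


image_gb = 300  # assumed size of one backup image

disk_layout = layout(image_gb, 120)  # staged on 120GB ATA drives
tape_layout = layout(image_gb, 40)   # final home on 40GB cartridges

for name, extents in (("disk", disk_layout), ("tape", tape_layout)):
    print(name, len(extents), "volumes")
    for index, start, end in extents:
        print(f"  volume {index}: {start:.0f}GB to {end:.0f}GB")
```

Under these assumptions the same image spans three disk volumes but eight tapes, with different cut points, which is why a layout built for the disks cannot simply be reused for the tapes.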

The major advantage of using existing backup software for disk-to-disk backup is that implementation is very simple, and the initial backup to disk is very fast. This approach is also relatively inexpensive and allows users to leverage existing investments.

However, this approach has a number of drawbacks:

  • Secondary data movement, or cloning, to tape can be slow and can consume an inordinate share of the backup server's processing resources;
  • If cloning is not used, a two-stage process is necessary for backup and recovery, which can negatively impact data recovery time;
  • This approach is limited by file system performance and scalability; and
  • The disk subsystem cannot be easily shared by multiple backup servers.

      Data movers

      A variety of vendors are shipping "data mover" technologies that move, or copy, data from primary disk storage to secondary disk storage very quickly. Some of these products are implemented in appliances based on either a 1U Linux-based device or an ATA-based disk array with embedded software.

      The goal of these products is to move data from primary storage to secondary disk storage for two purposes: to use the secondary, low-cost disk as the staging area for backup data, and to run tape-based backups from the secondary storage array.

      Most of the array vendors in this space have certified their low-cost disk subsystems as target devices with the leading backup software products.

      Nexsan's approach works as follows: The Nexsan appliance moves the data (via its own software) from primary storage to a Nexsan disk array, where it is kept in the same format in which it was originally written.

      Next—should the administrator choose to do so—the existing backup software in the environment performs a tape archive backup of the data, creating a media management schema that follows the schema of the tapes used in the library.

      The "pros" of this approach are the following:

      • Very quick initial copy of the data;
      • Ability to have longer backup windows; and
      • The media management schema from tape allows the data to be recovered directly to primary storage, thereby making recovery time the same as it would have been going from disk to tape.

      The "cons" of this approach are the following:

      • Creates an additional step in the backup process;
      • The software for the initial backup is not as mature (in terms of application integration) as that offered by Computer Associates, Legato, Veritas, and others, and it is typically not certified to work with applications such as Exchange, Oracle, and SAP;
      • Not scalable for large quantities of data; and
      • Only a single copy of data can be stored on the secondary disk.

      Virtual tape libraries

      Virtual tape libraries (VTLs) are an even newer approach to disk-to-disk backup. In this approach, software or controller firmware loaded onto or embedded in a secondary disk array makes the array look exactly like a tape library (preferably the one already in your environment).

      In the case of Quantum's DX30, the VTL is already configured to look like a specific Quantum tape library. In the case of Alacritus' Securitus software, an administrator programs the array to look like a StorageTek PowderHorn tape library with 142 slots and 60GB tapes. The administrator configures the array to have the same size tapes and the same number of drives in the virtual library as in their tape library.

      Unlike the first approach to disk-based backup—where the backup software writes the data to disk in tape format using a media management schema that represents the disk drive size—the VTL approach writes the data in tape format, and the media management schema follows the schema of the tapes to which the data will eventually be moved.
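The idea of matching the virtual library to the physical one can be sketched as a simple geometry check. This is an illustrative model only: the `LibraryGeometry` class is hypothetical and does not correspond to any vendor's configuration interface, and the drive count of 4 is an assumed value. The 142 slots and 60GB tapes come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryGeometry:
    """Hypothetical description of a tape library, physical or virtual."""
    slots: int
    tape_gb: int
    drives: int

    @property
    def capacity_gb(self) -> int:
        # Native capacity when every slot holds a full tape.
        return self.slots * self.tape_gb


# Physical library: 142 slots and 60GB tapes as in the text; 4 drives is assumed.
physical = LibraryGeometry(slots=142, tape_gb=60, drives=4)

# The administrator configures the virtual library with the same values.
virtual = LibraryGeometry(slots=142, tape_gb=60, drives=4)

# Frozen dataclasses compare field by field, so equality means the
# virtual tapes line up one-for-one with the physical tapes.
schemas_line_up = virtual == physical

print("capacity:", virtual.capacity_gb, "GB")
print("geometry matches:", schemas_line_up)
```

With these figures the virtual library needs 8,520GB of disk behind it to hold a full set of virtual tapes, which is a useful sizing check before configuring the array.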

      The configuration of a VTL environment is very simple. The only change that needs to take place is to provide the backup software with the Fibre Channel or SCSI address of the new disk array. The VTL leaves it up to the backup application to move the data from disk-based virtual tapes to physical tape volumes.

      The "pros" of the VTL approach include the following:

      • Fast backups to disk array;
      • By maintaining media management schemas, it is possible to recover directly to primary disk;
      • Easy to set up; and
      • May enable a single library to be concurrently used by multiple, heterogeneous backup software applications.

      The "cons" of this approach include the following:

      • Creates an additional step in the backup process to move data to physical media via backup application or scripted software;
      • Depending on the implementation, the VTL may have limited scalability; and
      • Suffers from the same limitations as a physical tape library (e.g., limited number of drives).

      In addition to the core functions already described, newer implementations of VTLs have at least two additional capabilities:

      • They can seamlessly copy virtual tape volumes to physical tape volumes. This is done in a way that works with existing media management schema and does not involve a second backup step. As such, it does not place a significant burden on the backup server; and
      • They can present themselves simultaneously as multiple, different VTLs. This allows for simultaneous access from multiple backup servers and solves the problem of trying to share a single device.

      These features can automatically generate physical copies of virtual media by cloning virtual tapes to physical tapes. This complies with media management schema by matching virtual bar codes to physical bar codes. Further, this approach allows a site to centralize all backups by pointing each backup server at its own virtual library, all within the same appliance. To date, only Alacritus has added these features.
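Matching virtual bar codes to physical bar codes can be pictured as a simple pairing step. The sketch below is a simplified illustration, not the logic of any shipping product: the `plan_clones` function and the bar code values are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def plan_clones(
    virtual_barcodes: Iterable[str],
    physical_barcodes: Iterable[str],
) -> Tuple[Dict[str, str], List[str]]:
    """Pair each virtual tape with the physical tape that has the same bar code.

    Returns (plan, unmatched): plan maps virtual bar code to physical bar code,
    and unmatched lists virtual tapes with no physical counterpart loaded.
    """
    available = set(physical_barcodes)
    plan: Dict[str, str] = {}
    unmatched: List[str] = []
    for code in virtual_barcodes:
        if code in available:
            plan[code] = code  # same label, so the existing catalog entry still applies
        else:
            unmatched.append(code)
    return plan, unmatched


# Hypothetical bar codes for illustration.
plan, unmatched = plan_clones(
    ["A00001", "A00002", "A00003"],
    ["A00001", "A00003", "B00009"],
)
print("clone:", plan)
print("waiting for media:", unmatched)
```

Because the physical copy carries the same label as the virtual tape, the backup software's existing record of that volume remains valid, which is the point of matching bar codes.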

      Although adding these new technologies to the core VTL concept may add cost, doing so provides advantages such as better scalability and an unlimited number of virtual drives and libraries, which allows for easy centralization of backups. It also eliminates the second step otherwise needed to create physical media.

      Disk-based recovery technologies

      Lastly, a couple of vendors, such as Revivio and Vyant, are building database recovery technologies. Typically, these are file system-based products that capture every piece of information about how the data was written to disk, and copy both the data and the record of how it was written to a secondary disk array.

      This approach gives users the ability to essentially perform an "undo" function and step back through time to quickly recover, or restore, the database to a specific point in time. A key advantage to this approach is very rapid recovery of databases. (Typically, databases are the most difficult applications to recover.)
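The "undo" idea can be sketched as a time-ordered journal of writes that is replayed up to a chosen moment. This is a conceptual illustration under simplifying assumptions (whole-block writes, integer timestamps, everything in memory); the `WriteJournal` class is hypothetical and is not a description of how any vendor's product works.

```python
from typing import Dict, List, Tuple


class WriteJournal:
    """Hypothetical log of every block write, kept on secondary disk."""

    def __init__(self) -> None:
        # Each entry is (timestamp, block_number, data), in time order.
        self._entries: List[Tuple[int, int, bytes]] = []

    def record(self, timestamp: int, block: int, data: bytes) -> None:
        if self._entries and timestamp < self._entries[-1][0]:
            raise ValueError("writes must be recorded in time order")
        self._entries.append((timestamp, block, data))

    def image_at(self, timestamp: int) -> Dict[int, bytes]:
        """Rebuild block contents as they stood at the given time."""
        image: Dict[int, bytes] = {}
        for ts, block, data in self._entries:
            if ts > timestamp:
                break  # ignore everything written after the recovery point
            image[block] = data
        return image


journal = WriteJournal()
journal.record(1, 0, b"good row")
journal.record(2, 1, b"index")
journal.record(3, 0, b"corrupt")  # the write the administrator wants to undo

before_corruption = journal.image_at(2)
latest = journal.image_at(3)
print(before_corruption)
print(latest)
```

Replaying only the writes up to time 2 yields the database as it stood before the bad write, while the journal itself grows with every write, which hints at the extra disk space such products need.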

      However, "cons" include the following:

      • It's relatively expensive;
      • Typically requires database administrator (DBA) expertise;
      • Requires 2x (or more) disk space;
      • Regular archival backups are still required; and
      • The technology is relatively new and unproven.

      This article has outlined four categories of disk-to-disk backup that the Enterprise Storage Group has identified. Of course, not all vendors' products fit neatly into these categories (see vendor list for a compilation of disk-based backup vendors), and new categories will emerge as new technologies and approaches appear. But one thing is certain: There are plenty of options for storage administrators who are looking for alternatives to the traditional methods of backing up data.

      Disk-to-disk backup vendors (partial listing)

      • Alacritus
      • Avail
      • Avamar
      • Connected
      • EMC
      • EVault
      • LiveVault
      • Network Appliance
      • Nexsan
      • Okapi Software
      • Quantum
      • Revivio
      • Sony
      • StorageTek
      • SwapDrive
      • Vyant

      Note: Most of the leading backup software applications can be used in disk-to-disk backup environments (e.g., Computer Associates, Legato, Tivoli, and Veritas).

      For more information about disk-to-disk backup and related technologies, see the following articles that have appeared in

      "Disk-based backup options multiply," December 2002, p. 1
      "Making the case for real-time backup," December 2002, p. 26
      "Disk-to-disk increases backup-and-restore speeds," November 2002, p. 24
      "Is serverless backup your best bet?", November 2002, p. 32
      "Serial ATA activity picks up," September 2002, p. 11
      "Vendor group advocates disk-based backup," July 2002, p. 8

      Steve Kenniston is a senior analyst with the Enterprise Storage Group.

      This article was originally published on February 01, 2003