D2D backup success depends on software

Low-cost arrays are making disk-to-disk (D2D) backup affordable, but don't ignore the software side of the equation.

By Bill Dunmire

New specialized appliances and ATA-based disk arrays at pennies per megabyte deliver terabytes of capacity, excellent reliability, and backup speeds that often outpace tape environments. Yet what has many IT organizations rethinking their backup strategy is the increase in recovery performance that disk drives offer.

Is the road to faster backup and recovery that simple? Not quite. The key lies in your backup software. If it doesn't measure up, neither will the expected benefits. So before you take the plunge, consider the following factors.

Ease of integration

When weighing disk-based backup options, consider the impact each will have on your staff and current systems. Will administrators need to learn new workflow procedures, communicate and enforce new policies and schedules, or face costly network and storage reconfiguration?

How easily you can integrate disk-based backup into your existing storage infrastructure is an important consideration, especially when resources and time are limited. To minimize implementation costs and overhead, invest in software that can be readily deployed and that leverages your existing investments.

Multiplex without penalty

Enterprise-class data-protection applications often provide the ability (called multiplexing) to back up multiple clients in parallel to the same tape device. Client data is interleaved and sent in a single stream to tape, enabling the tape drive to write at maximum speed for faster backup. However, while multiplexing improves backup performance, restoring one or more files requires skipping over other clients' data until all pieces of each file are assembled. Because tape is linear, this "skip-over" process translates into time and money. And the more clients are multiplexed, the greater the impact on recovery time.

Disk also enables you to achieve faster backups by interleaving client data and streaming to the disk device. But unlike tape, disk has random access read/write capabilities. So although data is backed up to disk in an interleaved fashion, it is restored (read) from disk in a contiguous manner at disk speed. As a result, you can gain the benefits of parallelism without paying a penalty in recovery performance—assuming your software supports these capabilities.
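The difference can be illustrated with a small model. The following is a minimal sketch, not any vendor's implementation: it assumes fixed-size blocks, round-robin interleaving, and hypothetical client names, and it simply counts how many blocks a restore touches on a linear medium versus a random-access one.

```python
def multiplex(block_counts):
    """Interleave blocks from several clients round-robin into one stream.

    block_counts maps a (hypothetical) client name to its number of blocks.
    Returns a list of (client, block_number) tuples in write order.
    """
    stream = []
    written = {name: 0 for name in block_counts}
    while any(written[n] < block_counts[n] for n in block_counts):
        for name in block_counts:
            if written[name] < block_counts[name]:
                stream.append((name, written[name]))
                written[name] += 1
    return stream


def linear_blocks_passed(stream, client):
    """Tape-like medium: every block between the client's first and last
    block must pass under the head, including other clients' data."""
    positions = [i for i, (name, _) in enumerate(stream) if name == client]
    return positions[-1] - positions[0] + 1 if positions else 0


def random_access_blocks_read(stream, client):
    """Disk-like medium: only the client's own blocks are read."""
    return sum(1 for name, _ in stream if name == client)
```

With three clients of three blocks each, restoring one client passes over 7 blocks in the linear model but reads only 3 in the random-access model. The gap widens as more clients share the stream.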

Replace, or enhance, tape?

Given the performance advantages that disk offers, is it time to replace your tape environment? That will likely depend on a variety of factors. Disk drives aren't portable, and shipping arrays off-site is not a viable option for disaster-recovery requirements. The key is having the option to choose between replacing tape and making disk your primary backup medium, or enhancing your tape investment by integrating disk-based backup in a disk-to-disk-to-tape (D2D2T) configuration (assuming your software provides this flexibility).

Automatic backup staging and cloning

An increasingly popular approach to leveraging the advantages of both disk and tape is to back up to disk on the front-end and migrate or copy data to tape on the back-end.

In a staging scenario, data is first written to disk and then migrated to tape based on policy thresholds. For example, you may have a migration policy based on a time metric or capacity thresholds. Once it is moved to tape, the data is removed from the primary disk.
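A staging policy of this kind might be sketched as follows. This is an illustration under stated assumptions rather than how any particular product works: the SaveSet record, the 72-hour age limit, and the high-water and low-water marks are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SaveSet:
    """One backup on the staging disk (hypothetical record layout)."""
    name: str
    size_gb: float
    age_hours: float


def select_for_staging(save_sets, disk_capacity_gb, max_age_hours=72.0,
                       high_water=0.8, low_water=0.6):
    """Return the save sets to migrate from disk to tape.

    Time rule: anything at least max_age_hours old is migrated.
    Capacity rule: if disk usage exceeds the high-water fraction, the
    oldest remaining save sets are migrated until usage would fall to
    the low-water fraction or below.
    """
    selected = [s for s in save_sets if s.age_hours >= max_age_hours]
    remaining = [s for s in save_sets if s.age_hours < max_age_hours]
    used_now = sum(s.size_gb for s in save_sets)
    used_after = sum(s.size_gb for s in remaining)

    if used_now > high_water * disk_capacity_gb:
        remaining.sort(key=lambda s: s.age_hours, reverse=True)  # oldest first
        while remaining and used_after > low_water * disk_capacity_gb:
            oldest = remaining.pop(0)
            selected.append(oldest)
            used_after -= oldest.size_gb
    return selected
```

The two thresholds are kept separate so that a capacity-triggered migration frees a useful amount of space instead of running again after the next backup.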

In a cloning scenario, data is first written to disk and then copied to tape. Backup copies therefore reside in two places: on disk for fast restore and on tape for disaster recovery.

While staging and cloning are different types of operations, they both share a performance benefit when clients are backed up in parallel in an interleaved manner. Because of disk's random access, data is subsequently written to tape at tape drive streaming rates in a contiguous (non-multiplexed) format. As outlined earlier, the "skip-over" penalty associated with tape is now eliminated. This approach combines high-speed backup and high-speed recovery from tape.
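As a rough sketch of that regrouping step, the function below takes an interleaved stream and returns the same blocks ordered contiguously per client, which is the order in which they would be written to tape. The (client, block_number) tuple layout is an assumption made for illustration only.

```python
def demultiplex(stream):
    """Regroup an interleaved stream into contiguous per-client runs.

    stream is a list of (client, block_number) tuples in the order they
    were written to disk. Clients keep their first-seen order, and each
    client's blocks keep their original order.
    """
    by_client = {}
    for client, block in stream:
        by_client.setdefault(client, []).append(block)

    contiguous = []
    for client, blocks in by_client.items():
        contiguous.extend((client, block) for block in blocks)
    return contiguous
```

For example, the interleaved stream a0, b0, a1, b1 becomes a0, a1, b0, b1, so a later restore from tape reads one unbroken run per client.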

One-step restore

If you stage or clone backups from disk to tape, the question remains: How will you perform a recovery? The answer depends on how your backup application copies data to tape.

Software that essentially performs a separate backup of a disk array to a tape device may have drawbacks. Say you've copied 100GB of data to tape, removed the data from disk, and a user needs to restore a single file. If your application performed a separate backup to tape, you will first need to restore the entire 100GB to disk before you can restore the client file. This two-step recovery process is painful and time-consuming, even for simple recoveries.

Make sure your backup software can seamlessly manage and track backup copies wherever they reside—disk, tape, or both—so that data can be restored directly to clients in a single step. This will avoid unnecessary overhead and costly downtime.
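One way to picture this kind of tracking is a catalog that records every copy of a save set and picks the best source at restore time. The sketch below is hypothetical: the Catalog class, its method names, and the volume labels are invented for illustration and do not describe any specific product.

```python
class Catalog:
    """Tracks where each backup copy lives (illustrative only)."""

    PREFERENCE = ("disk", "tape")  # fastest medium first

    def __init__(self):
        self._copies = {}  # save set name -> {medium: location}

    def record(self, save_set, medium, location):
        """Register a copy, for example after a backup or a clone."""
        self._copies.setdefault(save_set, {})[medium] = location

    def remove(self, save_set, medium):
        """Forget a copy, for example after staging deletes it from disk."""
        self._copies.get(save_set, {}).pop(medium, None)

    def locate(self, save_set):
        """Return (medium, location) of the preferred copy for a restore."""
        copies = self._copies.get(save_set, {})
        for medium in self.PREFERENCE:
            if medium in copies:
                return medium, copies[medium]
        raise KeyError(f"no copy of {save_set!r} in catalog")
```

Because the catalog knows about the tape copy directly, a restore after staging reads from tape straight to the client instead of first rebuilding the disk image.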

Treat disk like disk

Finally, does your software treat disk like disk, or like tape? If the backup application writes data to disk in a tape format, unnecessary tape header, file mark, and positioning information will consume time and disk capacity. Ideally, the backup application writes data to disk in a raw format, and in the scenarios illustrated above moves or copies data to tape in a tape format.

Treating "disk like disk" can be particularly beneficial when performing multiple data-protection operations. If your application is tied to a linear tape framework, you can either write data to disk or read data from disk, but not both at the same time. When your application has a true disk framework, however, you can fully leverage disk's simultaneous read/write capabilities. This will allow you to recover data from disk devices engaged in backups, or stage or clone data to tape while backups are underway. As a result, you can free your environment and staff from countless hours of operational overhead.

Disk is an increasingly attractive solution to achieve backup-and-recovery time objectives. Your level of success, however, will depend to a degree on your data-protection software.

Bill Dunmire is director of information protection product marketing at Legato Systems (www.legato.com) in Mountain View, CA.

This article was originally published on March 01, 2004