New twists on disk-vs.-tape backup debate

Backup and restore should be based on a storage hierarchy that meets end-user requirements and should not be dictated by software limitations.

By Randy Thorburn

In just about every storage-oriented publication you read these days, there is an article that discusses the virtues of either disk or tape technology as a backup medium.

The same arguments are repeated over and over. Disk is faster for writing and reading data but has space, cost, and long-term viability issues. Tape is a better long-term storage medium but is hampered by moderate performance when backing up over the network.

The general consensus has been to use both. This seems viable until you consider that existing backup software is limited in its ability to use hard disk and tape seamlessly in a single storage environment.

Some backup software products write to hard disk using a virtualized tape format, which makes it easy to move protected data to tape once it is on disk. However, you often lose the random-access advantages of disk, and with them most of the performance benefits. Other products write to hard disk in a native file system format but use a simple tar or copy command to migrate backed-up data to tape for long-term safekeeping. Although this works, it can create security issues. It also makes restores a two-step process: the data must first be retrieved from tape to the disk cache and then copied to the restore location.

Other backup products run the same backup twice. The first backup session writes to hard disk, and a second session, covering the same data, sends it to tape. The overhead associated with this is obvious.


Data protection doesn't have to be so constrained. Configurations for data protection should be designed around performance and the longevity requirements of the data being backed up, rather than the limitations of the software managing the environment. It's time for a new way of thinking about the disk-vs.-tape dilemma—one that makes data protection work for the user, instead of the other way around.

For decades, technology has existed that allows software to manage data-protection configurations in a manner flexible enough to meet the needs of any environment. The Distributed Computing Environment (DCE) was originally designed in the 1980s.

One component of DCE is a location broker agent, which lets clients and servers in a client/server environment connect to each other dynamically. DCE has evolved to be a common element in almost every operating system available today. The location broker service is commonly used in printer sharing and other software where several resources may change their status frequently.

Ideal data-protection configuration

An ideal data-protection configuration will have several storage resources available, some chosen for access performance and some for storage longevity.

Software that uses the location broker agent of DCE lets backup clients choose the storage resource best suited for their data, based on rules applied to the backup set configuration.

Critical data encountered by the backup system is sent to a storage resource with high access performance, while non-critical data can be sent to a storage resource with lower access performance.
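The selection logic described above can be sketched in a few lines of Python. This is only an illustration, not DCE code: the LocationBroker class is a toy in-memory registry standing in for a real location broker service, and the resource names, performance scores, and RULES table are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StorageResource:
    name: str
    medium: str        # "disk" or "tape"
    performance: int   # relative access performance; higher is faster
    online: bool = True


class LocationBroker:
    """Toy in-memory registry; stands in for a real location broker service."""

    def __init__(self):
        self._resources = {}

    def register(self, resource):
        self._resources[resource.name] = resource

    def set_online(self, name, online):
        self._resources[name].online = online

    def lookup(self, min_performance=0):
        """Return online resources that meet the floor, fastest first."""
        matches = [r for r in self._resources.values()
                   if r.online and r.performance >= min_performance]
        return sorted(matches, key=lambda r: r.performance, reverse=True)


# Hypothetical rule table: data class -> minimum acceptable performance.
RULES = {"critical": 8, "non-critical": 0}


def choose_resource(broker, data_class):
    """Pick a storage resource for a backup set at backup time."""
    candidates = broker.lookup(min_performance=RULES[data_class])
    if not candidates:
        raise LookupError(f"no storage resource available for {data_class} data")
    # Critical data takes the fastest match; non-critical data takes the
    # slowest match so that fast resources stay free for critical backups.
    return candidates[0] if data_class == "critical" else candidates[-1]
```

Because the lookup happens at backup time, a client is redirected automatically when a resource goes offline: if the fastest disk is marked offline with set_online, the next call to choose_resource returns the next-fastest resource that still satisfies the rule.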

Another concept is to equip the storage servers with backup client capabilities. Backup clients locate data eligible to be backed up based on rules and schedules and send it to storage for safekeeping.

Applying these same principles to a storage server enables it to migrate and/or replicate protected data to other storage devices in the environment while maintaining full management of the data at the file level.

For example, a client backing up critical data will locate and send its data to a hard disk storage resource to get it protected as quickly as possible. A client-like agent on the storage server then locates all data that arrives on its storage device and moves or copies it to whatever tape-based storage resource is available. Now the data is protected on two storage devices. Add data-retention policies that leave backup data on the disk-based storage resource for three days and on the tape-based storage resource for a month, and recent backups can be restored quickly from disk while older ones remain available on tape.
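A minimal sketch of this disk-to-tape flow, under stated assumptions, might look as follows. The StorageServer class and stage_to_tape function are hypothetical names, a dictionary stands in for each server's file-level catalog, "a month" is taken as 30 days, and retention is counted from the time of the original backup.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Retention periods from the example above.
DISK_RETENTION = timedelta(days=3)
TAPE_RETENTION = timedelta(days=30)  # assumption: "a month" taken as 30 days


@dataclass
class StorageServer:
    retention: timedelta
    catalog: dict = field(default_factory=dict)  # file path -> backup time

    def store(self, path, backed_up_at):
        self.catalog[path] = backed_up_at

    def expire(self, now):
        """Drop entries older than the retention period; return their paths."""
        expired = [p for p, t in self.catalog.items()
                   if now - t >= self.retention]
        for p in expired:
            del self.catalog[p]
        return expired


def stage_to_tape(disk, tape):
    """Client-like agent: copy files held on disk but not yet on tape."""
    copied = [p for p in disk.catalog if p not in tape.catalog]
    for p in copied:
        # Keep the original backup time so tape retention starts at backup.
        tape.store(p, disk.catalog[p])
    return copied
```

Each file is tracked individually in both catalogs, so a restore can go straight to whichever resource still holds the file rather than unpacking an archive first.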

Randy Thorburn is vice president of marketing for Avail Solutions (www.availsolutions.com) in San Diego.

This article was originally published on December 01, 2003