Making the case for real-time backup

Potential benefits include increased data protection, faster backup/restore times, and elimination of the backup window.

By Steve Coppock

Many different factors are compelling storage administrators to re-evaluate traditional backup software architectures. The distribution of users and customers across wide-ranging time zones now requires that applications and data be available 24x7, making backup windows of any significant duration unacceptable. Most companies need up-to-the-minute data protection, without sacrificing high availability.

When backup software was initially designed, technology and cost limitations hindered the achievement of these goals. Today, however, backup software should provide fast, transparent data protection that eliminates the need for additional network traffic and enables immediate restoration of data at any level (from a single file to complete bare-metal disaster recovery).


New backup technology should also integrate seamlessly into existing backup infrastructures with minimal additional hardware requirements.

Some of the more recent approaches to backup are based on real-time versioned architectures that can reliably handle open files and complex application data. Data-compression techniques are used to extend the life of existing tape subsystems. In addition, disk-based storage hardware prices have hit an all-time low, enabling cost-effective disk-to-disk backup.

Real-time backup protects data as changes occur, eliminating the potential for data loss between scheduled backups. Enterprises can do away with the backup window altogether, and up-to-the-moment backups let administrators easily restore any version of any file, for any user, at any time. Real-time backup can also be integrated into existing schedule-based backup infrastructures.

Problems with schedule-based backup

The most common data-protection strategy is to perform backups on a scheduled, perhaps daily, basis. Any data created or changed between scheduled backups is unprotected.

Also, because of the growing amount of data in enterprises, the practice of confining backup times to "off-hours" has become increasingly difficult.

Servers that support 24x7 operations effectively have no "off-hours," which means that backups must be performed while the system is in use, potentially causing performance degradation or downtime.

In addition, each server in the enterprise must be backed up individually, a costly and inefficient use of both hardware and personnel. Traditional backup procedures back up to, and recover from, tape, which is slower than moving data to and from disk-based systems. Disk storage is now priced almost on par with tape, yet it delivers significantly better performance.

Real-time versioned protection

As organizations' requirements for high availability increase, their tolerance for downtime decreases. These factors, coupled with falling disk subsystem prices, are the catalyst for technologies such as

  • Versioning, which allows storage of multiple versions of data so that fine-grain, point-in-time recovery can be achieved back to a known "good" state prior to data loss;
  • Real-time backup, which enables data to be backed up as changes are made, rather than on a scheduled basis;
  • Compression, which minimizes network traffic and storage capacity requirements; and
  • Disk-to-disk backup, which leverages the speed and low costs of disk arrays and significantly speeds recovery time.

By combining these data-protection techniques, real-time versioned backup provides up-to-the-minute data protection and fast restoration of the most recent data after corruption or loss.

Unlike scheduled-backup systems, real-time backup tracks changes as they occur. It also provides a configurable versioning scheme that lets administrators balance the granularity of protection against cost requirements.

One of the key factors that enables real-time backup is the combination of mirroring and versioning. In this approach, files are first continuously monitored for changes and then automatically and immediately mirrored to a server. Mirroring is augmented with versioning capabilities that maintain an audit trail of all changes to protected files, so users can easily roll back to previous file versions if the current versions are damaged.
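The combination of mirroring and versioning can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `VersionedMirror` class, its method names, and the in-memory dictionary standing in for the backup server are hypothetical and do not describe any particular product. Each detected change is mirrored immediately and appended to a per-file history, so an earlier version can be retrieved if the current one is damaged.

```python
import time
from dataclasses import dataclass, field


@dataclass
class VersionedMirror:
    """Hypothetical in-memory stand-in for the backup server."""

    # path -> list of (timestamp, contents), oldest first
    history: dict = field(default_factory=dict)

    def on_change(self, path: str, contents: bytes) -> None:
        """Mirror a changed file and record it as a new version."""
        versions = self.history.setdefault(path, [])
        # Ignore writes that did not actually change the contents.
        if versions and versions[-1][1] == contents:
            return
        versions.append((time.time(), contents))

    def current(self, path: str) -> bytes:
        """Return the most recently mirrored contents."""
        return self.history[path][-1][1]

    def rollback(self, path: str, steps_back: int = 1) -> bytes:
        """Return the contents as they were `steps_back` versions ago."""
        versions = self.history[path]
        if steps_back >= len(versions):
            raise ValueError("no version that old")
        return versions[-1 - steps_back][1]


mirror = VersionedMirror()
mirror.on_change("report.doc", b"draft 1")
mirror.on_change("report.doc", b"draft 2")
mirror.on_change("report.doc", b"corrupted!")

# The current mirror holds the damaged file; the audit trail still has draft 2.
restored = mirror.rollback("report.doc", steps_back=1)
```

A real implementation would be driven by file-system change notifications rather than explicit calls, and would persist the history to disk on a separate server.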

Real-time backup is tightly integrated with the server file system, so a file is protected each time an I/O operation that changes its contents (such as a save or database commit) occurs. This means that difficult-to-protect data such as database and e-mail files can be protected in real time, even if the files remain open for prolonged periods.

Performance issues

For real-time protection to be effective, high performance is key. The following functions minimize the impact on network performance and storage requirements:

  • Incremental, block-level backup—To protect large files with minimum impact on network load, real-time backup backs up only those blocks (or "zones") that have changed;
  • Continuous, rather than scheduled, protection—By mirroring and versioning data on a continuous basis, real-time backup levels network loading; and
  • Client-side caching—All data flow from client to server is buffered on the client. This guarantees that operation of the client is not adversely impacted by network or server load. If the server or network is momentarily congested, the client software holds data in the local cache until server/network capacity is available.
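Two of the functions above, block-level change detection and client-side caching, can be sketched briefly. This is an illustration under stated assumptions, not a description of how any product works: the 4 KB block size, the use of SHA-256 to compare blocks, and the `changed_blocks` and `ClientCache` names are all hypothetical. The sketch also ignores file truncation; a real client would send the new file length along with the changed blocks.

```python
import hashlib
from collections import deque

BLOCK_SIZE = 4096  # assumed block ("zone") size


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Hash each fixed-size block of the file."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map block index -> new block bytes for every block that differs."""
    old_hashes = block_hashes(old, block_size)
    changes = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changes[index] = block
    return changes


class ClientCache:
    """Hypothetical client-side buffer: holds changes until the server accepts them."""

    def __init__(self, send):
        self.send = send        # callable returning True when delivery succeeds
        self.pending = deque()

    def submit(self, item) -> None:
        self.pending.append(item)
        self.flush()

    def flush(self) -> None:
        # Deliver in order; stop at the first failure and retry later.
        while self.pending and self.send(self.pending[0]):
            self.pending.popleft()


old = bytes(4 * BLOCK_SIZE)      # a 16 KB file of zeros
new = bytearray(old)
new[BLOCK_SIZE + 10] = 0xFF      # change one byte inside block 1
delta = changed_blocks(old, bytes(new))   # only block 1 is reported

link = {"up": False}             # simulated network state
received = []


def send(item) -> bool:
    if not link["up"]:
        return False
    received.append(item)
    return True


cache = ClientCache(send)
cache.submit(delta)              # network down: the change stays in the local cache
link["up"] = True
cache.flush()                    # network back: the change is delivered
```

In this example a one-byte edit to a 16 KB file results in a single 4 KB block being queued, and the client keeps running while the link is unavailable.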

Real-time backup can be complementary to traditional schedule-based backup systems. When deployed between the server farm and the archive server, a server running real-time data-protection software will enable enterprises to compress the data to be backed up and eliminate duplicate files. This reduces the total volume of data to back up, shrinking backup times and extending the life of current storage resources.
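The effect of compression and duplicate-file elimination can be shown with a small sketch. The approach here, identifying duplicates by a SHA-256 content hash and compressing with zlib, is an assumption chosen for illustration; the article does not specify which methods a real-time backup server uses, and the `consolidate` and `restore` functions and the file names are hypothetical.

```python
import hashlib
import zlib


def consolidate(files: dict) -> tuple:
    """Return (store, catalog).

    store maps content hash -> compressed bytes (one copy per unique file);
    catalog maps each original path -> content hash.
    """
    store, catalog = {}, {}
    for path, contents in files.items():
        digest = hashlib.sha256(contents).hexdigest()
        if digest not in store:          # keep only the first copy of duplicates
            store[digest] = zlib.compress(contents)
        catalog[path] = digest
    return store, catalog


def restore(path: str, store: dict, catalog: dict) -> bytes:
    """Recover the original contents of one file."""
    return zlib.decompress(store[catalog[path]])


files = {
    "srv1/policy.txt": b"standard policy text " * 200,
    "srv2/policy.txt": b"standard policy text " * 200,   # duplicate of srv1 copy
    "srv3/notes.txt": b"meeting notes " * 100,
}
store, catalog = consolidate(files)

raw_bytes = sum(len(c) for c in files.values())
stored_bytes = sum(len(c) for c in store.values())
print(f"{len(files)} files, {len(store)} unique; {raw_bytes} bytes reduced to {stored_bytes}")
```

Three files from three servers collapse to two stored objects, and only those two compressed objects would need to go on to the archive server or tape.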

Typically, one real-time backup server can protect up to six other servers. This means that fewer servers (and less data) must be backed up to tape, helping to control growing enterprise data management requirements and providing tighter data security. Data consolidation onto fewer servers also reduces administrative overhead and allows centralized administration.

Data vulnerability

By capturing changes as they happen, real-time backup also eliminates data vulnerability between scheduled backups, which means that protected data is always current and can be recovered up to the minute a loss occurs. Unlike simple mirroring, real-time versioned technology captures every version of every file and stores just the changes, so that any version can be recovered. Another benefit is that real-time backup restores data from disk, not from tape.
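Storing just the changes while still being able to recover any version can be sketched as follows. This is a simplified illustration, not the method of any specific product: the `DeltaHistory` class is hypothetical, and the 4-byte block size is deliberately tiny so the example is easy to follow. Each commit records only the blocks that differ from the previous version, plus the new file length, and any version is rebuilt by replaying those changes over the initial copy.

```python
BLOCK = 4  # tiny assumed block size, for readability only


class DeltaHistory:
    """Hypothetical version store that keeps an initial copy plus per-version changes."""

    def __init__(self, initial: bytes):
        self.base = initial      # version 0, stored in full
        self.latest = initial    # current mirror copy, used to detect changes
        self.deltas = []         # one (new_length, {block_index: bytes}) per commit

    def commit(self, new: bytes) -> None:
        """Record only the blocks that differ from the latest version."""
        changes = {}
        for offset in range(0, len(new), BLOCK):
            block = new[offset:offset + BLOCK]
            if self.latest[offset:offset + BLOCK] != block:
                changes[offset // BLOCK] = block
        self.deltas.append((len(new), changes))
        self.latest = new

    def version(self, n: int) -> bytes:
        """Rebuild version n (0 is the initial copy) by replaying n deltas."""
        data = bytearray(self.base)
        for length, changes in self.deltas[:n]:
            if len(data) < length:                  # file grew: pad before patching
                data.extend(bytes(length - len(data)))
            del data[length:]                       # file shrank: drop the tail
            for index, block in changes.items():
                start = index * BLOCK
                data[start:start + len(block)] = block
        return bytes(data)


history = DeltaHistory(b"AAAABBBBCCCC")
history.commit(b"AAAAXXXXCCCC")       # block 1 changes
history.commit(b"AAAAXXXXCCCCDD")     # file grows by a partial block

first = history.version(0)
second = history.version(1)
third = history.version(2)
```

The trade-off in this design is that restoring a late version means replaying every earlier delta; an implementation could take periodic full copies to bound that cost.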

Adding a real-time backup server to an existing data-protection infrastructure allows IT organizations to control their growing data management requirements by consolidating servers, compressing data, and eliminating duplicate files. When combined with administration software, real-time data-protection architectures increase efficiency and reduce administration overhead, extending the life of existing data-protection infrastructures.

Steve Coppock is executive director of product management at Storactive (www.storactive.com) in Marina del Rey, CA. He can be reached at scoppock@storactive.com.

This article was originally published on December 01, 2002