Users still need to use "data typing" to determine which applications are best-suited to disk-to-disk backup and recovery.
By Frank J. Sowin
Fast backup and recovery of data has been facilitated by two recent developments: disk-to-disk backup/recovery and "data typing," in which applications and data types are prioritized by importance to improve business continuance. These trends address pain points associated with managing information protection strategies and help achieve improved business continuance through effective execution of service-level commitments.
Disk-to-tape backup/recovery has been the dominant method of data protection for a number of reasons, including low-cost media and ease of moving tapes off-site for disaster protection and archiving. At most companies, backup is performed on a scheduled (daily or weekly) basis using full or incremental backups. Most restore operations are performed on a "best-effort" basis, which is becoming inadequate for critical applications where potential data loss affects profitability or customer data.
Compared to tape-based backup, disk-to-disk backup can offer a number of advantages, including:
- Better performance (faster backup and restore);
- Enhanced data integrity;
- Improved reliability; and
- Better price/performance ratios.
With tape-based backup, completing daily backups within the designated backup window is a substantial management challenge. Recovering data from tape also carries risks to business continuance, because media must be managed manually and tape drives, libraries, or media can fail.
While backup consumes a significant amount of time and resources, it is the most critical step in data protection. Backup protects an enterprise in the event of data loss, corruption, or disaster. Tape is still the leader in terms of cost and value and is preferred for remote-site backup protection and disaster recovery. However, end-user attitudes are changing, in part because fast backup is critical to improve business continuance.
Demand for business continuity and the requirement for increased data/application availability are fueling the need for a strategy that accounts for shrinking backup windows and continuous uptime, all in the face of scarce IT resources.
Disk-to-disk backup is now a viable enhancement, or alternative, to disk-to-tape operations due in part to the emergence of disk arrays that rival the prices of tape libraries, as well as the performance advantages of disk drives over tape. After data is backed up to disk, further protection steps can be taken, including cloning or data staging from the disk to tape.
According to Dave Kenyon, product line manager for the Storage Solutions Group at Quantum (which makes both disk-to-disk and tape devices), "We expect users who use disk-to-disk backup to see a 2X to 3X [performance] improvement in backup-and-restore operations compared to traditional [tape] methods. A key advantage of disk-to-disk solutions is that users can restore more quickly."
The performance improvements of disk-to-disk operations are related to the faster speeds of ATA drives (which are used in most disk-based backup devices). Also, disk-to-disk solutions do not employ tape interleaving, which optimizes tape write speed but significantly degrades performance during the restore (read) process. The restore risks and dependencies associated with tape are eliminated with disk-to-disk approaches.
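The restore penalty of tape interleaving can be illustrated with a small model. The sketch below is a simplification, not any vendor's implementation: the stream names and block labels are invented, and the tape is treated as a plain list that must be read from start to finish.

```python
from itertools import zip_longest

def interleave(streams):
    """Round-robin blocks from several client streams onto one 'tape'."""
    tape = []
    for group in zip_longest(*streams.values()):
        for client, block in zip(streams.keys(), group):
            if block is not None:
                tape.append((client, block))
    return tape

def restore(tape, client):
    """Return one client's blocks and the number of tape blocks read to get them."""
    blocks = []
    blocks_read = 0
    for owner, block in tape:
        blocks_read += 1          # every block passes under the head
        if owner == client:
            blocks.append(block)  # only this client's blocks are kept
    return blocks, blocks_read

# Hypothetical backup streams from three clients.
streams = {
    "mail": ["m0", "m1", "m2"],
    "db": ["d0", "d1", "d2"],
    "files": ["f0", "f1", "f2"],
}
tape = interleave(streams)
data, blocks_read = restore(tape, "db")
print(data, blocks_read)
```

In this model, restoring the three "db" blocks means reading all nine blocks on the tape. Writing stays fast because the drive is kept busy by several streams, but a single-client restore pays for everyone else's data.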
Declining disk drive costs and improved speed are key motivations for end-user adoption of disk-to-disk backup. Related benefits include improved end-user response times (due to faster restore rates) and decreased demand on IT staff resources because disk arrays do not require operator intervention. The improved reliability of disk drives and the fact that disk-to-disk backup makes it easier to do more frequent backups (which increases data protection) are also important.
Snapshot backup to disk
A snapshot provides a point-in-time image of data. Because a snapshot can be taken without interrupting the business, it is not constrained by the backup window. Except in cases of data corruption, more-frequent backups reduce the "maximum loss of data," that is, the data written since the most recent backup.
There are a number of ways that a snapshot copy can be taken, depending on whether it is managed from a server, appliance, or other snapshot-enabled devices. Snapshots increase data protection with more frequent backups at user-defined intervals. A snapshot allows the business to continue to function while servers, messaging systems, databases, or storage networks are in full production because an image of the data is taken independent of normal production operations.
Combined with snapshot capability, a disk-to-disk approach can be very powerful. This approach reduces business risks because backups are taken more than once a day. For example, the figure illustrates "time slicing" with different types of backup at varying frequencies. In Level 1 (the baseline), the backup begins at 12:00 Midnight and completes during off-peak hours as data is moved from servers, clients, and disk arrays to a tape library.
In the "baseline" once-per-day backup scenario, the maximum loss of data is 24 hours' worth. For many critical applications, a once-a-night backup is not frequent enough to meet service-level requirements.
In the figure, "backup time slicing" at Level 2 shows that the speed of a snapshot allows the backup to complete quickly (in this case, in the middle of the night). Level 3 shows fast backups taken twice in a 24-hour period, which reduces the maximum loss of data to 12 hours. Level 4 shows snapshot backups taken every 8 hours, which limits the maximum loss of data to 8 hours. More-frequent backup improves service levels because a shorter interval's worth of data is at risk. For recovery, snapshot backups can be applied incrementally to the point-in-time image of the last full backup.
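The arithmetic behind the time-slicing levels is simple: with evenly spaced backups, the worst-case loss is the interval between them. The following sketch assumes evenly spaced backups and treats Level 2 as a single nightly snapshot, which is one reading of the description above. The function name is illustrative.

```python
def max_data_loss_hours(backups_per_day):
    """Worst-case hours of data at risk with evenly spaced backups."""
    if backups_per_day < 1:
        raise ValueError("need at least one backup per day")
    return 24 / backups_per_day

# Backups per day for each level, as read from the article's description.
levels = {
    "Level 1 (nightly backup to tape)": 1,
    "Level 2 (nightly snapshot)": 1,
    "Level 3 (two snapshots per day)": 2,
    "Level 4 (snapshot every 8 hours)": 3,
}

for name, per_day in levels.items():
    hours = max_data_loss_hours(per_day)
    print(f"{name}: up to {hours:g} hours of data at risk")
```

Levels 1 and 2 both leave up to 24 hours of data exposed. The gain at Level 2 is a shorter backup, and that shorter backup is what makes the more-frequent schedules at Levels 3 and 4 practical.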
Use of snapshots to achieve high levels of data protection is not new. However, particularly in the context of disk-based backup, users have to determine which applications—or data types—are the best candidates for disk-to-disk backup/restore (see "Data-type valuation," p. 30).
Disk-to-disk backup helps restore data faster while increasing application availability. Recently, several approaches to disk-based backup have emerged. The approaches can be characterized as tape emulation, traditional backup software with enhancements, or network-attached storage (NAS) solutions (see table).
Alacritus, for example, takes a tape emulation approach. Its Securitus I software runs on a virtual tape library controller and presents disk arrays as a virtual tape library, or libraries, to the backup server. This approach leverages existing backup software and enhances the functionality.
The approach taken by backup software vendors such as Veritas includes disk-to-disk functionality integrated (at no additional charge) with existing software. The only change required is for administrators to modify the normal backup operating process to write data to a file located on the drive array.
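In generic terms, "writing data to a file located on the drive array" amounts to pointing the backup job at a directory on disk rather than at a tape device. The sketch below shows that idea with Python's standard tarfile module. It is not how Veritas or any other vendor's product works, and the directory names are placeholders created in a temporary folder so the example can run anywhere.

```python
import tarfile
import tempfile
from pathlib import Path

def backup_to_disk(source_dir, target_dir, name="backup.tar.gz"):
    """Archive source_dir into a single file under target_dir; return its path."""
    target = Path(target_dir) / name
    with tarfile.open(target, "w:gz") as archive:
        # Store paths relative to the source directory's own name.
        archive.add(source_dir, arcname=Path(source_dir).name)
    return target

with tempfile.TemporaryDirectory() as work:
    source = Path(work) / "orders"            # stands in for production data
    array_mount = Path(work) / "disk_array"   # stands in for the array's mount point
    source.mkdir()
    array_mount.mkdir()
    (source / "orders.db").write_text("sample data")

    archive_path = backup_to_disk(source, array_mount)

    # List the archive's contents to confirm what was backed up.
    with tarfile.open(archive_path, "r:gz") as archive:
        archived_names = archive.getnames()

print(archived_names)
```

Because the backup target is an ordinary file, a restore can open it directly, with no tape to locate, mount, or position.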
NAS approaches couple primary storage with integrated or optional backup server software. The systems connect to the network via Ethernet on the front-end or via a Fibre Channel SAN. They may connect to a tape library on the back-end via Fibre Channel if the data management includes staging. The array typically includes ATA disk drives, redundant controllers, software for virtualization and management, and varying levels of availability features or internal redundancy.
For backup-and-restore operations, the key advantage of integrated backup software is that it can be faster because data is moved only once from the disk to the NAS or disk array device, and network traffic is reduced. Network Appliance's NearStore software uses incremental block transfers to back up data to the NAS drives.
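The general idea of an incremental block transfer is to send only the blocks that changed since the previous backup. The sketch below illustrates the idea and does not describe NearStore's actual mechanism: the block size, the use of SHA-256 for comparison, and the function names are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow

def split_blocks(data, size=BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def changed_blocks(previous, current, size=BLOCK_SIZE):
    """Return {index: block} for blocks of `current` that differ from `previous`."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(previous, size)]
    delta = {}
    for index, block in enumerate(split_blocks(current, size)):
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta

def apply_delta(previous, delta, new_length, size=BLOCK_SIZE):
    """Rebuild the current data on the backup side from the old copy plus the delta."""
    blocks = split_blocks(previous, size)
    for index, block in delta.items():
        if index < len(blocks):
            blocks[index] = block   # overwrite a changed block
        else:
            blocks.append(block)    # append a block that did not exist before
    return b"".join(blocks)[:new_length]

previous = b"AAAABBBBCCCCDDDD"      # copy already held on the backup device
current = b"AAAAXXXXCCCCDDDDEE"     # production data at the next backup

delta = changed_blocks(previous, current)
rebuilt = apply_delta(previous, delta, len(current))
print(sorted(delta), rebuilt == current)
```

Here only two of the five blocks cross the network (the changed block 1 and the new block 4), yet the backup copy ends up identical to the production data.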
Between 2000 and 2001, ATA drive prices dropped more than 43%, according to International Data Corp., making this approach more viable. And the price declines are continuing. In 2001, the average price of ATA drive arrays was $18/GB, and more recent products are in the $15/GB to $17/GB range.
Systems that leverage low-cost ATA technology will increase in popularity as drive prices continue to decline. Also, as new IP interface technologies such as iSCSI come to market, they will provide alternatives to Fibre Channel storage networks.
While a pure disk-to-disk backup solution cannot completely replace the overall functionality of a well-managed disk-to-tape operation, it is worthy of serious consideration, especially for applications and data types viewed as critical to business continuance.
Frank Sowin is the founder of SOWINResources, a consulting firm that provides technology advisory services to IT managers, as well as marketing and business development consulting (www.sowinresources.com).
Data-type valuation
Behind the "data-valuing" approach is a key assumption: Not all data is created equal, and not all data is equally important in terms of recovery value to the business. For example, at an energy company, high-priority applications/data might include e-mail systems, seismic field data, and data for financial management and reporting. Prioritization of data value can yield improvements in business continuance.
Consider a telecommunications company that provides calling services and must keep track of customer calls for billing purposes. Its call records are extremely valuable: if they are not protected, any loss of billing data translates directly into lost revenue.
At such a company, e-mail messages generally do not carry the same level of importance as call records. Hence, where data typing is applied, e-mail falls lower on the list of candidates for disk-to-disk data management.
Every company's requirements vary, but data-type ranking can be used to determine which needs would be best served by disk-to-disk backup. Of course, a number of other factors (e.g., direct cost of data loss, frequency of data changes, and alternative protection approaches such as snapshot backup) should be considered in prioritizing data types.
A ranking approach can be used to determine the relative valuation of applications or data. Below is an example of how one organization might prioritize applications for a phased deployment of disk-to-disk backup/recovery.
The key question is: In the event of a required recovery process, what data is most important to restore first, in order to get those systems most critical to customers and business operations back online?
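One way to make such a ranking concrete is a weighted score built from the factors mentioned above: direct cost of data loss, frequency of data changes, and the availability of alternative protection. The sketch below is only an illustration. The weights, the 1-to-5 scales, and the scores assigned to each data type are invented, and a real organization would substitute its own.

```python
from dataclasses import dataclass

@dataclass
class DataType:
    name: str
    loss_cost: int         # 1 (low) to 5 (high): direct cost of losing the data
    change_rate: int       # 1 (rarely changes) to 5 (changes constantly)
    other_protection: int  # 1 (none) to 5 (well covered by other means)

# Hypothetical weights: cost of loss counts most, existing protection least.
WEIGHTS = {"loss_cost": 5, "change_rate": 3, "other_protection": 2}

def priority(item):
    """Higher score means a stronger candidate for disk-to-disk backup."""
    return (
        WEIGHTS["loss_cost"] * item.loss_cost
        + WEIGHTS["change_rate"] * item.change_rate
        # Data that is already well protected elsewhere scores lower here.
        + WEIGHTS["other_protection"] * (6 - item.other_protection)
    )

# Invented scores for a telecommunications provider.
candidates = [
    DataType("e-mail", loss_cost=3, change_rate=4, other_protection=4),
    DataType("call records", loss_cost=5, change_rate=5, other_protection=2),
    DataType("financial reporting", loss_cost=4, change_rate=3, other_protection=3),
]

ranking = sorted(candidates, key=priority, reverse=True)
for item in ranking:
    print(f"{item.name}: {priority(item)}")
```

With these made-up inputs, call records rank first and e-mail last, matching the telecommunications example. Changing the weights or scores changes the order, which is the point: the ranking is specific to each business.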