Falling prices, near-immediate data access, and data replication technologies may make disk an interesting alternative to tape for backing up mission-critical data.
BY HEIDI BIGGAR
Tape is dead. It's a story so often told that it barely raises an eyebrow in the industry anymore. But the reality is that for certain applications and in certain environments, tape is slowly being replaced by disk.
"We're already seeing that the price of disk storage is allowing companies to replicate mission-critical data to disk instead of backing up to tape," says Tom Cox, director of development with Articulent, a storage management service provider in Hopkinton, MA. "Will this have an effect on backup applications over the next three years? Absolutely."
But the effect will be gradual. "[The premise] becomes intriguing five years out," says Bob Amatruda, a senior research analyst with International Data Corp. "It's much easier to restore from disk than it is from tape."
This is especially true with large databases, where it can take hours, if not weeks, to restore files from tape, or with e-business and other OLTP applications that require 24x7 uptime (see table).
Perhaps no one has been more vocal on the topic of tape versus disk-based backup than Mike Ruettgers, former EMC president and CEO. In fact, at Dataquest's StorageTrack conference in 1998, Ruettgers predicted the tape library market would shrink to 1/10th its size by 2001.
Figure 1: Recent tape drive developments are aimed at improving capacity and access.
Though his prediction was wrong (the market has actually grown 15% annually), it did portend a shift in the industry toward disk-based backup in some traditional tape markets.
"The nature of tape is changing," says Gary Francis, vice president of corporate planning at StorageTek. "I would certainly agree with EMC on that, but the notion that tape is going away is quite different." Francis says there is a place for both active and archival data on tape, depending of how active the data is.
Studies suggest that within 10 to 14 days of first writing data to disk, the active nature of data starts to decrease, explains Francis. "So while you may want to keep data on disk for that period to take advantage of disk's performance and random-access capabilities, at some point it starts to make much more economical sense to archive that data out to tape."
"At some point, making snapshots of data to disk becomes cost-prohibitive, and there's no good way of putting something off-site that's on disk," says Britt Terry, director of marketing management at Spectra Logic. While it is possible to mirror or replicate data over a wide area network (WAN) to off-site disk, it often doesn't make economical sense, explains Terry. "If I have to make 50 copies of a 10TB database over the next year, I can't be buying 500TB of disk to keep that data on," he says.
It all boils down to the value you place on your data and the value you place on being able to retrieve that data quickly, says StorageTek's Francis.
Figure 2: Tape prices typically decline 5% to 15% per year.
"We agree that there is certain data that needs to be replicated to disk, and that needs to be brought back right away," says Steve Whitner, director of marketing at Advanced Digital Information Corp. (ADIC). "Our disagreement is that it doesn't apply to every piece of data."
Although pricing varies widely, disk generally costs $0.15 to $0.20 per MB, versus $0.001 per MB for tape. At these prices, disk-based backup plays mostly in high-end environments, says IDC's Amatruda. "You have to frame this issue from an application perspective," he says. "Small businesses aren't going to replicate data because they might not be able to afford expensive redundant disk arrays."
In addition to cost, other factors to consider are power consumption, management costs, media utilization and compression capabilities, and storage density/footprint, all of which currently favor tape.
However, as disk prices drop, the equation is expected to change. "Companies are going to be willing to invest in disk, and they are going to be keeping the last four or five days' worth of replicated data on disk, not just last night's data," says Articulent's Cox.
The role of replication
Replication plays a key role in evolving backup trends. Essentially, companies are using replication techniques with products such as EMC's TimeFinder to take point-in-time copies of their data throughout the day and then back up that data to tape once or twice a day.
"Replication is like an incremental backup," explains Spectra Logic's Terry. "It backs up only those blocks of data that have changed since the last snapshot."
The benefits of replication and snapshot copies include speed and increased data protection.
Instead of having to go back a full day or a half day to retrieve data in the event of system error, replication allows users to retrieve a full copy of data up to the minute before the system went down. Also, it enables users to perform scheduled backups within a defined backup window.
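Terry's comparison of replication to an incremental backup can be made concrete with a small sketch. The Python below is not how TimeFinder or any other vendor product works. It is a hypothetical illustration that assumes a volume divided into fixed-size blocks and compared by per-block hash; the names `BLOCK_SIZE`, `block_hashes`, and `changed_blocks` are invented for this example. It shows how only the blocks that differ from the previous snapshot would need to be copied.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # hypothetical block size in bytes, kept tiny for illustration


def block_hashes(volume: bytes) -> List[str]:
    """Return one SHA-256 digest per fixed-size block of the volume."""
    return [
        hashlib.sha256(volume[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(volume), BLOCK_SIZE)
    ]


def changed_blocks(previous: bytes, current: bytes) -> Dict[int, bytes]:
    """Map block index -> new contents for blocks that differ from the last snapshot."""
    old = block_hashes(previous)
    new = block_hashes(current)
    delta = {}
    for index, digest in enumerate(new):
        # Copy a block if it did not exist before or its hash no longer matches.
        if index >= len(old) or old[index] != digest:
            start = index * BLOCK_SIZE
            delta[index] = current[start:start + BLOCK_SIZE]
    return delta


last_snapshot = b"AAAABBBBCCCCDDDD"
live_volume = b"AAAAXXXXCCCCDDDDEEEE"
delta = changed_blocks(last_snapshot, live_volume)
print(delta)  # {1: b'XXXX', 4: b'EEEE'}
```

In this toy case, two of the five blocks are copied: the one that was overwritten and the one that was appended. The other three are left alone, which is the source of the speed benefit described above.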
While there is a growing trend toward replicating data to disk, there is still a segment of the market where replication is simply cost-prohibitive.
"We're starting to see people replicating several days' worth of data to disk," says Articulent's Cox, "but I don't think anybody has turned off tape backups."
The reason: Not only is there a certain comfort level that goes along with backing up data to tape, but users simply aren't going to replicate huge amounts of data to disk each day.
"They might replicate a couple of terabytes that are mission-critical to their business, but they are still backing up the other pieces to tape," explains Cox.
So, while vendors and analysts alike expect to see a growing tendency for users to use disk-based replication techniques to back up data, no one expects that change to occur overnight.
"The approach to data protection is built around sequential access devices, ...so there is a whole world of capabilities and operating procedures [that will need to change], but that will involve changing people more than changing technologies," says Kevin Daly, CEO at Quantum/ATL.
One way that library vendors are addressing performance issues is by mixing technologies and integrating caching or buffering capabilities into hybrid tape products. "Hybrid solutions make a lot of sense," says Daly. "One way to do it is to provide backup for up to a week or so on random-access disk and to only use removable devices when data gets older than that."
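Daly's description amounts to an age-based placement rule. The following is a minimal sketch, assuming "a week or so" means exactly seven days. The names `DISK_RETENTION` and `choose_tier` are hypothetical and do not come from any Quantum/ATL or Veritas product.

```python
from datetime import date, timedelta

# Assumption: "a week or so" is taken as exactly 7 days.
DISK_RETENTION = timedelta(days=7)


def choose_tier(backup_date: date, today: date) -> str:
    """Return 'disk' for backups within the retention window, 'tape' for older ones."""
    age = today - backup_date
    return "disk" if age <= DISK_RETENTION else "tape"


today = date(2001, 3, 15)  # arbitrary example date
for days_old in (0, 3, 7, 8, 30):
    backup_date = today - timedelta(days=days_old)
    print(days_old, choose_tier(backup_date, today))
```

A real hybrid library would also have to account for disk capacity and restore priorities. The sketch only captures the age threshold that Daly mentions.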
Daly says Quantum/ATL has been working with Veritas to adapt NetBackup's disk-based incremental backup capabilities into a mixed library environment with disk and disk caching.
Similarly, ADIC plans to introduce products this year that combine disk and tape and enable automatic movement of data back and forth in a storage area network (SAN). Last year, ADIC introduced its StorNext network-attached storage (NAS) device, which combines disk and tape to provide a low-cost alternative to disk-based NAS appliances and off-line vaulting for storing important, yet infrequently accessed, data.
"That doesn't mean that tape is trying do what disk does; it's that tape does some things very effectively so tape in combination with disk makes a lot of sense," says ADIC's Whitner.
As for StorageTek, Francis says they've been mixing disk and tape for a couple of years via the Virtual Storage Manager (VSM) virtual tape configuration. Essentially, VSM provides a disk cache that sits in front of a tape library to improve performance and tape utilization.
Aside from performance, library vendors say they are tackling a number of other issues, including ways to increase tape densities, reduce costs, and improve tape efficiency. For a more complete list, see the "Next-generation tape products" table.
High end adopts disk-based backup
According to EMC, the transition from tape to disk-based backup is well under way, at least in high-end environments.
"People are doing disk-based backup initially for the data they consider to be most mission-critical," says Jim Rothnie, chief technology officer at EMC. Early adopters include the financial and telecommunications industries, for which system downtime and lengthy restores can have dire consequences, he says.
Financial industries have to keep their systems up all the time, as do telcos, which rely on their billing systems for a constant revenue stream, explains Rothnie. One EMC customer, a large bank, is backing up its e-mail system to disk. "They consider it mission-critical enough to not be willing to take a long outage for the restoration to tape," says Rothnie.
Although disk-based backup is occurring first in high-end environments, that doesn't mean the midrange is immune, says Rothnie. "As soon as users begin to treat backup applications as mission-critical, they won't be able to afford to go to tape." The use of disk versus tape depends largely on the value placed on the data to be backed up, how quickly that data needs to be restored, and a company's IT resources.
What lies ahead?
Recognizing the importance of maintaining tape's competitive edge in certain applications, vendors are working on a variety of technologies that will boost capacity and performance.
For example, Sony last month announced a high-density metal evaporated tape that, when used with a highly sensitive giant magneto-resistive head, will provide up to one terabyte of capacity in an 8mm form factor. Sony hopes to integrate the technology into tape drives in the 2003 to 2004 time frame.
Meanwhile, Sony continues to work with Compaq, Hewlett-Packard, and Veritas to promote the Auxiliary Memory (AM) content and interface standard. The specification, which was approved by ANSI last fall, is expected to have a significant effect on tape access rates over the next 12 to 18 months, says John Woelbern, senior marketing product manager at Sony Electronics.
Sony's AM implementation is based largely on its Memory-in-Cassette architecture, which was introduced with the first Advanced Intelligent Tape (AIT) drive in 1997. AM provides a direct connection to a tape drive's on-board processors, which enables quick media loading, fast file access, and multiple on-tape load and unload points.
StorageTek also plans to improve tape drive performance. This spring, the company will introduce the 9840B drive, which is expected to have a super-fast 19MBps transfer rate.
Library market remains healthy
According to Freeman Reports, tape library shipments will approach 99,000 units in 2004, up from 51,500 units in 1998, while revenue will nearly double to $4.2 billion (see figure). This translates into an annual growth rate of 15%.
International Data Corp. also projects strong growth for the industry. "Just look at the rate of revenue growth last year," says Bob Amatruda, senior research analyst at IDC. He points out that ADIC, Overland, and Qualstar, for example, each reported annual growth rates in excess of 40%.
DLT and 8mm tape formats account for the lion's share of library shipments, with more than 90% of total shipments, according to Freeman Reports. Meanwhile, half-inch tape libraries, despite falling revenues, continue to account for about 50% of total library revenues.
A storage integrator's perspective
By Robert Waldron
With the cost of disk continuing to decline and the physical size of disks shrinking, one can logically see where tape may go the way of the punch card. Data replication via disk has taken off with huge market acceptance. In fact, in a recent survey conducted by Articulent and Reality Research, nearly 50% of large organizations (50+ TB) said they were currently leveraging remote sites for data copy and replication.
Over the short term (one to three years), we expect new backup methods, primarily remote data replication to disk, to affect the tape market. More and more companies have started to replicate several days' worth of data to disk for quick recovery. Assuming disk prices continue to fall, we expect even more companies, where downtime is extremely costly, to replicate upwards of 30 days' worth of snapshots to off-site disk.
Farther out (three to five years), we expect users to begin shifting from tape to disk for data archival. And when disk storage costs fall to a couple of cents per MB, there will be an enticing market for credit-card-sized 100GB hot-pluggable disk drives in some traditional tape applications.
There is one additional variable to keep in mind: the end user. Every new product and/or backup method will have early adopters who are willing to take risks. On the other end of the spectrum, there are laggards and those who will refuse to adopt new products and methods for various reasons (e.g., cost and comfort level).
When considering all these factors, we believe comments to the effect that tape is nearing its demise are greatly exaggerated. That said, tape is evolving as a backup-and-archival tool.
Robert L. Waldron is director of managed services at Articulent Inc. (www.articulent.com) in Hopkinton, MA.
Tape backup will definitely be affected over time by new backup methods based on disk. Today, most companies back up large amounts of data to tape each day, with the expectation that should they ever need to restore a file, user directory, volume, database, or table space they will be able to. Typically, companies are looking for this type of recovery granularity for up to 30 days.
In addition, most companies are taking the extra step of cloning or duplicating backup tapes so that a set can be taken off-site and stored. Many users are already replicating some, most, or even all of their data to remote facilities once a day using snapshot technologies, and some companies are starting to keep multiple snapshots to help expedite data restoration. Not too far down the road, we also expect to see companies with large mission-critical applications keeping upwards of a month of snapshots.
In general, the benefits of data replication versus tape backup include
- Improved performance via snapshot copies;
- Faster restore times; and
- No need for a second disaster-recovery copy of data (the snapshot already moves the data to a remote site).
On the downside, data replication increases costs significantly. There is the added cost of disk at the remote site, not to mention the cost of bandwidth between sites. However, for many companies with mission-critical applications, these expenses are nothing compared to the cost of downtime. When all is said and done, it comes down to a simple cost/benefit analysis of the two backup processes: disk vs. tape.
As long as the government mandates how long data must be kept, there will be a need for tape archival. For example, the Internal Revenue Service regulates how long accounting records need to be retained, the SEC sets regulatory requirements for securities trading data, and the FDA mandates that drug testing and lot control information be stored indefinitely.
Let's look at a quick cost comparison between disk and tape for archiving 100GB of data. Using an average cost of $0.15 per MB for this example, 100GB of disk storage costs approximately $15,000. The cost for a 100GB Super DLT or LTO cartridge is approximately $100 in this example. That's $0.15 per MB for disk versus $0.001 per MB for tape.
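The arithmetic behind this comparison can be reproduced in a few lines. The sketch below uses only the example prices given here and assumes the decimal convention of 1GB = 1,000MB, which is what the $15,000 figure implies. The function and variable names are illustrative.

```python
MB_PER_GB = 1_000  # decimal convention, consistent with the $15,000 figure


def total_cost(price_per_mb: float, capacity_gb: float) -> float:
    """Cost of storing capacity_gb at a given price per MB."""
    return price_per_mb * capacity_gb * MB_PER_GB


def price_per_mb(cost: float, capacity_gb: float) -> float:
    """Price per MB implied by a total cost and a capacity."""
    return cost / (capacity_gb * MB_PER_GB)


archive_gb = 100
disk_price = 0.15         # dollars per MB, from the example
cartridge_cost = 100.0    # dollars for one 100GB cartridge, from the example

disk_total = total_cost(disk_price, archive_gb)
tape_price = price_per_mb(cartridge_cost, archive_gb)
ratio = disk_price / tape_price

print(f"Disk: ${disk_total:,.0f} for {archive_gb}GB")
print(f"Tape: ${tape_price:.3f} per MB")
print(f"Disk costs about {ratio:.0f} times as much per MB")
```

At these example prices, disk is roughly 150 times more expensive per MB than tape media. The comparison covers media only and leaves out drives, libraries, and management.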
Even the promise of a yearly 30% drop in storage prices won't bridge the gap between disk and tape in the long-term data archival business. Tape manufacturers will continue to drive prices lower while improving drive and automation performance.
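A rough projection shows how slowly a 30% annual decline closes a gap of this size. The sketch below is a simplification under stated assumptions: both prices fall at constant yearly rates, the starting points are the $0.15 and $0.001 per MB example figures, and the 10% tape decline is an assumed midpoint of the 5% to 15% range cited in Figure 2. The function name `years_to_parity` is hypothetical.

```python
from typing import Optional


def years_to_parity(
    disk_price: float,
    tape_price: float,
    disk_drop: float,
    tape_drop: float = 0.0,
    max_years: int = 100,
) -> Optional[int]:
    """Count whole years until disk's price per MB is no higher than tape's.

    Assumes constant annual percentage declines; returns None if parity
    is not reached within max_years.
    """
    years = 0
    while disk_price > tape_price:
        if years >= max_years:
            return None
        disk_price *= 1 - disk_drop
        tape_price *= 1 - tape_drop
        years += 1
    return years


# Tape price held flat (the most favorable case for disk).
flat_tape = years_to_parity(0.15, 0.001, disk_drop=0.30)

# Tape also declining at an assumed 10% per year.
falling_tape = years_to_parity(0.15, 0.001, disk_drop=0.30, tape_drop=0.10)

print(flat_tape)     # 15
print(falling_tape)  # 20
```

Under these assumptions, disk would need about 15 years to match a frozen tape price and about 20 years if tape keeps getting cheaper, which is well beyond the three-to-five-year horizon discussed here.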
Next-generation tape products
- New formats
- Ultra-thin media substrates
- Very high areal densities, data rates
- Smart robots
- RAID, RAIL
- Virtual tape
- Intelligent cartridges
- SAN-ready libraries
- Rapid recovery
- Logical WORM
- 6Gbit/in² by 2007
- New tape geometry?
- Random access?
- Price: less than $.001/MB
Source: StorageTek