Virtualization technology will encompass many applications, one of the most significant possibly being backup/recovery.
By Don Trimmer
Any discussion of storage virtualization should start with a definition. Unfortunately, there is no clear definition of storage virtualization or virtual storage. Many vendors are announcing virtualization products and making conflicting claims about exactly what virtualization is. "Storage virtualization" has been defined as everything from "a logical volume manager for a storage network" to "a front-end cache for a tape drive." Other vendors add functionality such as provisioning, volume mirroring, snapshots, and/or distributed file systems to their definitions. While most vendors claim that storage virtualization should be built on top of block-based storage area networks (SANs), others make equally valid claims that virtualization should be built on top of file-based network-attached storage (NAS).
This is the same situation that end users experienced a few years ago, when SANs were the catchphrase of the day. We have yet to see an explicit, agreed-upon definition of a SAN, and the same can be said for storage virtualization. However, while vendors may not be able to agree on a definition for virtualization, there is still significant value in many of the new storage virtualization products.
Storage virtualization may ultimately impact all facets of storage management, and one of the most significant areas to be impacted may be backup and recovery.
For at least the last 15 years, every major advance in storage technology has been hailed as a replacement for traditional backup and recovery. These advances have included RAID, mirroring, remote mirroring, snapshots, and SANs. Although none of these technologies eliminate the need for traditional backup/recovery, each has allowed some improvement.
Current backup/recovery methods
A well-executed backup/recovery scheme will protect against data loss due to hardware failure, software bugs, human error, hackers, saboteurs, and natural disasters. Backup can also be used for data archival. The necessary recovery services include cataloging of the objects (usually files and directories) available for recovery, the ability to perform fast and simple recovery of a few objects, and the ability to perform high-speed reconstruction of large data sets. The backup/recovery system must accommodate off-site data storage and long-term data storage. Finally, the backup/recovery system must provide specialized services for some applications such as databases. New methods of data protection cannot replace the old methods unless all of the above issues are addressed.
Today, well-designed backup/recovery systems use multiple technologies for data protection. RAID and mirroring technologies protect against local hardware failure. However, they cannot protect against human or software errors, since any errors will immediately be propagated. Point-in-time copies are the only available method for protection against human or software errors.
Point-in-time copies must be made at least daily and, ideally, much more frequently. Traditionally, making point-in-time copies has meant backing up to tape. However, disk-based technologies such as detached mirrors and/or snapshots may also be used for some point-in-time copies. The disk-based technologies have the advantage of being very fast compared to tape-based backup. However, there is currently very little automation for disk-based technologies and no method for generating catalogs of saved objects. Also, an enormous amount of disk space would be required for long-term storage. These disk-based technologies are often used in conjunction with tape-based backup but are not currently a viable replacement for it.
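To make the distinction concrete, the sketch below shows how a copy-on-write snapshot preserves a point-in-time image: after the snapshot is taken, each write first saves the old block contents, so the snapshot continues to see the data as it was, even after a human or software error overwrites the live volume. The `Volume` class and its methods are illustrative inventions, not any real volume manager's API.

```python
# Minimal copy-on-write snapshot sketch (illustrative only, not a real
# volume manager). A snapshot stores nothing up front; it records the old
# contents of a block only the first time that block is overwritten.

class Volume:
    def __init__(self, nblocks):
        self.blocks = [b"\x00"] * nblocks   # live data
        self.snapshots = []                 # active point-in-time images

    def snapshot(self):
        snap = {}                 # block index -> preserved old contents
        self.snapshots.append(snap)
        return snap

    def write(self, idx, data):
        for snap in self.snapshots:
            if idx not in snap:   # preserve pre-write data once per snapshot
                snap[idx] = self.blocks[idx]
        self.blocks[idx] = data

    def read_snapshot(self, snap, idx):
        # A block never written since the snapshot is read from the live volume.
        return snap.get(idx, self.blocks[idx])

vol = Volume(8)
vol.write(3, b"v1")
snap = vol.snapshot()             # point-in-time copy
vol.write(3, b"v2")               # error propagates to the live volume...
assert vol.read_snapshot(snap, 3) == b"v1"   # ...but not to the snapshot
```

This is why point-in-time copies, unlike plain mirroring, protect against human and software errors: the mirror would have received `b"v2"` immediately, while the snapshot still holds `b"v1"`.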
Storing removable media off-site and/or remote mirroring are at the core of disaster-recovery plans. Remote mirroring maintains an up-to-date remote copy of the data that can be accessed very quickly. Storing removable media remotely has the advantage of being much less expensive than purchasing the long-distance bandwidth required for remote mirroring.
Disk vs. tape: The battle is on
The last decade has seen many incremental improvements in backup/recovery technologies. This decade may see a revolution. One significant event is the dramatic decline in the cost of disk drives. The figure shows the cost trends for tape drives, tape media, and hard disk drives. By the end of 2002, a hard disk drive may cost substantially less than a piece of tape media with equivalent capacity. The gap between hard disk drive and tape media costs will widen over the next few years. While a low-cost ATA disk system may not have performance characteristics suitable for use as a primary data store, its performance characteristics are ideal for backup/recovery operations. However, while the economics may favor disk over tape, a transition cannot occur unless hard disk drives are able to provide all of the functionality that tape offers. Further, complete and well-integrated disk-based backup/recovery solutions must be available.
An exhaustive comparison of hard disk drive vs. tape technologies is beyond the scope of this article. But, it is useful to summarize those aspects important for backup/recovery. Disk systems using RAID technology are more reliable and have better performance than tape devices and media. A disk-based system will have a smaller footprint than an equivalent tape library. If the disks are spun-down when not in use, a disk system will also consume less power than a tape library.
However, for every piece of tape media inside a tape library, there are probably 10 pieces of media sitting on a shelf. Therefore, support for removable media and long-term storage is also important. Current SCSI and Fibre Channel disk drives are not well-suited for use as removable media. When powered down, the disk heads rest on the disk surface. As time passes, the heads tend to adhere to the media, making a future spin-up problematic. These drives are also not designed to survive the shock and vibration that tape media may experience during human handling.
Current ATA disk drives have special landing zones for the heads, allowing them to be spun-down for extended periods and then spun-up without a problem. The 3.5-inch disk drives still suffer from the same relative fragility as SCSI and Fibre Channel disk drives. However, 2.5-inch ATA disk drives (found in notebooks) have been designed for relatively high levels of shock and vibration. One disk manufacturer has a 2.5-inch disk-based removable canister rated for shocks up to 1,000Gs, and 100GB 2.5-inch disk drives will be available this year. One of these drives in a canister, or several drives and a RAID chip in a canister, may be an appealing choice for removable media.
Hard disk drives are also less susceptible to humidity, temperature, dust, and other environmental factors. Disk media formulations are also more stable than tape media formulations. Therefore, a hard disk drive can be an excellent choice for long-term data archival.
Finally, the cost of storing a disk drive in a low-cost, powered-down chassis will not be substantially more than the cost of storing that same disk drive or a piece of tape media on a shelf. When we consider the cost of handling the media and the attendant opportunity for human error, it will be much less expensive to leave the disk mounted in a chassis. Removable media will still be necessary for moving large amounts of data to remote locations. However, media will no longer be removed to make more room in a tape library or its disk equivalent.
The new generation of hard disk drives offers a price point and design characteristics that could make them an excellent replacement for tape technology. However, merely hooking up a new disk subsystem does not magically implement a backup/recovery system, so the next question is how to make the best use of these disk drives.
There will soon be a bewildering array of disk-to-disk backup products. Every vendor will claim they have finally solved the backup/recovery problem. Let's examine a few of the approaches that have been announced or are currently shipping.
Current backup/recovery applications: Existing backup/recovery applications have been designed and are highly optimized for backing up to tape. However, most of these applications also have the ability to back up data to a file system. It is a common practice to perform weekly full backups to tape and write daily incremental backups to a file system. This works well for relatively small and inactive data sets. However, this capability is usually not well integrated with media management policies, has little exception handling and error recovery, suffers from low performance, and does not scale well. Current backup/recovery applications will require major modifications before they enable a tape-less environment.
For example, let's say a library with one LTO tape drive and 10 slots has a capacity of about 2TB and a transfer rate of about 30MBps. Construction of a 1TB+ file system that will support 30MBps sustained read/write speeds would require a high-end server and a high-end disk system. The occasional file system check would also make the backup system unavailable for extended periods of time. It is very difficult to replace the performance and capacity of the smallest tape library with a traditional file-system-based backup solution.
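The scale of the problem follows directly from the example's numbers. A quick calculation, using the article's approximate figures of 2TB capacity and 30MBps sustained throughput, shows how long a full pass over the library's contents takes:

```python
# Rough backup-window arithmetic using the article's example figures:
# a 10-slot LTO library of about 2TB capacity and about 30MBps sustained
# transfer rate (approximate, per the example above).
capacity_gb = 2000      # ~2TB of data
rate_mbps = 30          # ~30MB per second sustained

seconds = capacity_gb * 1000 / rate_mbps    # GB -> MB, then divide by rate
hours = seconds / 3600
print(f"Full pass over {capacity_gb}GB at {rate_mbps}MBps: {hours:.1f} hours")
```

Sustaining roughly 18 hours of continuous 30MBps reads and writes is trivial for a tape drive streaming to media, but demanding for a general-purpose file system, which is the point of the comparison.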
New backup/recovery applications: A new class of backup/recovery applications is beginning to appear. These applications are designed to take note when an object has been modified and to copy the new object to an alternate location. The new location may be a file system or a database. Some products will store the entire object every time it changes. Other products store an initial copy of the object and only the differences for subsequent versions of the object. While these new applications seem promising, a lot of work remains before all of the functionality built into current enterprise backup/recovery applications is replicated.
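The core mechanism these new applications share, noticing that an object changed and copying only what is new, can be sketched in a few lines. The design below is an assumption for illustration, not any particular product: it detects change by content hash, stores identical content only once, and keeps every version available for recovery.

```python
# Toy change-detection backup store (illustrative sketch, not a real
# product): an object is copied to the store only when its content has
# actually changed, and every version remains recoverable.
import hashlib

class VersionStore:
    def __init__(self):
        self.chunks = {}     # sha256 digest -> stored content
        self.history = {}    # object name -> list of digests (oldest first)

    def backup(self, name, content: bytes) -> bool:
        digest = hashlib.sha256(content).hexdigest()
        versions = self.history.setdefault(name, [])
        if versions and versions[-1] == digest:
            return False                  # unchanged: nothing to store
        self.chunks[digest] = content     # identical content stored only once
        versions.append(digest)
        return True

    def restore(self, name, version=-1) -> bytes:
        return self.chunks[self.history[name][version]]

store = VersionStore()
store.backup("/etc/hosts", b"127.0.0.1 localhost")
store.backup("/etc/hosts", b"127.0.0.1 localhost")   # unchanged, skipped
store.backup("/etc/hosts", b"127.0.0.1 localhost\n10.0.0.5 db")
assert store.restore("/etc/hosts", 0) == b"127.0.0.1 localhost"
assert len(store.history["/etc/hosts"]) == 2
```

Products that store only differences between versions replace the full-content `chunks` entries with deltas against the prior version; the cataloging problem (knowing which versions of which objects exist) is the same either way.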
Virtual tape: Virtual tape is sometimes mentioned as a component of storage virtualization. Tape virtualization is not a new concept; it has been available in mainframe environments for at least 10 years. While the name may suggest that virtual tape uses disk as a replacement for tape, it is actually a front-end cache for physical tape. Since the amount of disk required for a front-end cache is very small compared to the capacity of the tape media, the cost of the disk technology is of secondary importance. In a mainframe environment, the primary purpose is to allow better packing of data onto tape media by allowing multiple virtual tape images to be stored on a single piece of tape media. Open systems applications already do a good job of packing data onto tape media. Therefore, in open systems environments, tape virtualization is simply a high-performance front-end cache for tape. While this has value, it is not the revolution customers are looking for.
Virtual tape libraries: A virtual tape library (VTL) is a new class of device. Rather than acting as a front-end performance cache, this type of product seeks, at a minimum, to relegate tape libraries to generating tapes for transport to off-site locations and, ultimately, to replace tape technology completely. These disk subsystems offer substantial improvements in performance, reliability, footprint, power requirements, and ease of management of the backup/recovery software. Whether VTLs offer substantial improvements in backup/recovery systems will depend largely on how well they are integrated with backup/recovery software, human/library interactions, and existing media management infrastructure. VTLs could offer substantial storage management cost savings while integrating seamlessly, with little or no disruption to existing environments. VTLs offer an opportunity to immediately reduce or eliminate reliance on tape technologies and to simplify backup/recovery configuration and administration.
Mirror and snapshot automation: To date, the relatively high cost of disk has severely limited the number of detached mirrors and/or snapshots that can be maintained. As the next generation of disk subsystems is deployed, it will be possible to purchase enough disk storage to store a very large number of detached mirrors or snapshots. However, there are four things that need to happen before mirroring and/or snapshots can be used as the only backup/recovery method:
- Mirror and/or snapshot management products must be available. It is not practical to manage mirror synchronization, mirror detachment, snapshot creation, expiration, mirror recycling, etc., by hand. The product also needs to be integrated with file systems and with applications such as databases, so that the data can be brought into a self-consistent state before a mirror is detached or a snapshot is taken;
- The objects available for recovery need to be cataloged, and point-and-click file and directory recovery needs to be implemented. This requires knowledge of the disk-resident objects, so either the storage systems will have to be file-system-aware or application-server-resident agents will be needed;
- A method for creating a detached mirror on removable media is required. There will also need to be a way to reassemble the volumes at a remote site. Since a relatively large number of detached mirrors and/or snapshots will be needed at any given time, it will still be expensive if every image is a full copy of the data; and
- Incremental and/or differential snapshots will therefore also be needed: a method of creating a new snapshot by storing only the differences between the new snapshot and some previous snapshot or detached mirror.
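The fourth requirement can be sketched at the block level. This is a minimal illustration of the idea, assuming a naive scan for changed blocks (real systems would track changes with bitmaps rather than comparing every block): each incremental snapshot stores only the blocks that differ from its parent, and a full image is rebuilt by applying the chain to a base copy.

```python
# Sketch of incremental snapshots (illustrative only): each snapshot
# stores just the blocks that changed since its parent image, and a full
# image is reconstructed by replaying the chain onto the base copy.

def incremental_snapshot(parent_image, current_blocks):
    """Record only the blocks that differ from the parent image."""
    return {i: blk for i, blk in enumerate(current_blocks)
            if blk != parent_image[i]}

def reconstruct(base_blocks, increments):
    """Rebuild a full image by applying incremental snapshots in order."""
    image = list(base_blocks)
    for inc in increments:
        for i, blk in inc.items():
            image[i] = blk
    return image

base = [b"A", b"B", b"C", b"D"]        # full copy (e.g., a detached mirror)
day1 = [b"A", b"B2", b"C", b"D"]       # one block changed
day2 = [b"A", b"B2", b"C", b"D3"]      # one more block changed

inc1 = incremental_snapshot(base, day1)
inc2 = incremental_snapshot(reconstruct(base, [inc1]), day2)
assert inc1 == {1: b"B2"} and inc2 == {3: b"D3"}     # tiny, not full copies
assert reconstruct(base, [inc1, inc2]) == day2       # full recovery works
```

The economics follow directly: each daily image costs only the changed blocks, so a large number of point-in-time images can be kept without a full copy of the data for each.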
Assuming that all of this is implemented, an optimum solution for database backup/recovery will still be needed. The ability to recover full databases will exist; however, the ability to recover individual tables (a capability that currently exists when the database engine generates the backup data stream) will be lost.
SAN file systems: There is renewed interest in implementing data sharing via a SAN resident file system. If storage virtualization and backup/recovery are added, we achieve the Holy Grail of storage: self-managed storage. This sounds great; however, the easier parts have existed for some time, and the harder parts are much harder than it would appear at first glance.
The basic idea is to run a file system on networked storage. Rather than exposing a block interface to application servers, the storage system would expose a file interface. But a traditional file server or NAS device also exposes a file interface (e.g., NFS or CIFS) to application servers. While it is possible to provide a higher level of data sharing than is currently supported through protocols such as NFS and CIFS, substantial improvements would require every application to be rewritten to take advantage of the new services.
Finally, it will still be necessary to integrate backup/recovery services with applications such as databases. Locating backup/recovery services and database applications on different platforms makes the integration more difficult. Implementing backup/recovery services in a distributed environment is also more complicated than implementing them on an application server. It is unlikely this type of system will be available in the foreseeable future.
Hard disk drive technology may begin to supplant tape technology sales over the next few years. This trend started several years ago, when users began supplementing tape-based backup/recovery with detached mirrors and snapshots. The big question is which methods of backing up to disk are viable in the near and long term.
Virtual tape libraries are currently available and provide a good way to reduce tape usage immediately. More-extensive use of mirror and/or snapshot technology depends on better automation and better integration with applications and file systems.
Don Trimmer is founder and chief strategic officer at Alacritus Software (www.alacritus.com) in Livermore, CA.