SAN backup bypasses LANs and servers
LAN-free and server-less backup may not be "killer apps," but they are driving forces with immediate benefits for backup operations.
If your backup environment is buckling under the strains of explosive data growth, a shrinking--or nonexistent--backup window, E-commerce, E-mail, and Web-hosting, you're not alone. In most cases, traditional local area network (LAN) architectures do not have the bandwidth to support today's heavy storage and general-purpose computing traffic.
"What we are seeing in most environments is that the LAN is becoming a huge bottleneck," says Scott Robinson, vice president of engineering at Datalink, an independent provider of networked storage solutions. "Data is typically growing between 80% and 100% per year. At the same time, applications are becoming more mission-critical and just about every one is 24x7."
Combined, these factors can throttle your system's backup performance, says Tom Petrocelli, Fibre Channel products manager at Atto Technology. "Backups tend to drag down overall performance in LANs because of the large amount of data that is streaming through the LAN all at once," he explains. "This reduces performance for everyone on the network that the tape system is attached to, reducing overall system efficiency." Secondary choke points include the file and tape server, and even the tape system itself.
Until recently, IT administrators have grappled with increasing backup loads in two ways: by implementing a separate, dedicated LAN (or storage "subnet") or by directly attaching SCSI storage devices (tape and disk) to servers. While both approaches direct backup traffic away from the LAN, each has significant drawbacks. LAN subnets--even those configured with Gigabit Ethernet--are not designed to stream large blocks of data, so data movement is slow and cumbersome. Direct-attached storage, on the other hand, can be cost-prohibitive.
"Direct-attach SCSI enables you to move data at SCSI rates, but you're left with a hard link between a server and a tape drive," explains Robinson. In large environments, this means you either end up using a lot of extra tape drives and libraries to accomplish your backup or you wind up directly backing up only your largest servers, routing the rest of your backup traffic over the LAN. This is not only a costly proposition, but it also affects system performance since a lot of IT assets are then unnecessarily tied to backup operations.
Fibre Channel storage area networks (SANs) address both the performance and cost issues associated with these traditional backup approaches. In its initial implementation, a Fibre Channel SAN enables high-speed LAN-free tape backup. A typical configuration includes a tape server, a tape library, a bridge (to connect SCSI libraries to the Fibre Channel network), and disk-based storage in a separate Fibre Channel network. Data is streamed in blocks over the Fibre Channel network from disk storage to the server and back out to the library. The LAN is used exclusively for messaging traffic.
"The advantage of a LAN-free [SAN] backup approach is increased throughput to the tape device, hence shorter backups," says Atto's Petrocelli. Overall performance can be improved by 2.5 to 10 times, or more if tape RAID software is implemented, he says. And the faster the backup, the shorter the backup window.
In sum, LAN-free backup in a SAN environment has the immediate effect of:
Removing backup traffic from the LAN.
Reducing loads on the LAN or on application servers during backups, enabling higher overall service levels in the LAN and better application response times.
Speeding up the backup process.
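To put rough numbers on what a faster backup means for the window, the arithmetic can be sketched in a few lines of Python. The 2.5x-to-10x speedup range and the 100%-per-year growth rate come from the vendors quoted above; the 500GB data set and the 5MB/s baseline throughput are hypothetical figures chosen only for illustration, as are the function names.

```python
def backup_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to stream data_gb gigabytes at a sustained MB/s rate."""
    seconds = data_gb * 1024 / throughput_mb_s
    return seconds / 3600


def grown_size(data_gb: float, annual_growth: float, years: int) -> float:
    """Data size after compounding annual growth (1.0 means 100% per year)."""
    return data_gb * (1 + annual_growth) ** years


# Hypothetical starting point (not from the article):
# 500GB backed up at a sustained 5MB/s over the LAN.
DATA_GB = 500.0
BASELINE_MB_S = 5.0

if __name__ == "__main__":
    for speedup in (1.0, 2.5, 10.0):
        for years in (0, 1, 2):
            size = grown_size(DATA_GB, 1.0, years)
            hours = backup_hours(size, BASELINE_MB_S * speedup)
            print(f"{speedup:4.1f}x, year {years}: {size:6.0f}GB in {hours:5.1f} hours")
```

Under these assumptions, a full backup that takes more than a day over the LAN drops to a few hours at a 10x speedup. If data really doubles every year, though, even a 10x gain is used up in a little over three years, since 2 to the power 3.3 is roughly 10.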
With the right software (e.g., CA ArcServeIT, Legato SmartMedia, Veritas NetBackup/Backup Exec), a LAN-free SAN also enables you to share storage resources--whether that means pooling drives for a particular backup job or sharing libraries among multiple backup applications. "It's a real application that solves backup problems," says Steve Whitner, vice president of marketing at ADIC.
Resource sharing, in its various forms, enables you to optimize the use of storage assets and to configure the backup environment to current conditions. (For more information on resource sharing and shared storage options, see "Maximizing resources through tape sharing," InfoStor, May 1999.)
The ability to pool drives is the most significant advancement in LAN-free backup, says Don Kleinschnitz, vice president of storage market alliances and head of SAN operations at StorageTek. What that means is that you can run a backup to any configuration of drives per server any time of day. If a drive fails, you can swap in another drive. If you want to expedite a backup, you can connect several drives on a loop.
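The effect of pooling on drive counts can be illustrated with a small Python sketch. It compares a direct-attach model, in which every server owns its own tape drive, with a shared pool sized for the busiest moment of the schedule. The job schedule, server names, and function names are all hypothetical, and the model ignores real-world constraints such as drive speed, media changes, and multi-drive jobs.

```python
from typing import List, Tuple

# (server, start_hour, end_hour) for each backup job.
Job = Tuple[str, int, int]

# Hypothetical nightly schedule, for illustration only.
JOBS: List[Job] = [
    ("db", 0, 4),
    ("mail", 2, 5),
    ("web", 4, 6),
    ("file", 5, 8),
]


def dedicated_drives(jobs: List[Job]) -> int:
    """Direct-attach model: one drive hard-wired to each server."""
    return len({server for server, _, _ in jobs})


def pooled_drives(jobs: List[Job]) -> int:
    """Shared-pool model: enough drives to cover peak concurrent jobs."""
    events = []
    for _, start, end in jobs:
        events.append((start, 1))   # job takes a drive from the pool
        events.append((end, -1))    # job returns its drive
    # Sort by hour; at the same hour, process returns (-1) before takes (+1).
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


if __name__ == "__main__":
    print("dedicated:", dedicated_drives(JOBS))
    print("pooled:   ", pooled_drives(JOBS))
```

In this made-up schedule no more than two jobs ever overlap, so a pool of two drives does the work of four dedicated ones. The saving in practice depends entirely on how much the jobs overlap.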
"The net result," says Kleinschnitz, "is half or less the number of drives doing the same amount of work with improved flexibility. And it gives you the opportunity to run expensive libraries all day--not just during a one-hour window--without impacting the client environment."
In this setting, the backup window can become a non-issue since storage data runs over a separate storage network and resources are assigned to servers on an as-needed basis. At the library level, resource sharing can enable "lights-out" backup.
"If you do that, you can theoretically go to an operator-less backup at night where you don't have people changing tapes. You can automate the whole process via software and hardware," adds Steve Richardson, vice president of marketing at Overland Data.
At this level, resource sharing translates into significant cost savings and virtually "window-less" backup. However, system performance is still impacted--albeit less significantly than it is in a straight LAN environment--since backup data still flows through the server.
"You still have to move data back over the bus and out," says Robin Purohit, director of SAN products at Veritas. "That causes CPU utilization and takes up bus cycles. Now, we are looking for ways to take up less CPU cycles and less I/O bandwidth during backups so it interferes less with service applications." This capability has particular implications for database applications. By "removing" the database server from the data path, the database can continue to accept transactions while the backup is in process with minimal effect on server performance.
Called "server-less backup" by some and "third-party copy" by others, this new architecture maximizes SAN bandwidth by enabling direct storage-to-storage data movement (e.g., from disk to disk, disk to tape, or tape to disk). In this environment, the server acts as a system coordinator, while a "copy device," typically a bridge, actually moves the data between storage devices as directed by special control software (e.g., Legato Celestra, though IBM and Veritas are reportedly developing their own capabilities as well).
It's a three-tier architecture (see figure), explains Nora Denzel, a senior vice president at Legato. Data movement is initiated by any NDMP-compliant backup product (the top tier), such as EMC EDM, Legato BudTool, Networker, and SmartMedia (support expected this month), or Veritas NetBackup. Coming from the LAN environment, NDMP is an open interface that uses IP signaling and protocols to communicate with servers.
The control software then synchronizes the data and does the logical-to-physical mapping by sending a block list to the copy device (the third tier). In addition to needing enough power and memory to support the movement of large blocks of data, the copy device must support connections to other devices, in this case disk drives and tape libraries, adds Atto's Petrocelli. Today, that interface is the industry standard "third-party copy."
In simpler terms, the backup application talks to NDMP, NDMP talks to Celestra, and Celestra talks to the "third-party-copy" device, which in turn moves the data, says Denzel.
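The division of labor among the three tiers can be modeled in a short Python sketch. It is a toy illustration, not the NDMP protocol or Celestra's actual interface: every class, method, and file name is hypothetical, the NDMP hop is reduced to a plain method call, and disks and tapes are simple in-memory objects. What it does show is the point Denzel makes: the top tier only issues a request, the middle tier turns a file name into a block list, and only the copy device touches the data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Extent = Tuple[int, int]  # (first block number, number of blocks)


@dataclass
class Disk:
    blocks: Dict[int, bytes]


@dataclass
class Tape:
    records: List[bytes] = field(default_factory=list)


class CopyDevice:
    """Third tier: moves blocks from disk to tape as directed."""

    def copy(self, disk: Disk, tape: Tape, block_list: List[Extent]) -> int:
        moved = 0
        for first, count in block_list:
            for number in range(first, first + count):
                tape.records.append(disk.blocks[number])
                moved += 1
        return moved


class ControlSoftware:
    """Middle tier: does the logical-to-physical mapping."""

    def __init__(self, file_map: Dict[str, List[Extent]], copier: CopyDevice):
        self.file_map = file_map
        self.copier = copier

    def backup(self, name: str, disk: Disk, tape: Tape) -> int:
        block_list = self.file_map[name]  # file name -> physical extents
        return self.copier.copy(disk, tape, block_list)


class BackupApplication:
    """Top tier: initiates the job; no backup data passes through it."""

    def __init__(self, control: ControlSoftware):
        self.control = control

    def run(self, name: str, disk: Disk, tape: Tape) -> int:
        return self.control.backup(name, disk, tape)


# Hypothetical eight-block disk and a file stored in two extents.
disk = Disk({n: f"blk{n}".encode() for n in range(8)})
tape = Tape()
control = ControlSoftware({"orders.db": [(2, 3), (6, 1)]}, CopyDevice())
moved = BackupApplication(control).run("orders.db", disk, tape)
```

After the run, `moved` is 4 and the tape holds blocks 2, 3, 4, and 6 in that order. The application and control tiers handled only names and block numbers, never the data itself.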
The immediate added benefits of a server-less backup environment include:
Higher performance. Since the server is no longer a bottleneck, throughput is now limited by the speed of the storage devices themselves, not the processing power of the server. In LAN-free backup, by contrast, backup performance is directly tied to the memory, I/O, and CPU of the backup server itself. Removing that dependency translates into improved user productivity and higher service levels.
Less overhead. Comparatively inexpensive copy devices are used to move data, rather than high-end servers. Additionally, a dedicated tape server isn't required since the Celestra agent can share space on another server.
According to Petrocelli, the architecture also makes it possible to stream the same data to several tape libraries at once--even if they are geographically separated--without having to copy or move actual tapes. On the downside, server-less backup necessitates both a RAID- and tape-equipped SAN.
"And that's really not that common," says Veritas' Purohit. "A lot of people are implementing a Fibre Channel hub or switch just to attach RAID or just to attach tape. They're not really on one big SAN yet."
Until that happens, the applications for third-party copy are fairly limited, and the benefits just as easily realized via clustering, notes Purohit. "If you are using clustered servers--so two servers are sharing the same database--you can off-host backup without using this new technique."
Unlike LAN-free backup, which is widespread in terms of vendors who have either announced or demonstrated the capability, server-less architectures are just emerging (see sidebar "Preliminary server-less products").
LAN-free and server-less backup are just the first two chapters in the evolving SAN story. Are they "killer apps"? Probably no more so than file and print sharing were to the LAN story, says Paul Mason, an analyst with International Data Corp., a market research firm in Framingham, MA.
"Just being able to connect different storage devices onto one subnet and being able to get access to them and to move data from one to another--if not cumbersomely--is not exactly a killer app," says Mason. Killer apps, he says, enable users to manage storage at a much higher level of abstraction, to relate storage requirements to applications, and to set policies.
Nonetheless, LAN-free backup SANs are being deployed. Datalink, for example, says it has implemented a handful, with a couple dozen more in the planning stages.
"The majority of our customer base is talking about SANs. It's no longer a matter of if our customers will implement one, but when," says Datalink's Robinson. He says that even organizations with no immediate interest in implementing SANs are laying the groundwork for them with Fibre Channel hardware.
"I doubt whether you would implement a SAN for the specific purpose of implementing LAN-free backup, but I think you would start migrating to a SAN because it's a good long-range solution," adds Mason.
Additionally, because LAN-free SAN backup architectures leverage existing assets, the cost of implementing entry-level SANs is kept low, says Petrocelli.
"LAN-free backup can be viewed as an upgrade to an existing tape storage sub-system, rather than as an entirely new installation."
1) Backup traffic travels over the LAN, impacting both server and network performance.
2) Backup traffic is routed over a storage area network, relieving LAN congestion, but still affecting server I/O.
3) Backup traffic goes directly from disk to storage for optimum server/network performance.
Server-less backup requires an NDMP-compliant backup application, special control software, and a copy device.
An alternative approach to eliminating the backup window
An alternative way of tackling the backup window problem--one that doesn't require a SAN--is via products such as Network Integrity's LiveVault software. Released last spring, LiveVault transforms backup into a continuous real-time operation. After an initial "full" backup, data is backed up as it changes in increments. Because backups are performed at the "byte" rather than the "batch" level, they don't have to be squeezed into a set window.
Instead of trying to copy, say, an entire 100GB Exchange database at night that changed in small increments throughout the day, explains John Butler, Network Integrity's CEO, changes are backed up throughout the day, with little effect on system performance. "With LiveVault, you stop asking the question, are we done with backup?" says Butler. "It's just another on-line process."
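The general idea of backing up only what has changed can be sketched in Python. This is a generic changed-block comparison and makes no claim about how LiveVault itself detects or ships changes: the block size, function names, and sample data are assumptions for illustration, and the sketch does not handle a file that shrinks.

```python
import hashlib
from typing import Dict, List

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to read


def block_hashes(data: bytes) -> List[str]:
    """SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def changed_blocks(old: bytes, new: bytes) -> Dict[int, bytes]:
    """Map block index -> new contents for blocks that differ or were appended.

    Assumption: the new version is at least as long as the old one.
    """
    old_hashes = block_hashes(old)
    changes: Dict[int, bytes] = {}
    for index, digest in enumerate(block_hashes(new)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * BLOCK_SIZE
            changes[index] = new[start:start + BLOCK_SIZE]
    return changes


# Hypothetical "before" and "after" versions of a small file.
before = b"AAAABBBBCCCC"
after = b"AAAAXXXXCCCCDDDD"
delta = changed_blocks(before, after)
```

Here `delta` contains only block 1 (rewritten) and block 3 (appended), so two of the four blocks would cross the network instead of the whole file. That is the property that lets an incremental approach spread backup traffic through the day.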
LiveVault features a three-tiered hierarchical storage archive, consisting of on-line disk, near-line tape, and shelf storage, and Time-Slice Recovery, which enables you to recover data by the last-known good version. Also, because LiveVault puts little load on the network, the backup can be run over a WAN or the Intranet, which has implications for replication and disaster recovery operations. On the downside, LiveVault works in NT environments only. The cost: $3,000 for the LiveVault storage server, $2,000 for each additional server that is backed up.
For Reno, Nevada-based financial services company Nevallier Financial Corp., LiveVault has almost eliminated its backup headaches. "You can set it and forget it," says Patrick Seeber, Nevallier's chief technology director. Its LiveVault environment includes four NT servers--housing critical research and information systems' data--and a seven-cartridge ADIC FastStor 4000 library. However, the company still makes a duplicate copy on optical. "It's a safety net. You can never have too many," he cautions. Seeber says he hopes to wean Nevallier off optical within the year and will start talking SANs next year.
As for Fibre Channel and SANs, "what we say is we love them, but they're not really a fundamental solution to the backup window problem," says Butler. "You still have to copy that 100GB file every night, so you still have the problem of getting it done in time. It's the law of constant backup."
However, as SANs increase in popularity, fairly specialized approaches like LiveVault and EMC Data Manager (EDM) run the risk of staying niche, says Paul Mason, an analyst at International Data Corporation. In these environments, users would run "SAN-enhanced" software from vendors such as Computer Associates, Veritas, or Legato.
In July, EMC announced several EDM enhancements, including NT support for high-end Symmetrix Connect users, compatibility with ADIC Scalar 1000 and 218 DLT tape libraries and IBM Magstar and Quantum DLT tape devices, and partitioning of Sony Petasite libraries.
EDM is available in three configurations: EDM Symmetrix Connect, for LAN-free and server-less backup and recovery of Symmetrix-resident NT data and large Unix databases at gigabyte speeds; EDM Symmetrix Path, for LAN-free backup at server-channel speeds; and EDM Enterprise Network, for on-line backup in Unix, NT, Novell NetWare, and IBM OS/2 environments.