VTLs with de-dupe produce real ROI

By Kevin Komiega

The rapid maturation of virtual tape library (VTL) technology, specifically the integration of data de-duplication into many of the systems, is elevating VTLs from platforms for speeding backup-and-restore operations into a source of significant cost savings in the data center.

In a recent report entitled "Enterprise Open Systems Virtual Tape Libraries," Forrester Research analyst Stephanie Balaouras explains that VTL vendors (including Copan Systems, Data Domain, Diligent Technologies, EMC, FalconStor, Fujitsu Siemens, Hewlett-Packard, IBM, Network Appliance, Quantum, Sepaton, and Sun) are already offering data de-duplication or will soon offer it as an integrated feature within their respective disk appliances and VTLs.

According to Balaouras, VTLs with compression and data de-duplication capabilities not only help improve capacity utilization and lower capital expenditures on disk, but they can also help some enterprises store more data longer before it is vaulted to tape for long-term storage. And, as shown in the following case studies, today's VTLs can deliver a rapid return on investment (ROI).

Trial by fire

Vistage International is an organization that helps chief executives advance their careers and their companies through collaboration, learning, and mentorship. Over the past half-century, the company, formerly known as The Executive Committee, has nearly doubled in size every five years to become the world's largest CEO membership organization. The San Diego-based company has 14,000 members in 16 countries and provides access to a network of experts, CEO peers, executive learning workshops, and other resources via the Internet.

"Content is king when you're striving to share best practices and leadership perspectives," explains Carlo Saggese, Vistage's vice president of application development. "We've created a massive website for our members so the information they need is readily available at their fingertips."

Vistage International's 15-person IT department supports 170 employees and a range of technology projects, from sales force automation and content management to ensuring disaster recovery. The team oversees approximately 75 virtual and physical Windows file servers and several QuickTime servers, as well as a NetApp filer and an EMC SAN for housing 18TB of data.

When Saggese joined Vistage a little more than a year ago, he was saddled with an aging tape library with AIT drives and a backup window of more than 36 hours.

"Not everything was being backed up on a regular basis because we just didn't have the bandwidth to do it all," says Saggese. Restoring as little as 20MB of data could take up to an hour and, if data were off-site, recovery could take as much as two days. Saggese adds, "We're a 24x7 shop and have to service our members via the Web. It was a major issue."

Saggese heads up Vistage's applications group and, as such, has a hand in the decision-making process as it relates to the backup infrastructure. New projects dictated that a new backup system be put in place.

"We provide a lot of video and audio content via the Web. One of the things we wanted to do was digitize and metatag all of our content. The project is going to take a couple of years, but early on we realized we needed data de-duplication," says Saggese. "Once we digitize and catalog all of our movie files we estimate that we'll have about 50TB of data."

The plan was to upgrade the entire storage infrastructure to improve disaster recovery, reduce backup windows, speed data recovery, and boost performance. Vistage decided to replace its tape-based systems with a disk-to-disk-to-tape (D2D2T) configuration and establish a remote site to support fail-over if the primary location went down.

Vistage opted for Overland Storage's REO 9100c disk-based backup appliance with hardware compression and its NEO 4100 midrange LTO-4 tape library. The REO 9100c can be configured as multiple VTLs or a mix of stand-alone virtual tape drives or disk volumes (LUNs) for up to 114TB of usable virtual tape capacity via optional expansion arrays. The NEO 4100 features up to 96TB of capacity and more than 3.5TB per hour of performance.

In late 2007, Vistage installed its Overland REO 9100c and NEO 4100. One week later, its new D2D2T solution was tested when the worst wildfires in California's history hit San Diego, forcing the evacuation of half a million residents and businesses.

Because the new backup systems allowed Vistage to complete its weekly full disk backup by Sunday, the company used Overland's WebTLC feature to initiate a disk-to-tape process remotely. As a result, a full set of backup tapes was available first thing Monday.

Once the smoke cleared, Vistage completed its deployment with Overland's new REO 9500D de-duplicating VTL appliance to gain further reductions in data backup-and-recovery times, as well as long-term retention costs.

The REO 9500D uses Diligent Technologies' ProtecTIER data de-duplication technology to allow end users to typically retain 25x more backup data on their VTL before moving the data to tape. The 9500D supports up to 281TB of usable capacity and moves data at up to 200MBps.

"We want to store up to three months of data on the REO 9500D," says Saggese. "So far, we're achieving data de-duplication ratios of 10:1 and better, and we're confident that ratio will increase as we perform more backups. The result is we'll be able to meet our retention objectives, reduce our reliance on off-site tapes, and improve overall customer service levels."

Data Domain claims that its inline de-duplication process reduces the amount of backup data by 20x on average, enabling cost-effective on-site retention and off-site vaulting for disaster recovery.

Saggese says that the VTL with data de-duplication technology has helped accelerate backups and restores, retain months of disk-based data, and simplify the archiving process, all while reducing costs and lowering administrative overhead by more than 25%.
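Saggese's goal of keeping three months of backups on disk comes down to simple arithmetic. The sketch below is a rough illustration only: it assumes each weekly full backup equals the 18TB Vistage manages today, treats three months as 13 weeks, and applies the de-duplication ratio uniformly. The function name is hypothetical, and incremental backups and data growth are ignored.

```python
def physical_tb_needed(full_backup_tb, weekly_fulls, dedupe_ratio):
    """Disk required to retain a series of full backups after de-duplication."""
    logical_tb = full_backup_tb * weekly_fulls  # data as the backup software sees it
    return logical_tb / dedupe_ratio            # data actually written to disk

# Assumed inputs: 18TB per weekly full, 13 weekly fulls (about three months).
for ratio in (1, 10, 25):
    print(f"{ratio}:1 -> {physical_tb_needed(18, 13, ratio):.1f}TB of disk")
```

Under those assumptions, 234TB of logical backup data would occupy about 23.4TB of disk at the 10:1 ratio Saggese reports.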

Longer data retention

Keeping data alive on spinning disk for longer periods is not just a nice thing to have. It is a common requirement and can be a major factor in the decision-making process.

Sunrise, a telecommunications company based in Switzerland, provides mobile, fixed network, and Internet services to business and residential customers. The Sunrise IT organization is split into two groups: one side supports IT operations and business applications such as billing and provisioning, and the other supports business operations, providing network and phone services to Sunrise's customers.

"We started to de-duplicate data on the operational side of the business in order to keep data available for longer periods of time without having to go to our archive," says Sandor Orban, a senior IT specialist at Sunrise. "De-duplication allows us to extend the lifespan of data and make it instantly available at high-performance levels."

Sunrise currently keeps three months of data available and recoverable at the click of a mouse. "With tape drives, availability was never quite guaranteed. With tape, you tend to multiplex on the backups to get the best performance, but that slows down recovery times because it is necessary to read tapes. The advantage of VTLs is much better recovery times because you don't have to scan through tapes," says Orban.
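Orban's point about multiplexing can be illustrated with a toy model. The sketch below interleaves blocks from several backup streams onto one sequential "tape" and counts how many blocks must be read to restore a single stream. The round-robin interleave, the function names, and the block counts are illustrative assumptions, not a description of any particular backup product.

```python
from itertools import zip_longest

def multiplex(streams):
    """Interleave blocks from several backup streams round-robin onto one tape."""
    tape = []
    for blocks in zip_longest(*streams.values()):
        # zip_longest pads finished streams with None; skip the padding
        for name, block in zip(streams.keys(), blocks):
            if block is not None:
                tape.append((name, block))
    return tape

def blocks_read_to_restore(tape, wanted):
    """Sequential media must be read up to the last block of the wanted stream."""
    last = max(i for i, (name, _) in enumerate(tape) if name == wanted)
    return last + 1

# Four hypothetical servers, each backing up 100 blocks to the same tape.
streams = {f"server{n}": list(range(100)) for n in range(4)}
tape = multiplex(streams)
print(blocks_read_to_restore(tape, "server0"))  # blocks passed over during restore
print(len(streams["server0"]))                  # blocks that belong to server0
```

In this model, restoring one server means reading 397 blocks to recover 100, which is the recovery penalty Orban describes. On disk, the wanted blocks can be addressed directly.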

Sunrise began its journey into the world of de-duplication when the company was looking for a solution for backing up its NAS servers, which are the foundation for customer e-mail services.

"We were looking for a solution that would allow us to provide complete backups of our NAS systems within a 24-hour period. We were backing up millions of 1K e-mail files, and because of the design limitations of the NAS solution we had to provide four tape drives per filer," says Orban.

Sunrise was running 10 NAS servers and 40 tape drives. "It was very expensive and hard to implement. We found that we could implement a similar solution using FalconStor's VTL product, and it would allow us to consolidate all of our backups onto one server," says Orban.

The move to FalconStor's VTL platform meant a change in licensing. Sunrise was now subject to a capacity-based license for the VTL rather than the per-drive scheme required with its previous tape-based architecture.

"We found this to be very practical and flexible from a cost-savings perspective and decided to migrate the network side of our business to the VTL as well. That translated into very big cost savings," says Orban.

FalconStor's VTL technology uses high-speed disk to provision virtual tape drives and libraries to servers via iSCSI or Fibre Channel. The VTL is compatible with popular backup software packages and does not affect existing storage configurations.

FalconStor's approach to de-duplication is its Single Instance Repository (SIR), which increases nearline storage capacity while reducing storage costs and bandwidth needs for off-site replication.

The SIR uses the post-process approach to de-duplication, eliminating redundant data without affecting the backup window.
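The difference between post-process and inline de-duplication is mainly one of timing. The sketch below is a minimal illustration, not FalconStor's implementation: backups first land unmodified in a staging area, and a separate pass later splits them into chunks, hashes each chunk, and keeps one copy per unique hash. The class name, the fixed 4KB chunk size, and the use of SHA-256 are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; real products vary

class PostProcessStore:
    def __init__(self):
        self.staging = {}  # backup name -> raw bytes, written at full speed
        self.chunks = {}   # chunk hash -> chunk bytes, one copy per unique chunk
        self.recipes = {}  # backup name -> ordered list of chunk hashes

    def ingest(self, name, data):
        """Backup window: store data as-is, with no hashing on the write path."""
        self.staging[name] = data

    def deduplicate(self):
        """Later pass: replace staged backups with references to unique chunks."""
        for name, data in self.staging.items():
            recipe = []
            for i in range(0, len(data), CHUNK_SIZE):
                chunk = data[i:i + CHUNK_SIZE]
                digest = hashlib.sha256(chunk).hexdigest()
                self.chunks.setdefault(digest, chunk)  # keep first copy only
                recipe.append(digest)
            self.recipes[name] = recipe
        self.staging.clear()

    def restore(self, name):
        """Rebuild a backup from its recipe, or return it straight from staging."""
        if name in self.staging:
            return self.staging[name]
        return b"".join(self.chunks[d] for d in self.recipes[name])

    def ratio(self):
        """Logical bytes referenced by recipes divided by unique bytes stored."""
        logical = sum(len(self.chunks[d]) for r in self.recipes.values() for d in r)
        physical = sum(len(c) for c in self.chunks.values())
        return logical / physical

# Two hypothetical nightly backups that differ in only their last chunk.
monday = b"".join(bytes([i]) * CHUNK_SIZE for i in range(8))
tuesday = monday[:7 * CHUNK_SIZE] + bytes([99]) * CHUNK_SIZE

store = PostProcessStore()
store.ingest("monday", monday)    # fast path, no de-duplication yet
store.ingest("tuesday", tuesday)
store.deduplicate()               # runs after the backup window closes
print(len(store.chunks), round(store.ratio(), 2))
```

Because hashing happens only in `deduplicate()`, the write path during the backup window stays as fast as plain disk. The trade-off is that staging space must hold the full, un-reduced backups until the pass completes.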

Sunrise is experiencing data de-duplication ratios ranging from 25:1 all the way up to 40:1. The ratios fluctuate depending on the data types being processed by the system. "We tend to get the best de-dupe ratios with operating system data and user files," says Orban.

Aside from simplified licensing and reduced capacity requirements, there are other ways companies can use VTLs with data de-duplication to squeeze more out of their IT budgets, namely by reclaiming unused disk capacity.

The Arizona Republic has published a daily newspaper in Phoenix for more than 110 years. Now a full-fledged multimedia company, the Republic has expanded its operations to include providing news, information, and video via the Web to its 1.5 million readers.

Expanding beyond the world of newsprint has presented a number of IT challenges for John Taber, the Republic's principal systems administrator.

"What's important to me is that our backups run," says Taber. A simple statement, but easier said than done.

Most of the Arizona Republic's data is on Microsoft SQL Server, Sybase, or Oracle databases. The company also has approximately 600 servers, most of which are Windows-based. The paper's consumer-facing Internet portal is AZCentral.com, through which readers access video and news information supported by 14TB of file data maintained by Taber and his team.

The Republic is an EMC shop. The storage infrastructure consists of EMC Clariion and Symmetrix storage arrays and an EMC Clariion Disk Library (DL). It did not take long for Taber to realize that additional disk capacity would need to be purchased as the Republic neared the capacity limits of the DL.

"We quickly exceeded our initial 6TB of space and the system maxes out at 12TB," explains Taber. "We were looking at buying another 4TB of disk and possibly performing a forklift upgrade to scale up."

That's when Data Domain came knocking on his door.

"Data Domain delivered a box that had 5TB of capacity. It was faster than they said it would be and did everything they said it would do," says Taber.

Once the decision was made to implement Data Domain's system as a de-duplicating VTL, making use of existing disk capacity became a major concern. The Republic already had 25TB of capacity that was being used for data retention and disk staging. Taber wanted to save money by sidestepping the need to buy more disks. "We were interested in what Data Domain could do because if they could take that raw capacity and multiply it by 20, or even 10, I could conservatively expect about 150TB of disk-based data retention," he says.
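Taber's back-of-the-envelope estimate can be written out as simple arithmetic. The sketch below multiplies the 25TB of existing disk by the two multipliers he mentions and then works backward from his 150TB expectation. The function name is hypothetical, and treating the de-duplication ratio as a flat multiplier on raw capacity is a simplifying assumption.

```python
def effective_tb(physical_tb, dedupe_ratio):
    """Logical backup data that fits on a given amount of physical disk."""
    return physical_tb * dedupe_ratio

existing_tb = 25  # capacity already used for retention and disk staging

for ratio in (10, 20):
    print(f"{ratio}:1 -> {effective_tb(existing_tb, ratio)}TB retained on disk")

# Ratio implied by the quoted 150TB expectation
implied_ratio = 150 / existing_tb
print(f"150TB on {existing_tb}TB implies {implied_ratio:.0f}:1")
```

On these numbers, the 150TB figure corresponds to a ratio of about 6:1, below both multipliers Taber cites, which is one way to read his description of it as a conservative expectation.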

"There is major return on investment with data de-duplication. Now we don't have to buy that additional disk because we solved the issues around maxing out our Disk Library by eliminating redundant data and utilizing what we already had," he adds.

This article was originally published on March 01, 2008