Data Domain boosts de-duplication performance

Posted on May 13, 2008

By Kevin Komiega

Data Domain has moved from dual-core to quad-core processors to deliver a new data de-duplication system that boasts a 75% boost in performance, more capacity, and extended support for remote office replication.

This week the company announced the DD690, an enterprise-class de-duplication system with aggregate throughput of up to 1.4TB/hr and single-stream throughput of up to 600GB/hr, designed to protect large databases within short backup windows.

A fully configured DDX array with 16 DD690 controllers raises aggregate throughput to as much as 22TB/hr and provides up to 28 petabytes of capacity.
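The array figure is consistent with the per-controller number if throughput scales roughly linearly with controller count. The short sketch below checks that arithmetic; linear scaling is an assumption made for illustration, not something the article states, and the function name is hypothetical.

```python
# Figures quoted in the article.
PER_CONTROLLER_TB_PER_HR = 1.4   # DD690 aggregate throughput
CONTROLLERS = 16                 # fully configured DDX array


def array_throughput_tb_per_hr(controllers: int, per_controller: float) -> float:
    """Aggregate throughput, ASSUMING it scales linearly with controller count."""
    return controllers * per_controller


total = array_throughput_tb_per_hr(CONTROLLERS, PER_CONTROLLER_TB_PER_HR)
print(f"{total:.1f} TB/hr")  # the article rounds this to "up to 22TB/hr"
```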

Data Domain's de-duplication systems serve as a disk-based alternative to tape. They work with existing backup or archive software and eliminate redundant data, which reduces the capacity required for extended local retention or for replication across a WAN for disaster-recovery purposes.

The DDX Array Series is available in four-, eight-, and 16-controller configurations, uses integrated or third-party external storage, and is designed to provide nearline storage for data centers with at least 20TB of application data.

Data Domain achieves high-speed de-duplication by doing most of the heavy lifting with processors rather than disk drives. "We use a CPU-centric approach to performing inline de-duplication, which means our performance does not rely on spindles and disk access speeds," says Brian Biles, vice president of product management at Data Domain. "We don't need that many disk drives to go faster."
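To illustrate what "inline" de-duplication means in general terms, the sketch below fingerprints each chunk of incoming data before it is stored and keeps only chunks it has not seen before. This is a generic toy model, not Data Domain's implementation: the fixed 4KB chunk size, the SHA-256 fingerprint, the in-memory index, and the class name are all assumptions chosen for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


class InlineDedupStore:
    """Toy inline de-duplicating store: each unique chunk is kept once."""

    def __init__(self):
        # fingerprint -> chunk bytes (a dict stands in for disk storage)
        self.chunks = {}

    def write(self, data: bytes) -> list:
        """Fingerprint each chunk before storing it; return the chunk recipe."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            # The duplicate check happens in the write path (inline),
            # so a repeated chunk is never stored a second time.
            if fp not in self.chunks:
                self.chunks[fp] = chunk
            recipe.append(fp)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(self.chunks[fp] for fp in recipe)


store = InlineDedupStore()
monday = b"A" * 8192 + b"B" * 4096
tuesday = b"A" * 8192 + b"C" * 4096   # mostly unchanged from Monday
monday_recipe = store.write(monday)
tuesday_recipe = store.write(tuesday)
print(len(store.chunks), "unique chunks stored for 6 chunks written")
```

In this model the fingerprinting and index lookup are CPU and memory work, and disk is touched only for new chunks, which is one plausible reading of the "CPU-centric" description above.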

Biles claims that the DD690 is not only bigger and faster than the company's previous flagship product, the DD580, but also supports more remote sites through the Data Domain Replicator Software option. The replication software automates vaulting across WANs for use in disaster-recovery processes, remote office backup, or multi-site tape consolidation.

"We sell a lot of our DD120 appliances to remote offices that back up data locally and then remotely replicate that data back to a hub. While our prior system could do a 20:1 fan-in, the DD690 can do 60:1," claims Biles.

The DD690 can de-duplicate globally across remote sites, further minimizing required bandwidth, because only the first instance of data is transferred across any of the WAN segments.
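The bandwidth effect of global de-duplication can be sketched with a hub that remembers chunk fingerprints from every site, so a chunk already received from one office is not sent again by another. The `Hub` class, the `replicate` function, and the SHA-256 fingerprints are hypothetical names and choices for illustration; the article does not describe the actual replication protocol.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


class Hub:
    """Toy central system that tracks chunks received from ALL remote sites."""

    def __init__(self):
        self.known = set()

    def missing(self, fingerprints):
        """Return the fingerprints the hub has not seen from any site."""
        return [fp for fp in fingerprints if fp not in self.known]

    def receive(self, chunks):
        for chunk in chunks:
            self.known.add(fingerprint(chunk))


def replicate(site_chunks, hub: Hub) -> int:
    """Send only chunks the hub lacks; return the number of bytes sent."""
    by_fp = {fingerprint(c): c for c in site_chunks}
    to_send = [by_fp[fp] for fp in hub.missing(by_fp.keys())]
    hub.receive(to_send)
    return sum(len(c) for c in to_send)


hub = Hub()
shared_image = b"O" * 1000            # data common to both offices
site_a = [shared_image, b"a-data"]
site_b = [shared_image, b"b-data"]

sent_a = replicate(site_a, hub)       # first site sends everything
sent_b = replicate(site_b, hub)       # second site sends only its unique chunk
print(sent_a, sent_b)
```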

While the total aggregate throughput of the DD690 is on par with or slightly better than that of competing products on the market, Data Domain claims that the system is the fastest inline de-duplication system in terms of single-stream performance.

Enterprise Strategy Group analyst Heidi Biggar says there is an important difference between single-stream performance and aggregate performance, especially when protecting large databases that can't be broken into pieces to back up. "The higher the single-stream performance for these types of backup jobs, the better," says Biggar. "With the DD690, Data Domain can reportedly hit speeds of up to 600GB/hr [or 166MBps] with a single node."

Biggar says end users need to keep performance in mind when they're evaluating de-duplication technologies. "Users need to make sure they're making apples-to-apples comparisons and that it is clear whether the performance is aggregate or single-stream," she says. "Aggregate performance is important too, but for some backup jobs users should look at single-stream speeds."
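The bracketed conversion in Biggar's quote, and what a single-stream rate implies for a backup window, can be checked with a few lines of arithmetic. The sketch assumes decimal units (1GB = 1,000MB) and uses a hypothetical 3TB database purely as an example; neither the database size nor the function names come from the article.

```python
def gb_per_hr_to_mb_per_s(gb_per_hr: float) -> float:
    """Convert GB/hr to MB/s using decimal units (1 GB = 1,000 MB)."""
    return gb_per_hr * 1000 / 3600


def backup_hours(dataset_gb: float, gb_per_hr: float) -> float:
    """Hours needed to back up a dataset that must travel as one stream."""
    return dataset_gb / gb_per_hr


single_stream_mb_s = gb_per_hr_to_mb_per_s(600)
print(f"{single_stream_mb_s:.1f} MB/s")   # roughly the 166MBps quoted above

# Hypothetical 3TB database that cannot be split across streams.
window = backup_hours(3000, 600)
print(f"{window:.1f} hours")
```

Because such a job cannot be spread across streams, the aggregate 1.4TB/hr figure would not shorten this window; only the single-stream rate applies.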

The DD690 is compatible with a range of backup software products from vendors such as Atempo, BakBone, CA, CommVault, EMC, Hewlett-Packard, IBM, Microsoft, and Symantec, and supports Fibre Channel and Ethernet storage fabrics with a new 10GbE option.

The DD690 will be available as an appliance, an array, or as the DD690g Gateway, which supports external SAN array storage. A minimum configuration of the DD690 includes 16TB of raw storage capacity using 500GB disk drives at a starting price of $210,000.
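For a rough sense of the entry configuration, the quoted figures imply a drive count and a price per raw terabyte, as sketched below. The calculation assumes decimal units and that all 16TB comes from 500GB drives. The effective cost per terabyte of protected data depends on the de-duplication ratio, which the article does not give, so the ratio of 10 used here is purely hypothetical.

```python
RAW_TB = 16          # minimum configuration, raw capacity (from the article)
DRIVE_GB = 500       # drive size (from the article)
PRICE_USD = 210_000  # starting price (from the article)

drives = RAW_TB * 1000 // DRIVE_GB
usd_per_raw_tb = PRICE_USD / RAW_TB


def usd_per_logical_tb(dedup_ratio: float) -> float:
    """Cost per TB of protected data for an ASSUMED de-duplication ratio."""
    return usd_per_raw_tb / dedup_ratio


print(drives, "drives")
print(f"${usd_per_raw_tb:,.0f} per raw TB")
# Hypothetical 10:1 ratio, for illustration only.
print(f"${usd_per_logical_tb(10):,.2f} per logical TB at an assumed 10:1")
```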

********

For related articles, see

IBM acquires Diligent for de-duplication

VTLs with de-dupe produce real ROI

FalconStor ships clustered VTL with de-dupe

HDS launches pre-configured VTLs

Data de-duplication: Questions and answers

VTL delivers real performance with virtual reels

