Sepaton de-dupes data

Posted on July 01, 2006

By Kevin Komiega

Sepaton is continuing its quest to make disk-based backup cost-competitive with traditional tape libraries with the introduction of its DeltaStor software for the company’s S2100-ES2 Virtual Tape Library (VTL) appliance. The software “de-duplicates” files, thereby freeing up costly capacity and letting users store more backup data.

DeltaStor software locates previously stored versions of data on virtual tape cartridges in the VTL and compares them to the latest backup set at the byte level. The newest copy of any duplicated data is kept, and older copies are replaced with pointers to that newer data. For more efficiency, DeltaStor software can also be used with a software compression option integrated with the S2100-ES2.
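The pointer-replacement idea can be illustrated with a minimal sketch. It is not Sepaton's algorithm: the article describes byte-level comparison, whereas this sketch assumes fixed-size chunks matched by SHA-256 digest. The names `dedupe_post_process`, `restore`, and `CHUNK_SIZE` and the in-memory `store` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical chunk size, kept tiny so the example is easy to follow


def split_chunks(data, size=CHUNK_SIZE):
    """Split a backup into fixed-size chunks (an assumption of this sketch)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def dedupe_post_process(backups):
    """Process completed backups from oldest to newest.

    Returns (store, layouts):
      store   maps digest -> (index of newest backup holding it, chunk bytes)
      layouts holds, per backup, the list of digests acting as pointers
    """
    store = {}
    layouts = []
    for index, backup in enumerate(backups):
        layout = []
        for chunk in split_chunks(backup):
            digest = hashlib.sha256(chunk).hexdigest()
            # The newest backup takes ownership of the single stored copy;
            # older backups keep only a pointer (the digest) to it.
            store[digest] = (index, chunk)
            layout.append(digest)
        layouts.append(layout)
    return store, layouts


def restore(store, layout):
    """Rebuild a backup by following its pointers into the store."""
    return b"".join(store[digest][1] for digest in layout)


def stored_bytes(store):
    """Physical bytes held after de-duplication."""
    return sum(len(chunk) for _, chunk in store.values())


backups = [b"AAAABBBBCCCC", b"AAAABBBBDDDD"]  # two made-up backup sets
store, layouts = dedupe_post_process(backups)
print(sum(len(b) for b in backups), "logical bytes,", stored_bytes(store), "stored bytes")
```

In this toy case the two backups share two of their three chunks, so only four unique chunks are stored, and either backup can still be rebuilt in full with `restore(store, layouts[i])`.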

Like other VTLs, Sepaton’s system emulates a physical tape library and integrates with existing storage infrastructures without requiring changes to the backup process. The S2100-ES2 scales from 4.8TB to 1PB and performs backups at up to 2400MBps, according to the company.

Some industry analysts consider data de-duplication to be one of the most important new technologies in the storage industry because users can make better use of disk capacity at a price comparable to that of traditional tape libraries. (For more information, see “Data reduction, VTLs, CDP drive NGDP,” p. 37.)

For example, without de-duplication the maximum retention time for 2.5TB of full daily backups on a 25TB system is only 10 days. According to Sepaton, DeltaStor software could potentially store as many as 250 days of 2.5TB backups in the same space while providing the performance benefits of disk-based backup.
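The retention arithmetic can be checked with a short sketch. It takes the daily full backup as 2.5TB and the system capacity as 25TB, as in the example above. The function names are hypothetical, and the 25:1 figure is only the reduction ratio implied by those numbers, not a ratio quoted by Sepaton.

```python
def retention_days(capacity_tb, daily_backup_tb, reduction_ratio=1.0):
    """Days of daily full backups that fit, given a data-reduction ratio."""
    return capacity_tb * reduction_ratio / daily_backup_tb


def implied_reduction_ratio(capacity_tb, daily_backup_tb, days_retained):
    """Reduction ratio needed to retain the given number of days."""
    return days_retained * daily_backup_tb / capacity_tb


# No de-duplication: 25TB of disk holds 10 days of 2.5TB full backups.
print(retention_days(25, 2.5))                 # 10.0

# Retaining 250 days in the same 25TB implies a 25:1 reduction.
print(implied_reduction_ratio(25, 2.5, 250))   # 25.0
```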

The combination of DeltaStor software with the S2100 VTL could yield performance in the neighborhood of 8.6TB per hour, because the software removes redundant data outside of the primary data path. Enterprise Strategy Group analyst Heidi Biggar says this approach allows Sepaton to circumvent potential latency issues.
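The hourly figure follows from the 2400MBps peak rate quoted earlier. The conversion below is a sketch that assumes decimal units (1TB = 1,000,000MB) and a backup that sustains the peak rate for the full hour. The function name is hypothetical.

```python
SECONDS_PER_HOUR = 3600
MB_PER_TB = 1_000_000  # decimal units, an assumption of this sketch


def mbps_to_tb_per_hour(mb_per_second):
    """Convert a sustained rate in MB per second to TB per hour."""
    return mb_per_second * SECONDS_PER_HOUR / MB_PER_TB


print(mbps_to_tb_per_hour(2400))  # 8.64
```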

“A potential issue with doing de-duplication real-time is performance. Running algorithms as data is ingested introduces latency,” says Biggar. “Because of the latency associated with this type of object-by-object comparison, the process has to be done in the background after the backup job is complete; it cannot be done real-time.”

Biggar says Sepaton’s technique could give it a performance advantage over some of its competitors. However, the technique requires a disk cache for the full backup stream, a requirement that other vendors don’t face.

But because the disk cache is released and re-used after each backup, Biggar does not view the cache requirement as a negative issue.

“We don’t want any additional software layered onto the VTL to slow down performance. That’s why we let the backup set complete and do the data de-duplication post process,” says Linda Mentzer, vice president of marketing at Sepaton.

According to Mentzer, de-duplication lowers the cost of doing backups. “Without data de-duplication customers might be paying $4 or $5 per gigabyte. With it they could get down to the $1.50-per-GB range without changing the underlying physical storage.”

Sepaton has shipped DeltaStor software to a handful of its “strategic customers,” but volume availability isn’t expected until the fourth quarter. Sepaton says DeltaStor software licensing will be priced at less than $1 per GB.