Case Studies: Content-addressed storage (CAS)

By Kevin Komiega

It is safe to assume that when any new storage technology hits the streets, it's typically followed by a parade of naysayers and detractors and a subsequent battle among vendors over where, when, and how it should be used. But one storage technology appears to be bucking the trend: Content-addressed storage (CAS) systems have for the most part not created excessive hype or hot debates. They just work.

CAS systems are secure disk-based archival repositories that store data as individual objects for long-term retention and retrieval. Each object gets a unique identifier derived from its content. Because the identifier is tied to the content, any change to a stored object would no longer match its identifier, so data cannot be silently altered once it is written. CAS also allows users to set policy-based retention periods so that data can be managed from creation to deletion.
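To make the idea concrete, here is a minimal Python sketch of a content-addressed store with a retention policy. It is an illustration only, not the interface of Centera, RISS, or any other product mentioned in this article. The class name `ToyCAS`, the `RetentionError` exception, the method names, and the choice of SHA-256 as the identifier are all assumptions made for the example.

```python
import hashlib
import time
from typing import Optional


class RetentionError(Exception):
    """Raised when a delete is attempted before the retention period ends."""


class ToyCAS:
    """Hypothetical in-memory store; real CAS products use their own schemes."""

    def __init__(self) -> None:
        # Maps address -> (object bytes, time at which deletion becomes legal).
        self._objects = {}

    def put(self, data: bytes, retention_seconds: float = 0.0,
            now: Optional[float] = None) -> str:
        """Store an object and return its content-derived address."""
        now = time.time() if now is None else now
        # Assumption: SHA-256 of the content serves as the unique identifier.
        address = hashlib.sha256(data).hexdigest()
        # Write-once: identical content maps to the same address and is
        # never overwritten, so the first retention setting is kept.
        if address not in self._objects:
            self._objects[address] = (bytes(data), now + retention_seconds)
        return address

    def get(self, address: str) -> bytes:
        """Return the object, verifying it still matches its address."""
        data, _ = self._objects[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object no longer matches its address")
        return data

    def delete(self, address: str, now: Optional[float] = None) -> None:
        """Delete an object only after its retention period has expired."""
        now = time.time() if now is None else now
        _, expires_at = self._objects[address]
        if now < expires_at:
            raise RetentionError("retention period has not expired")
        del self._objects[address]


if __name__ == "__main__":
    store = ToyCAS()
    addr = store.put(b"archived e-mail body", retention_seconds=3600)
    print(addr, store.get(addr))
```

Storing the same bytes twice yields the same address, while changing a single byte yields a different one, which is what makes tampering detectable. The retention check in `delete` is a simplified stand-in for the policy-based retention periods described above.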

There are plenty of benefits. The ever-decreasing cost of hard disk drives means inexpensive capacity for CAS platforms, with the added bonus of faster data retrieval times than with traditional tape archiving systems. There is also a documented chain of custody for data, which becomes critical when discovery for regulatory compliance and litigation purposes rears its ugly head.

Regulatory compliance and corporate governance mandates are among the main drivers behind many CAS purchases. That was the case with CODA, a UK-based firm that provides financial accounting software for large global companies. Although it is not subject to the same stringent regulations as its US counterparts, CODA moved to CAS to get ahead of the curve. It also never hurts to practice what you preach.

"CODA is very much involved with the issue of corporate governance and we promote the importance of control, compliance, and repeatability for Sarbanes-Oxley. We realized that if some type of legislation similar to Sarbanes-Oxley is passed here we would need a solution," says Richard Hall, group IT manager at CODA.

CODA runs a StorageWorks Reference Information Storage System (RISS) from Hewlett-Packard to archive e-mails and other documents.

Compliance isn't the only reason people buy CAS systems. Performance is also a key motivating factor. And there are certain scenarios where performance can literally affect someone's health.

John Halamka, CIO of CareGroup Healthcare Systems and Harvard Medical School, is responsible for making IT decisions for several hospitals in the greater Boston area and knows firsthand how IT can impact patient care.

"The fact that our doctors were spending a lot of their time searching for archival images and studies was unacceptable," says Halamka. "Our restoration times using our old tape-based process were around 20 minutes—that is, if the data could be found at all. We've cut that time down to about 10 seconds using content-addressed storage."

Trying to crunch the numbers would drive anyone mad: Two thousand members of the medical staff spanning five hospitals sifting through medical records, X-ray films, and CT scans of about 2.5 million active patients—not to mention another 13,000 non-medical employees and administrators requiring e-mail and file access around the clock.

Needless to say, archiving records and files to tape did not provide the necessary retrieval performance. That's why Halamka's team implemented a CAS system, namely, EMC's Centera platform.

CAS hardware and software can be found in the product brochures of dozens of storage vendors, including the usual suspects such as EMC, HP, and Sun/StorageTek, as well as newer companies such as Nexsan Technologies and Permabit.

Software start-ups such as Avamar and Princeton Softech have also gotten in on the action by designing application-specific software for use in tandem with various CAS systems for the purposes of archiving e-mail data, database files, and a range of other types of content.

While CAS systems are hitting the mark on the whole, there are some minor enhancements that users think could make a good thing even better.

Curtis Damhof, a senior network administrator at St. Peter's Hospital, installed EMC's Centera CAS platform after battling with a DVD archiving system. Like CareGroup, St. Peter's needed access to patient records and medical images and, as expected, has been able to cut its restoration times down from several minutes to seconds.

Damhof's team is happy with all of the features and functions that Centera offers, including the performance benefits, security, and its ability to archive Microsoft Exchange mailboxes and e-mails. It's the management of the platform that he believes needs some tweaking.

"There could be better interoperability and overall management of the system," says Damhof. Specifically, he would prefer more automation.

Another area for improvement is search capabilities. CODA's Hall thinks the indexing and search capabilities of his HP RISS system would be more effective if the parameters were extended beyond the data center.

"One thing I'd like improved is search. If you store a copy of your e-mail on the RISS system and you are in the office, it has a powerful search engine, but it does not allow for a way to search for local, offline copies on your desktops," says Hall.

Both Damhof and Hall admit that their suggestions address minor issues and say their respective vendors have given them what they need to meet their performance and compliance requirements.

One of the last challenges for CAS is to be able to store any type of content, whether it is unstructured data such as files and documents, semi-structured data like e-mails, or structured data in databases, and that process is well underway.

This article was originally published on April 11, 2006