EMC adds searching, chargeback to Centera

By Heidi Biggar

It's a well-known fact that a backup application is only as good as its recovery capabilities: How quickly and easily (if at all) can data be restored to users when needed? Similarly, a data archive is only as good as its ability to locate, recover, and leverage the data it contains across the enterprise.

With the introduction of Centera Seek and Centera Chargeback Reporter this week, analysts say EMC has taken significant first steps to help users realize the true potential of a data archive. The two products enable users to perform more detailed searches and queries of Centera repositories and make it easier to charge back individual lines of business (LOBs) or departments for actual capacity usage.

EMC Centera competes with data archive products from vendors such as Archivas (Arc), IBM (DR550), Hewlett-Packard (RISS), and Permabit (Permeon), among others.

"[Providing] visibility into both structured and unstructured data assets within the Centera device and outside the device across LOBs or an extended enterprise is a big deal," says William Hurley, senior analyst, applications and software infrastructure, at the Enterprise Strategy Group (ESG) consulting firm about the EMC Centera announcement.

For the search capability, EMC turned to FAST, a vendor of real-time search and filter technologies. FAST's InStream Index and Query Engine, which EMC has renamed Centera Seek, provides a software layer on top of which EMC or third-party applications, or "applets," can run.

"Centera's use of rich metadata distinguishes the platform [from its competitors'] and provides the necessary 'hooks' for FAST's search technology, [which can further] improve finding data quickly, accurately, and intelligently anywhere in the archive," says Hurley.

Chargeback Reporter is the first in what is expected to be a series of applets from EMC that will improve manageability of the Centera platform. In addition to chargeback, EMC has identified data purge management, legal discovery, capacity utilization reporting, archive virus scanning, and content-driven alerting as potential applet candidates.

"The reason we're putting Centera Seek into our environment is so we can enable a new family of applets right out of the box. [We chose Chargeback Reporter] first because it is what customers wanted the most," says Sean Lanagan, director of Centera product management and emerging markets at EMC.

Lanagan says that while users could previously do chargeback with the Centera platform, it often required a lot of manual scripting. Because Centera looks at the metadata, not the physical or logical configurations of the storage array or file system, Lanagan says users have much more flexibility in the way they do chargeback than they do with traditional storage resource management (SRM) tools.

"Because you're doing chargeback based on the content of the data itself (i.e., the metadata), if one day you want to do chargeback based on the IP addresses of the storage in the Centera and the next day you want to do it based on the application names, you just generate a new report. You don't have to reconfigure anything on the back-end," explains Lanagan.

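As a rough illustration of the flexibility Lanagan describes, the Python sketch below groups the same archive records by whichever metadata field is chosen at report time. The record layout and field names ("app", "ip", "bytes") are hypothetical and are not drawn from Centera's actual metadata schema or API.

```python
from collections import defaultdict

# Hypothetical archive records: each object carries metadata plus a size in bytes.
# The field names are illustrative assumptions, not Centera's schema.
RECORDS = [
    {"app": "email", "ip": "10.0.0.1", "bytes": 500},
    {"app": "email", "ip": "10.0.0.2", "bytes": 300},
    {"app": "imaging", "ip": "10.0.0.1", "bytes": 1200},
]


def chargeback_report(records, key):
    """Sum bytes consumed per distinct value of the chosen metadata field."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec["bytes"]
    return dict(totals)


# Same data, two different reports; only the grouping key changes.
by_app = chargeback_report(RECORDS, "app")
by_ip = chargeback_report(RECORDS, "ip")
```

Switching from a per-application report to a per-IP-address report changes only the key passed in, with nothing to reconfigure in the underlying records.
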
Users can also upload data from Centera Chargeback Reporter (retrieved using Centera Seek) to third-party SRM applications. However, if users want to execute specific SRM features (which are not available as Centera applets), then the SRM package would have to be integrated first with the Centera API, says Lanagan.

As for plans for a full line of SRM applets, Lanagan says that while SRM and data archiving overlap, EMC's development work boils down to intelligence and management. "You can call it SRM, but what you're really doing is managing the archive," he says.

"Archival solutions today maintain terabytes--or even petabytes--of data accumulated over a span of time, frequently from multiple applications representing multiple business lines, which makes locating and recovering data in these archives like finding a needle in a haystack," says ESG's Hurley.

Centera Seek supports query API access from either scripts or applications (C++, Java, .NET, http). The software runs on a stand-alone Dell server, which is placed out of the data path to minimize the I/O impact. The software regularly polls the Centera system for new or deleted data and keeps a record of changes in an index.
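
The polling behavior can be pictured with a short Python sketch. It assumes, purely for illustration, that each poll yields a snapshot mapping object IDs to metadata; the function name and data shapes are hypothetical and do not reflect how Centera Seek is actually implemented.

```python
def apply_poll(index, snapshot):
    """Bring an out-of-band index in line with the latest poll.

    index and snapshot both map object ID -> metadata (hypothetical shapes).
    Returns the IDs added and deleted since the previous poll.
    """
    added = sorted(set(snapshot) - set(index))
    deleted = sorted(set(index) - set(snapshot))
    for oid in added:
        index[oid] = snapshot[oid]
    for oid in deleted:
        del index[oid]
    return added, deleted


# The index lives on the separate search server, outside the data path.
index = {"obj-a": {"app": "email"}, "obj-b": {"app": "imaging"}}
snapshot = {"obj-b": {"app": "imaging"}, "obj-c": {"app": "email"}}
added, deleted = apply_poll(index, snapshot)
```

Because only the differences are applied, queries run against the index rather than against the archive itself.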

Centera Chargeback Reporter reports on bytes written and consumed, includes automated scheduled reports, and can export data via XML to third-party applications. The reports it generates can be used for trend analysis.
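
What an XML export of such a report might look like can be sketched with Python's standard library. The element and attribute names here ("chargeback", "entry", "owner") are invented for illustration; the article does not describe the actual export schema.

```python
import xml.etree.ElementTree as ET


def report_to_xml(totals):
    """Serialize a mapping of owner -> bytes consumed as a small XML document.

    The element and attribute names are hypothetical, not EMC's format.
    """
    root = ET.Element("chargeback")
    for owner, nbytes in sorted(totals.items()):
        entry = ET.SubElement(root, "entry", owner=owner)
        entry.text = str(nbytes)
    return ET.tostring(root, encoding="unicode")


# Example totals in bytes, keyed by line of business (made-up values).
xml_text = report_to_xml({"imaging": 1200, "email": 800})
```

A third-party application could then parse the document and load the per-owner totals into its own reports.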

Centera Seek is priced from $4,000 for four Centera nodes, while Centera Chargeback Reporter is priced at $1,000 for four nodes. (Seek is required in order to run Chargeback Reporter.)

This article was originally published on March 01, 2005