By Heidi Biggar
While larger and more established vendors are knee-deep in visions of utility computing, start-up ExaGrid is making final preparations for the release of its Grid Protected Storage architecture, a framework analysts say creates a utility-computing-like environment for a variety of integrated storage services.
"The idea is to create a 'grid' of computing where the application can ask for and receive the compute, storage, and network resources it needs so it can deliver the performance required to meet stated service level agreements [SLAs]," explains Arun Taneja, founder of The Taneja Group consulting firm.
ExaGrid, whose management team includes former executives from various storage companies (notably the now-defunct StorageNetworks), claims to be the first vendor to apply the concept of grid computing to storage, in particular to disk-based backup, disaster recovery, hierarchical storage management (HSM), and data archival.
Not surprisingly, ExaGrid is also positioning its architecture as an information life-cycle management (ILM) tool. (See "The ABCs of ILM" on p. 1.)
While the company claims some traction at the department level, it says its primary focus is on the midrange market, specifically organizations with $100 million to $1 billion in revenues and with large amounts of file data. Specific vertical market targets include the healthcare, biotech, financial services, insurance, manufacturing, and engineering industries.
"We interviewed more than 150 companies and found that reliable data recovery is still a major problem for mid-tier users," says David Therrien, chief technology officer at ExaGrid. Users cited issues with complexity (multiple products are required to store and protect data), reliability (restore attempts often fail), scalability (data-protection products are not keeping pace with data demands), and cost (data is being over-replicated and/or under-protected).
ExaGrid's Grid Protected Storage platform reportedly provides end-to-end backup and restore, full and immediate disaster recovery, file-corruption protection, on-site/off-site vaulting, automatic data migration/HSM, archiving, and ILM in a shared, self-healing, automated environment.
According to Taneja, "The best way to visualize its functionality is to think of a NAS [network-attached storage] filer; combine it with NearStore for backup and archiving; and add HSM for migration of files based on policies, self-management so there are no volumes or LUNs to manage, content-addressed storage features [such as data authenticity, integrity checking, and location-independent storage], disk-based fast restores at a file level but with versioning, remote replication, and disaster-recovery tools."
ExaGrid's architecture includes two hardware components: GRIDfilers, which are 1U NAS servers, and GRIDdisks, which are 1U servers with about one terabyte of ATA storage capacity each. These components form an inter-networked grid computing storage environment (see figure).
The GRIDfilers serve as NAS front-ends, or "gateways," to attached GRIDdisk repositories and are the primary storage components of the overall grid architecture. The filers support the NFS, CIFS, and FTP protocols; Gigabit Ethernet; and applications such as Legato NetWorker, Veritas NetBackup and Backup Exec, and Oracle RMAN.
Proprietary software "virtualizes" the GRIDdisks into a single pool of storage that can be shared both locally and remotely. The repositories can scale to hundreds of terabytes of capacity and provide the data management functionality of traditional backup and replication servers, storage devices, and media, according to the company. And, like an increasing number of storage products, the filers use a content-addressed storage (CAS) technique to identify data.
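For readers unfamiliar with content-addressed storage, the general idea can be sketched in a few lines. The article does not describe how ExaGrid implements it, so the following is only a generic illustration: the `ContentStore` class, its method names, and the choice of SHA-256 are all assumptions made for this example. Data is stored under an address derived from its own contents, so identical data is kept once and corruption can be detected on read.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressed store (hypothetical, for illustration)."""

    def __init__(self):
        self._blobs = {}  # maps address (hex digest) -> stored bytes

    def put(self, data: bytes) -> str:
        # The address is computed from the content itself (SHA-256 is an
        # assumption here, not a detail taken from the article).
        address = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same address, so it is stored once.
        self._blobs.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Integrity check: re-hash the stored bytes and compare to the address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content no longer matches its address")
        return data

    def __len__(self) -> int:
        return len(self._blobs)
```

Because the address depends only on the content, the same file written twice yields the same address, and the location of the bytes on any particular disk is irrelevant to the caller.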
A policy engine allows users to assign protection policies to the data for both local/remote storage and data migration (most frequently used files are cached locally in the GRIDfiler while least-accessed files are moved to the repository). Policies can also be set so that infrequently accessed data on storage area network (SAN) volumes is automatically moved to lower-cost GRIDdisk as needed.
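An age-based migration policy of the kind described above might look something like the following sketch. It is not ExaGrid's policy engine: `FileRecord`, `MigrationPolicy`, the tier names, and the idle-time threshold are hypothetical and chosen only to show the idea of keeping recently used files on the filer and moving idle ones to the repository.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    last_access: datetime
    tier: str = "filer"  # "filer" (local cache) or "repository" (hypothetical names)


@dataclass
class MigrationPolicy:
    max_idle: timedelta  # hypothetical threshold: how long a file may sit unused

    def apply(self, files, now):
        """Move files idle longer than max_idle to the repository tier.

        Returns the paths that were migrated on this pass.
        """
        moved = []
        for record in files:
            idle_for = now - record.last_access
            if record.tier == "filer" and idle_for > self.max_idle:
                record.tier = "repository"
                moved.append(record.path)
        return moved
```

A scheduler would call `apply` periodically; a real engine would presumably also handle recalls, per-directory rules, and remote copies, none of which are modeled here.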
Other software-enabled features include replication; file versioning, which allows users to perform point-in-time restores; data compression and compaction for optimum space usage; and long-term content protection, continuous content checking, and auto file repair.
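As a rough illustration of how file versioning enables point-in-time restores, the sketch below keeps every saved version with its timestamp and returns the newest version at or before a requested time. The `VersionedFile` class and its methods are assumptions made for this example; the article does not say how ExaGrid stores versions.

```python
from bisect import bisect_right


class VersionedFile:
    """Toy version history for a single file (hypothetical, for illustration)."""

    def __init__(self):
        self._times = []     # save timestamps, kept in ascending order
        self._contents = []  # content saved at the matching timestamp

    def save(self, timestamp: float, content: bytes) -> None:
        if self._times and timestamp < self._times[-1]:
            raise ValueError("versions must be saved in time order")
        self._times.append(timestamp)
        self._contents.append(content)

    def restore(self, as_of: float) -> bytes:
        # Find how many versions were saved at or before the requested time.
        count = bisect_right(self._times, as_of)
        if count == 0:
            raise LookupError("no version exists at or before that time")
        return self._contents[count - 1]
```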
According to ExaGrid officials, the architecture has been implemented at 14 beta sites, including CuraGen (a pharmaceutical company), Massachusetts General Hospital, The First Years (a marketer of parenting products), and two unidentified financial and technology firms.
In the case of CuraGen, the product was implemented to address data-protection issues with its distributed storage environment. Mass General, on the other hand, was looking to address issues with data archival. The First Years is using the system to replicate data between its London and Massachusetts offices.
The unidentified financial services firm reportedly installed the product as a way to lower overall SAN costs by moving infrequently accessed data off its SAN and onto ExaGrid NAS. The technology firm, meanwhile, implemented the architecture to address efficiency and cost problems with its existing tape-based backup environment.
According to ExaGrid officials, the Grid Protected Storage architecture will compete against a variety of vendors and product categories, including vendors of traditional backup software, disk-to-disk and disk-to-disk-to-tape hardware/software vendors, and "rapid-recovery" vendors.
"You might say we compete with Veritas, but we're also complementary," says Therrien. "We offer grid filers that are similar to EMC filers, but our filers could also serve as gateways to their filers. We also compete with disk-to-disk backup/recovery vendors, and to a degree with EMC's Centera, although we're not focused on the fixed-content market."
ExaGrid expects to begin shipping the product late in the first quarter; pricing has not yet been set.