NAS start-ups rely on distributed file systems

By Lisa Coleman

Two network-attached storage (NAS) start-ups offering ATA-based systems are targeting the fixed content—or reference information—market, which is expected to grow 92% per year, according to The Enterprise Storage Group (ESG). However, both Scale8 and Spinnaker Networks face an uphill battle against vendors such as EMC and Network Appliance, which have already staked claims in this market.

"Everybody is saying they're going after the fixed content market. It's a huge market, and it's unplowed ground at this point," says Randy Kerns, an analyst with The Evaluator Group.

Reference information is defined as any digital asset retained for active reference and value, according to the ESG. (See "End users prepare for the next data deluge," InfoStor, March 2003, p. 1.)

Both Scale8 and Spinnaker recently announced NAS systems that target this market by combining cost-effective ATA disks with highly scalable distributed storage/file systems. What will differentiate the two companies is the value that distributed systems provide to users, according to Steve Kenniston, a technology analyst with the ESG.

"Globally distributed file systems and volume management allow for the growth and morphing of additional storage, as well as additional levels of RAID protection," Kenniston explains. "Each time you add a storage node or blade you also add a storage processor. That processor distributes data and carries metadata throughout the whole environment, so you get RAID protection at the disk level and across the storage processors."

Scale8 recently announced the N2200 NAS system, which is built from discrete nodes and scales modularly from a starting configuration of three nodes and 6TB to a maximum of 54 nodes and 108TB. Capacity and file systems can be expanded without downtime because of Scale8's distributed logical volume manager (DLVM).

The DLVM stripes files across storage nodes in a manner analogous to striping a file across multiple disks in a RAID array. The DLVM provides two layers of RAID 5 protection, called dual data protection, which allows the N2200 to provide fault resiliency. For example, in a three-node system with 36 disks, the system can withstand the failure of as many as 14 drives with no data availability interruption.
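The 14-drive figure is consistent with one simple reading of dual data protection, sketched below in Python. The sketch assumes (Scale8 does not confirm this in the article) that each node protects its 12 disks with a local RAID 5 group that tolerates one disk failure, and that the DLVM adds a second RAID 5 layer across nodes that tolerates the loss of one whole node. The function names are hypothetical.

```python
def survives(failed_disks_per_node):
    """Return True if data stays available under the assumed two-layer RAID 5 model.

    failed_disks_per_node: list with the number of failed disks in each node.
    """
    # Assumption: a node's local RAID 5 group absorbs one failed disk;
    # with two or more failed disks, that node's data is unavailable locally.
    lost_nodes = sum(1 for failed in failed_disks_per_node if failed > 1)
    # Assumption: RAID 5 across nodes can rebuild the data of one lost node.
    return lost_nodes <= 1


def max_tolerable_drive_failures(nodes, disks_per_node):
    """Best-case number of failed drives the assumed model can absorb."""
    # One node may lose every disk; each remaining node may lose one disk.
    return disks_per_node + (nodes - 1)


if __name__ == "__main__":
    print(max_tolerable_drive_failures(3, 12))  # 14
    print(survives([12, 1, 1]))  # best case: one full node plus one disk elsewhere
    print(survives([2, 2, 0]))   # two nodes each exceed their local RAID 5 limit
```

Under these assumptions, 14 is a best case: the failures must fall as all 12 drives in one node plus one drive in each of the other two. Two drives failing in each of two different nodes would already exceed what this model tolerates.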

Each storage node includes twelve 185GB ATA drives and a 2.4GHz Xeon CPU. Two models of the N2200 are available: a starting configuration with three nodes and 6TB is priced at $105,000, while a 12TB configuration costs $145,000. The system scales in 6TB increments. Head fail-over software costs $20,000.
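For readers comparing configurations, the quoted figures can be turned into raw capacity and price per terabyte with a few lines of Python. This is a back-of-the-envelope sketch: it uses decimal units (1TB = 1,000GB), the helper names are hypothetical, and the article does not say how the quoted 6TB relates to the raw disk capacity of three nodes.

```python
DISKS_PER_NODE = 12   # from the article
DISK_GB = 185         # from the article


def raw_tb(nodes):
    """Raw disk capacity in decimal TB, before any RAID overhead."""
    return nodes * DISKS_PER_NODE * DISK_GB / 1000


def price_per_tb(price_usd, quoted_tb):
    """List price divided by the capacity quoted in the article."""
    return price_usd / quoted_tb


if __name__ == "__main__":
    print(raw_tb(3))                            # 6.66 raw TB in three nodes
    print(price_per_tb(105_000, 6))             # 17500.0
    print(round(price_per_tb(145_000, 12), 2))  # 12083.33
```

On these numbers, the larger configuration costs roughly 30% less per quoted terabyte than the starting configuration, excluding the separately priced head fail-over software.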

"The DLVM provides compelling price-performance because of off-the-shelf ATA drives and Intel CPUs [coupled with] our distributed system," claims Patrick Rogers, Scale8's COO.

The N2200 can support up to 128 clients (Linux, Solaris, and Irix) with no additional client software. It supports NFS, but not CIFS (which will be included in a follow-on product). About 80% of Scale8's target market uses NFS for large data archives, according to Rogers.

Spinnaker's distributed file system

Spinnaker is also banking on a distributed file system. Last year, the company introduced its first NAS product, the SpinServer 3300, which consolidates file services by non-disruptively scaling a single file system from 1TB to 11,000TB.

Spinnaker's two-stage distributed file system, SpinFS, allows users to create one or more global file systems that can span hundreds of geographically dispersed SpinServers connected via LAN, MAN, or WAN. The system enables all the servers to be managed as a single resource from a single management console.

SpinFS allows configurations to be scaled non-disruptively. For example, a single file system can be scaled across an entire cluster—connected over switched Gigabit Ethernet—without requiring changes to user shares, mount points, or namespaces. SpinServer supports NFS and CIFS.

Last month, Spinnaker added ATA support to its SpinServer 3300, with plans to target not only the fixed content market but also online backup, recovery, and data archival.

One of Spinnaker's customers chose SpinServer primarily for its flexibility, says Frederick Zarndt, CTO of iArchives, which is using SpinServers for its software that converts content into a customized database searchable over the Internet or an intranet. Once content is scanned, the iArchives software indexes the text, digitizes images, attaches any needed metadata, and stores the records in a database.

The company is currently scanning newspapers dating from 1885 for the Dallas Morning News. The content will equal about 1.5 million newspaper pages, with an average of 2MB per image.
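A rough estimate of the image data this project generates follows directly from those two figures. The Python sketch below assumes decimal units (1TB = 1,000,000MB) and counts only the page images, not indexes, metadata, or database overhead; the function name is hypothetical.

```python
def archive_size_tb(pages, avg_mb_per_page):
    """Estimated image data in decimal TB."""
    return pages * avg_mb_per_page / 1_000_000


if __name__ == "__main__":
    print(archive_size_tb(1_500_000, 2))  # 3.0
```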

iArchives purchased three SpinServers and 8TB. The company did consider a SAN, but it was too expensive and complicated, according to Zarndt. After evaluating storage from EMC, Hewlett-Packard, IBM, Network Appliance, and Procom, iArchives chose the Spinnaker system because of its distributed file system, clustering, and fail-over capability.

"The solution is both archival and fixed content," says Jeff Tabor, senior product manager for Spinnaker. "They want high performance for half of the system when they are scanning documents, but then when it's not being referenced, they can demote it to less expensive ATA storage."

SpinServer 3300 can connect to various types of storage, including ATA, the Spinnaker SpinStor Fibre Channel array, or SAN systems. Today, it supports LSI's SAN array, and Spinnaker is working on support for other arrays.

The 3300, including SpinFS and clustering software, will be bundled with ATA storage for about $50,000. Mirroring and server fail-over software are sold separately.

This article was originally published on April 01, 2003