By Heidi Biggar
Next month, Hewlett-Packard is expected to ship its StorageWorks Scalable File Share (SFS) to technical customers in high-performance computing (HPC) environments. HP SFS is based on the open-source Lustre file system, which was jointly developed by Cluster File Systems, HP, and the U.S. Department of Energy (DOE) to address data-sharing and management issues in the DOE's massive Linux cluster installations.
"While it is fairly easy to scale the computational side of Linux clusters by adding nodes, the limiting factor [in terms of scalability] is how fast the system can do I/O," says Kent Koeninger, product marketing manager in HP's high-performance technical computing division.
Koeninger says that while users can scale Linux clusters by adding lots of compute nodes, at some point they'll end up with a performance bottleneck. "HP SFS gives users the capacity and bandwidth for technical computing. How much performance they get depends on how big they build the file server."
The file server is expanded in grid fashion by adding nodes, which HP calls "smart cells" (see figure). Lustre virtualizes these cells into a single, sharable file system; files are stored across one or more cells depending on their size. Small files are typically stored in a single cell, while large files are striped across many. Information about the data stored in these cells is kept in separate "metadata smart cells," which are used to locate and access files.
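As a rough illustration of the layout described above, here is a toy Python sketch of striping files across cells with a separate metadata record. The stripe size, the threshold, and all names are our own illustrative assumptions, not details of HP's or Lustre's actual implementation:

```python
# Toy sketch: a metadata record tracks which cells hold a file's stripes.
# Small files land on one cell; large files are striped across several.
STRIPE_SIZE = 1 * 2**20  # 1 MiB per stripe (assumed, for illustration only)

def place_file(name, size_bytes, num_cells, metadata):
    """Record which cells hold each stripe of the file."""
    stripes = max(1, -(-size_bytes // STRIPE_SIZE))  # ceiling division
    # Round-robin the stripes across cells, starting from a hash of the name
    start = hash(name) % num_cells
    cells = [(start + i) % num_cells for i in range(min(stripes, num_cells))]
    metadata[name] = cells  # the "metadata cell" role: name -> cell list
    return cells

md = {}
place_file("small.dat", 4096, num_cells=8, metadata=md)      # fits in 1 cell
place_file("big.dat", 50 * 2**20, num_cells=8, metadata=md)  # striped over 8
```

A client would consult the metadata record first, then read stripes from the listed cells in parallel, which is where the aggregate bandwidth comes from.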
In a recent InfoStor column, Jacob Farmer, chief technology officer at Cambridge Computer, discussed the issues surrounding building sharable Linux clusters using conventional file-sharing technology such as NAS (see "What are the options for cluster-based file sharing?", May 2004, p. 14).
"The compute power of clusters grows linearly as you add CPUs," said Farmer. "The catch is that conventional file-sharing technology is not scalable. In fact, it is common to see a diminishing return as you add nodes due to bottlenecks related to storing and retrieving data."
Farmer gave the example of a 1,000-node cluster being serviced by an NFS file server or NAS appliance able to sustain a throughput of 50MBps. "If you divide that by 1,000 nodes on a cluster, you get an average of 400Kbps per node. In other words, for any given node in the cluster you've got performance on the order of dial-up or DSL!"
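Farmer's arithmetic checks out; the bytes-to-bits conversion is what turns 50MBps into 400Kbps per node:

```python
# Worked version of Farmer's example: one NFS/NAS head shared by 1,000 nodes
server_throughput_MBps = 50   # sustained server throughput (megabytes/sec)
nodes = 1000                  # cluster size

# 1 MBps = 8,000 Kbps, so convert to kilobits before dividing per node
per_node_Kbps = server_throughput_MBps * 8000 / nodes

print(per_node_Kbps)  # 400.0 -- dial-up/DSL territory, as Farmer notes
```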
HP SFS, in contrast, can deliver tens of gigabytes per second of bandwidth to hundreds or thousands of terabytes of storage, according to HP. Users simply add smart cells for more capacity, and the more cells data is striped across, the better the performance.
On average, each smart cell provides about 100MBps of throughput and 2TB to 3TB of capacity. "If you wanted 1GBps throughput, you'd add 10 smart cells," explains Koeninger. "HP SFS can give you as much bandwidth as you can afford."
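Koeninger's sizing rule can be expressed as a small, hypothetical calculator based on the per-cell figures he cites. The function name and the conservative 2TB capacity figure are our assumptions:

```python
import math

# Per-cell figures from the article: ~100 MBps and 2-3 TB per smart cell
CELL_THROUGHPUT_MBPS = 100
CELL_CAPACITY_TB = 2  # conservative end of the 2-3 TB range

def cells_needed(target_MBps=0, target_TB=0):
    """Smart cells required to meet both bandwidth and capacity targets."""
    by_bandwidth = math.ceil(target_MBps / CELL_THROUGHPUT_MBPS)
    by_capacity = math.ceil(target_TB / CELL_CAPACITY_TB)
    return max(by_bandwidth, by_capacity)

print(cells_needed(target_MBps=1000))  # 10 cells for 1 GBps, per Koeninger
```

Whichever requirement is larger, bandwidth or capacity, dictates the cell count, which is why HP frames the sizing as "as much bandwidth as you can afford."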
But HP is not alone in its efforts to provide a scalable, sharable file system.
"There's no doubt customers need a means to harness the massive amounts of data generated today," says Ajay Anand, director of marketing for SGI's InfiniteStorage product line, "and our CXFS shared file system has been doing that since 1999."
SGI claims it has installed the CXFS file system in hundreds of production environments, including HPC sites.
Last month, for example, SGI announced that the National Center for Supercomputing Applications (NCSA) had installed CXFS.
In addition to SGI, a variety of other file-sharing technologies exist from vendors such as ADIC, IBM, Isilon, Microsoft, Panasas, Red Hat (as a result of its acquisition of Sistina), Sun, and Veritas.
Although these offerings differ in type (e.g., distributed, clustered, global), in what they aggregate (disks or servers), and in access method (block or file), they all share the common goal of giving users shared access to data quickly and easily.
According to SGI's Anand, while distributed file systems such as Lustre or Panasas' ActiveScale File System might offer "unlimited" scalability in terms of the number of systems that can participate, shared file systems such as CXFS offer "true" scalability of bandwidth, capacity, etc.
"With shared file systems, 100% of resources (capacity, bandwidth, etc.) are available to applications," says Anand. "In contrast, a percentage of each system in a distributed file system is used to maintain communication, which means that as the number of systems goes up, each system's capabilities (I/O bandwidth, cycles, etc.) go down."
As for side-by-side comparisons of file systems, it depends on who you talk to.
"I think the main story here is strategic rather than technical," says Charles King, research director at The Sageza Group. "HP has always owned a considerable piece of the HPC market but has been hammered lately by IBM in this space. SFS sweetens HP's HPC offerings."
Natalya Yezhkova, a senior research analyst at International Data Corp., says that while the HP and IBM architectures share some similarities (e.g., both use metadata and data storage modules), they are very different in other respects. The net result, she says, is two distinct value propositions: data availability and simplified/centralized management for IBM, and high bandwidth and scalability for HP.
"The two products target different markets," says Yezhkova. "IBM targets the enterprise, while HP targets technical computing environments—at least for now."
(For more information about file-sharing options for HPC and other types of environments, see "New file-sharing options address user demands," InfoStor, April 2004, p. 8.)
In its first release, HP SFS integrates ProLiant DL360 servers and EVA3000/6000 or lower-cost Serial ATA-based MSA20 StorageWorks disk arrays. The company says it plans to support additional StorageWorks and ProLiant hardware in the future.
HP has no plans to integrate the file system with non-HP storage; however, the company does expect to extend support to Windows and Unix server environments and eventually to block-level storage. No time frame was provided for these enhancements. The file system currently supports a wide variety of clients, including non-HP Linux clusters; it also supports NFS, which means that stored data is also accessible by non-Lustre clients.