By Jack Fegreus
With margins for storage devices as thin as gold leaf, Quantum and Maxtor are among the HDD manufacturers moving into the value-added space of network-attached storage (NAS). Even HP, which no longer manufactures drives, has moved solidly into the NAS space. While Quantum and Maxtor have initially targeted the SOHO business space, HP is aimed squarely at the IT market for high-end departmental storage servers.
The goal for all NAS vendors is to provide a storage appliance that can be unpacked, plugged in, and used as easily as a toaster. In essence, this means trading raw seething performance for simplicity of implementation and maintenance in what has become a new and more complex class of storage device. As if issues such as System on a Chip electronics, surface metrology, and MEMS were not complex enough, we are now introducing all of the design constraints of network topologies to the realm of data storage.
In the SOHO and home power-user markets, trading I/O performance for ease-of-implementation is a relatively straightforward task. In the RAID-dominated IT market, however, trading performance is a much more complex problem.
To see how HP is solving this problem, Data Storage magazine (a sister publication of InfoStor) benchmarked a Hewlett-Packard SureStore HD Server 4000. The benchmarks are available free and can be downloaded from www.novatechnica.com.
Like any good appliance, the SureStore HD Server 4000 comes nicely self-contained in a single cabinet. The model we tested came with three 9GB Wide SCSI-2 disk drives and an internal HP DAT40 tape drive for its own self-contained backup. The system is also compatible with network backup software from CA and Veritas. The cabinet has slots for six drives, which currently come in 9GB or 18GB capacities. Currently, HP is using IBM and Seagate drives; however, all of these drives are marked, both physically and in firmware, as HP drives.
For what HP dubs a "thin server," the system comes richly configured with 128MB of RAM, which gives it ample capacity for caching data. Driving the SureStore HD Server is a 300MHz RISC processor running a Unix-like embedded OS. Nonetheless, the current iteration of this storage server is tailored explicitly for the Windows market. Networking is done over TCP/IP with the capability to do passthrough user authentication in a Windows NT/2000 domain. Physical networking connections are made over 10/100Mbit Ethernet.
Administrators can configure the server using an LED panel or via the lingua franca of IT, the Web. Navigating a simple and limited set of menus, an administrator can configure or troubleshoot the SureStore HD Server 4000 in minutes. Logical volumes come in any variation of RAID, so long as that variation is RAID 5. As a result, at least three drives are required to create a logical volume, which can be dynamically expanded without having to back up and restore the data. There are no options to tailor stripe size or cache policy (i.e., write back or write through). Remember, the idea is to take the system out of the box, plug it in twice (redundant power supplies), and instantly run a high-availability (RAID-5) storage server.
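The RAID-5-only design keeps the capacity arithmetic simple. As a rough sketch (the function names are illustrative and not part of any HP tool, and the six-drive case assumes all drives sit in a single volume), one drive's worth of space in each volume is given over to parity, and that parity is what allows a failed drive's data to be rebuilt by XOR:

```python
from functools import reduce


def raid5_usable_gb(drive_count, drive_gb):
    """Usable space in a RAID-5 volume: one drive's worth of capacity holds parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (drive_count - 1) * drive_gb


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks; this is how RAID-5 parity is computed."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# The tested configuration: three 9GB drives.
print(raid5_usable_gb(3, 9))    # 18
# A fully populated cabinet of 18GB drives (assumed to form one volume).
print(raid5_usable_gb(6, 18))   # 90

# Two illustrative data blocks plus their parity block, as on a three-drive set.
d0, d1 = b"NAS!", b"RAID"
parity = xor_blocks([d0, d1])
# If the drive holding d1 fails, XOR of the survivors rebuilds it.
print(xor_blocks([d0, parity]) == d1)   # True
```

In other words, the three-drive unit we tested would offer roughly 18GB of usable space before file-system overhead.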
What price performance?
From a systems administration standpoint, the HP SureStore HD Server 4000 certainly accomplishes its goal of trivializing the maintenance of a RAID-5 storage server. Furthermore, by definition, a NAS server places no additional overhead on existing application servers. The only remaining issue is at what cost all of this ease-of-use was purchased.
To answer that question, we used the Nova Technica Disk I/O benchmark to measure streaming read/write throughput and File Load benchmark to simulate query access in a database environment.
Testing was done in a Windows 2000 domain environment. Our client was a 200MHz P6-based Dell OptiPlex GXpro workstation running Windows 2000 Professional. We ran Windows 2000 Advanced Server on our designated file server, which was a dual 266MHz PII-based Dell PowerEdge 2200 Server with 256MB of RAM. More importantly, our network switch was a 3Com Switch 3000. Because we were evaluating a NAS scenario, the network was the limiting factor, and throughput with this switch was up to 100% greater than with others we had tested.
We tested two server-attached RAID-5 subsystems: (1) a high-end nStor 8e subsystem configured with Seagate Cheetah SCSI-2 drives and an i960-based RAID controller; (2) a classic makeshift subsystem configured with an Adaptec 2940 Fast Wide SCSI controller, older Quantum Fireball SCSI drives, and RAID functionality provided by the OS. At the client, we measured I/O throughput on reads and writes, and at the server, we measured client scalability and CPU overhead.
Probably the most surprising result of the testing was the performance of the software-based RAID-5 network share when writing 8KB blocks of data.
Not surprisingly, the hardware-based RAID subsystem from nStor outperformed the competition when performing streaming reads. In a Windows environment, the two dominant read sizes are 8KB, which is the de facto standard for applications software I/O, and 64KB, which is the data transfer size used by systems applications such as SQL Server for internal functions as well as by backup packages to optimize throughput. Typically on Windows NT servers, I/O subsystems exhibit their peak throughput with 64KB data transfers. Interestingly, the HP SureStore HD Server peaked at 16KB transfers.
When measured at the client system, peak throughput for the nStor I/O subsystem was about half of what we had measured locally on the server. Tailored to support exceptionally high rates of I/Os per second, which is ideal in a transactional database environment, the direct-attached i960-based RAID subsystem was overkill in our simple network file-sharing environment. In this scenario, the network was the weakest link. As a result, it was the network that characterized I/O performance.
Much closer to each other in streaming I/O performance were the HP SureStore HD Server with its 40MB/sec SCSI-2 drives and our ad hoc storage system configured using an Adaptec 2940 SCSI adapter, 20MB/sec SCSI drives, and software-based RAID as provided by Windows 2000. Once again, network overhead entered into the equation as a factor in I/O throughput. When compared to the SureStore HD Server, the Windows 2000 server transmitted on average 15% fewer frames when transferring an equal volume of data.
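A back-of-the-envelope calculation shows why the network, rather than the drive interface, sets the ceiling in this kind of test. The sketch below assumes the link runs at the full 100Mbit/sec rate and ignores Ethernet and TCP/IP framing overhead, so real payload rates would be lower still; the function name is illustrative.

```python
def link_ceiling_mb_per_sec(link_mbit_per_sec):
    """Raw line rate in megabytes per second (8 bits per byte, no protocol overhead)."""
    return link_mbit_per_sec / 8.0


ceiling = link_ceiling_mb_per_sec(100)
print(ceiling)  # 12.5

# Drive interface ratings quoted in the article, in MB/sec.
interfaces = {"SureStore SCSI-2 drives": 40.0, "Quantum SCSI drives": 20.0}
for name, rating in interfaces.items():
    # A ratio above 1.0 means the drive interface can outrun the network link.
    print(name, rating / ceiling)
```

Even the slower 20MB/sec interface is rated above what a single 100Mbit link can carry, so differences between the drive interfaces are largely hidden from a network client.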
Just as important as raw throughput is the ability of a storage server to scale as the number of simultaneous users increases. As the number of users trying to access data increases, even in a simple file-sharing scenario, their access patterns tend to follow a distinct distribution pattern. To model this scenario, we used the database simulation option that is part of the Nova Technica File Load benchmark.
In this benchmark, 50% of the I/Os are randomly dispersed over 25% of a large test file to simulate index access. The remaining 50% of the I/O load is then randomly distributed over the remaining 75% of that simulated database file. As database daemon processes are added, they compete intensely for access to the index area. As the number of daemons increases, only a highly effective system caching scheme can take the pressure off the drive's command queue to extend the responsiveness of the I/O subsystem.
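The access pattern described here can be sketched in a few lines. This is not the Nova Technica implementation; it is a minimal illustration that assumes the index region sits at the start of the file and that I/Os are addressed by block number (both are assumptions, as is every name in the code).

```python
import random


def next_block(file_blocks, rng, index_fraction=0.25, index_share=0.5):
    """Pick the next block to access: index_share of I/Os land in the index region."""
    index_blocks = int(file_blocks * index_fraction)
    if rng.random() < index_share:
        return rng.randrange(0, index_blocks)          # simulated index access
    return rng.randrange(index_blocks, file_blocks)    # simulated data access


rng = random.Random(42)          # fixed seed so runs are repeatable
file_blocks = 100_000            # hypothetical test-file size, in blocks
samples = [next_block(file_blocks, rng) for _ in range(100_000)]

index_hits = sum(1 for b in samples if b < file_blocks // 4)
print(index_hits / len(samples))   # close to 0.5
```

Because half of the I/Os are squeezed into a quarter of the file, each index block is hit about three times as often as each data block (0.5/0.25 versus 0.5/0.75), which is why caching that region matters so much as daemons are added.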
The Dell PowerEdge 2200 server with its 256MB of RAM, therefore, had a distinct advantage over the HP SureStore HD Server 4000 in its ability to allocate memory to the system's file cache. Nonetheless, the PowerEdge 2200 running Windows 2000 Advanced Server had many more active services running and placing demands on the system to allocate memory than the HP thin server, which was exclusively dedicated to the function of file sharing. One of the key advantages claimed by NAS vendors is that a thin server with a much more modest configuration (a single 300MHz processor with 128MB of RAM versus dual 266MHz processors with 256MB of RAM) can offer comparable performance with less maintenance.
When we scaled the client load, all three NAS system configurations were able to handle up to 50 simultaneous database clients, each issuing its own stream of several thousand I/O requests, without any difficulty. Interestingly, each NAS system exhibited a performance pattern that paralleled those of the other servers to a remarkable degree. Each system rapidly rose to a peak throughput level at three database daemons. Then, from 4 daemons straight through 50 simultaneous daemons, each of the systems maintained a constant level of throughput and time-sliced the I/O processing load among all of the database daemons to within ±1% accuracy.
Most noteworthy was the performance of the two PowerEdge 2200-based I/O systems. While the throughput for one database daemon accessing the nStor 8e-based storage over the network was approximately 25% greater than the throughput for one daemon accessing the Adaptec/Quantum-based storage system, the throughput of both systems converged at three daemons. This was a clear indication of a network or perhaps server bottleneck. The parallel performance of the HP SureStore HD Server 4000, however, strongly points to that bottleneck being a network-oriented rather than storage-oriented component.
While the total throughput of the HP SureStore HD Server lagged that of the PowerEdge-based systems, the CPU overhead of handling 50 different streams of I/O requests was not insignificant. Processing client I/O requests for 50 simultaneous database daemons imposed a 23% CPU processing overhead factor on the PowerEdge 2200 server.
We also measured the overhead for a steady stream of 8KB writes on the PowerEdge 2200. Writes under software-based RAID have traditionally been a sore spot for Windows NT. Under Windows 2000, the processing penalty was only 7% CPU utilization. This translated into an unexpected throughput performance advantage on RAID-5 writes exhibited by both of the PowerEdge 2200-based systems over the SureStore HD Server.
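Part of the reason RAID-5 writes are costly is the parity update. One common approach for a write smaller than a full stripe is read-modify-write: read the old data and old parity, fold the change into the parity, then write both back, for four disk I/Os per logical write. The sketch below illustrates that general technique only; it is not taken from Windows 2000 or from the HP firmware, and the names are illustrative.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data, old_parity, new_data):
    """New parity after overwriting one data block (read-modify-write)."""
    # XOR out the old data's contribution, then XOR in the new data.
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# Illustrative three-drive stripe: two data blocks and one parity block.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_bytes(d0, d1)

# Overwrite d0 without reading d1.
new_d0 = b"CCCC"
new_parity = small_write_parity(d0, parity, new_d0)

# Same result as recomputing parity from the full stripe.
print(new_parity == xor_bytes(new_d0, d1))   # True
```

The XOR work itself is cheap for the CPU; the extra reads and writes are what typically hurt, which is consistent with the modest 7% utilization we measured.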
In a high-volume transaction-based environment, it would certainly be very hard to justify the SureStore HD Server. Such an enterprise-computing scenario is rightfully the domain of a storage area network (SAN), which creates a fabric of storage devices and systems woven together using high-speed Fibre Channel connections. A SAN fabric really bridges the constructs of direct-attached and network-attached storage.
So, while performance limitations of the HP SureStore HD Server 4000 could be demonstrated against a well-configured and well-tuned Windows 2000 server, it is important to note the context of the environment for which this NAS device is targeted. A NAS file server is intended to be a complete, functional, out-of-the-box solution requiring no additional hardware or software, including NOS licenses. What's more, since NAS devices can be installed in minutes and can be managed from anywhere on a network using a web browser, significant savings can stem from reduced demands on the IT staff when compared with other storage alternatives.