MPC’s DataFRAME 420, with software from LeftHand Networks, unifies Fibre Channel and iSCSI SAN technologies, eliminates single points of failure, and creates a single point of management.
By Jack Fegreus
Last year, the volume of external disk storage grew 57.7%. However, measured in revenue growth, rather than capacity growth, the year was at best a modest success for vendors as revenues rose by only about 5%. Only in networked storage, both SAN and NAS, did revenue increases reach double digits (16.7%), according to International Data Corp. (IDC).
However, the big buzz continues to hover over what still commands a mere 2% of the total networked storage market: iSCSI. Nonetheless, iSCSI’s growth rate of 22% in the first quarter of 2005 makes it difficult to dismiss this sector of the market as irrational enthusiasm. This goes a long way to create an interesting backdrop for the MPC DataFRAME 420, a storage server fully capable of supporting Fibre Channel SAN connectivity, yet primarily marketed as an IP-SAN server.
As an IP-SAN server, the DataFRAME 420 goes beyond iSCSI and adds a protocol dubbed Ethernet Block Storage Device (EBSD). It is also important to note that for Fibre Channel access, it is necessary to install a QLogic host bus adapter (HBA). MPC has preloaded drivers for QLogic HBAs into the DataFRAME operating system. For our tests, we installed a dual-ported QLA2342 HBA into each DataFRAME that we tested.
The MPC DataFRAME 420 is an intriguing product on a number of dimensions. First, it is built from the ground up as a powerful Linux server. The operating system installed by MPC is a streamlined version of Mandriva Linux (a.k.a. Mandrake).
The base hardware is an Intel OEM-targeted Storage System SSR316MJ2, which is dubbed a storage system module (SSM) and sports dual Xeon CPUs, dual Gigabit Ethernet ports, and Serial ATA (SATA)-based RAID. On this foundation is Intel’s Storage Server Console, which is underpinned by the Java-based SAN/iQ software package from LeftHand Networks. The Storage Server Console is intended to simplify the issues that increase the complexity of SAN fabrics. That combination of hardware and software provides the basis of a powerful SAN storage server, as well as the makings of a grid-versus-cluster controversy.
To ensure fail-over, resiliency, and performance in a traditional SAN fabric, each server should be able to reach each storage node via two independent paths. To minimize latency, those paths should incur a minimum number of hops between Fibre Channel switches.
As a result, the topology of a SAN fabric goes a long way to defining the resiliency of the SAN to faults and disruption. Naturally, the equipment costs to add multiple controllers to arrays and multiple HBAs to hosts will be pivotal when choosing the level of redundancy supported by a SAN topology. All of that equipment can get very costly very quickly: The extra switches, HBAs, and cables that are needed to create an optimal fabric topology for a small SAN can add $25,000 to $50,000 to the cost of installing a basic SAN.
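The two-independent-paths requirement can be checked mechanically. The following is a minimal sketch, not part of any SAN tool: the fabric layout, the node names, and the `survives_single_failure` helper are all hypothetical. It models a fabric as an undirected graph and tests whether a host can still reach a storage node after any one intermediate component (HBA port, switch, or controller) fails.

```python
from collections import deque

def build_graph(links):
    """Build an undirected adjacency map from (a, b) link pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def reachable(graph, src, dst, removed=None):
    """Breadth-first search from src to dst, ignoring the removed node."""
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in graph.get(node, ()):
            if nxt != removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def survives_single_failure(graph, host, storage):
    """True if no single intermediate node cuts the host off from storage."""
    middle = [n for n in graph if n not in (host, storage)]
    return reachable(graph, host, storage) and all(
        reachable(graph, host, storage, removed=n) for n in middle
    )

# Hypothetical fabrics: one with dual HBAs, switches, and controllers,
# and one with a single path end to end.
dual_path = build_graph([
    ("host", "hba0"), ("host", "hba1"),
    ("hba0", "sw0"), ("hba1", "sw1"),
    ("sw0", "ctrl0"), ("sw1", "ctrl1"),
    ("ctrl0", "array"), ("ctrl1", "array"),
])
single_path = build_graph([
    ("host", "hba0"), ("hba0", "sw0"), ("sw0", "ctrl0"), ("ctrl0", "array"),
])

print(survives_single_failure(dual_path, "host", "array"))    # True
print(survives_single_failure(single_path, "host", "array"))  # False
```

Every extra node in the dual-path fabric is a piece of equipment that has to be bought and cabled, which is where the added cost of a resilient topology comes from.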
While such costs initially appear steep, they are dwarfed by long-term management costs. In analyzing the TCO of a storage infrastructure, the Gartner IT consulting firm pegs the soft costs of managing storage at three to five times more per gigabyte than the hard costs of purchasing the storage devices. That brings us back to the raison d’être of a SAN: reducing the complexity and cost of storage management.
Ideally, a SAN provides the means to manage all physical storage volumes from a single management interface. While a single MPC DataFRAME 420 can easily be set up on a SAN, the real strength of the software that comes bundled with the system is its ability to aggregate multiple storage servers.
In fact, when only one DataFRAME is installed, an administrator must still follow the steps required for multiple servers. In particular, an administrator will be required to create a management group and a cluster of one.
The Storage Server Console sees the world of SAN storage as existing within a distinct hierarchy of:
- Management groups: Collections of SSMs, within which one or more are designated as managers;
- Clusters: Sub-groups of SSMs within a management group that form a storage pool within that management group;
- Volumes: Data storage partitions created in a cluster and exported as logical disks; and
- Snapshots: Read-only copies of a volume created at a specific point in time.
Under this scheme, a management group integrates a collection of distributed SSMs and makes the group appear as a single, large “virtualized” computing system.
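The hierarchy is easier to see as a data model. The sketch below is purely illustrative: the class names, method names, and SSM labels are assumptions made for this example and do not reflect the actual SAN/iQ object model or API. It walks through the single-unit case, in which an administrator still creates a management group and a cluster of one.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Volume:
    """A logical disk carved out of a cluster's storage pool."""
    name: str
    size_gb: int
    snapshots: List[str] = field(default_factory=list)

    def take_snapshot(self, label: str) -> str:
        # A snapshot is a read-only, point-in-time copy of the volume.
        snap = f"{self.name}@{label}"
        self.snapshots.append(snap)
        return snap

@dataclass
class Cluster:
    """A sub-group of SSMs that forms one storage pool."""
    name: str
    ssms: List[str]
    volumes: List[Volume] = field(default_factory=list)

    def create_volume(self, name: str, size_gb: int) -> Volume:
        vol = Volume(name, size_gb)
        self.volumes.append(vol)
        return vol

@dataclass
class ManagementGroup:
    """A collection of SSMs, one or more of which act as managers."""
    name: str
    ssms: List[str]
    managers: List[str]
    clusters: List[Cluster] = field(default_factory=list)

    def create_cluster(self, name: str, ssms: List[str]) -> Cluster:
        # A cluster may only contain SSMs that belong to this group.
        unknown = [s for s in ssms if s not in self.ssms]
        if unknown:
            raise ValueError(f"SSMs not in management group: {unknown}")
        cluster = Cluster(name, list(ssms))
        self.clusters.append(cluster)
        return cluster

# Single-unit case: a management group and a cluster of one.
group = ManagementGroup("lab", ssms=["ssm-a"], managers=["ssm-a"])
pool = group.create_cluster("pool-1", ["ssm-a"])
vol = pool.create_volume("vol-1", 200)
print(vol.take_snapshot("2005-08-01"))  # vol-1@2005-08-01
```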
SSMs on the network can be dynamically added to or deleted from a management group, which creates a federated approach to data integration, pooling, and sharing. As a result, administrators can consolidate storage into one or more data services, dubbed “clusters,” that hide all of the complexities of data location, local ownership, and infrastructure from the system.
If that description sounds like a cluster, welcome to the world of academic debates. While a cluster may indeed provide the most basic functionality of a grid, the technologies behind grids and clusters are worlds apart. As a result, there is considerable debate as to whether a local computer cluster should be classified as a grid.
There is solid agreement, however, that clusters are conceptually similar to grids. Both are dependent on middleware to provide the virtualization necessary to make multiple networked computer systems appear as a single system to users. Nonetheless, clusters are hardware-centric, while grids are network software-centric.
The underlying construct of network-centric grids and Web services is a Service Oriented Architecture (SOA). In an SOA, applications are built using distributed components dubbed services. The functionality of each service is exposed via a standards-based interface.
Under Windows XP, SATA drive throughput averaged 233% greater than Ultra100 ATA drive throughput. Since programs on Windows typically perform 8KB I/Os, desktop I/O is on the order of 14MBps and workstation I/O is 35MBps.
For the DataFRAME 420, the Storage Server Console, which can be installed on a workstation running either Linux or Windows, is the means by which services resident on the SSMs are discovered and accessed. Given that the Storage Server Console has its genesis in SAN/iQ software from LeftHand Networks, it’s no surprise that this software is network-centric and focused on IP-SAN technology. As a result, the DataFRAME 420 should be viewed as an IP-SAN server that supports Fibre Channel.
Via the Storage Server Console, MPC’s DataFRAME comes tantalizingly close to the utopian vision of the perfect SAN device. An administrator can log into a management group without logging into the individual SSMs and gain access to all of the global configuration parameters, including the configuration of each cluster or storage pool.
The level of abstraction in the Storage Server Console, however, can be initially distracting. Provisioning storage begins once the underlying RAID architecture is set on each DataFRAME. Here minimalism rules. The administrator is given an explicit set of choices for RAID configuration: RAID 5/50, RAID 1/10, or RAID 0. It is the system and not the administrator that decides whether to configure mirrors for RAID 50 or 10. That decision is based on the number of drives present in the SSM. Given this level of automation, it is not surprising to find that there are no options available to explicitly tune the RAID architecture.
Using Fast (100Mbps) Ethernet and an EBSD volume, our oblFilePerf benchmark easily sustained throughput at 11.1MBps, which is on the order of a direct-attached ATA drive.
Nonetheless, what is lost in low-level configuration management is more than compensated for in high-level functionality. Each logical volume created in a cluster can be exposed and fully virtualized for host systems over either Fibre Channel or Ethernet. That capability alone should move the MPC DataFRAME onto a lot of short lists.
There are, however, even more services and features for ensuring reliability, accessibility, and scalability (RAS): storage services that should be equally intriguing for any CIO looking to consolidate storage. These added services are delivered through three option packs: the Configurable Snapshot Pack, Scalability Pack, and Remote Data Protection Pack. Using the Configurable Snapshot Pack, an administrator can automatically schedule volume snapshots. Without this option pack, snapshots must be triggered manually.
The functional differentiator of the DataFRAME 420, however, really comes into focus with the addition of the Scalability Pack and Remote Data Protection Pack. Using the Scalability Pack, administrators can create clusters of multiple units within a management group. Each cluster can then provide a unified RAID storage pool across multiple SSMs.
Replication of volumes among SSMs can be configured as either two-way or three-way. With such replication, if one DataFRAME 420 were to fail, the data would still be available on the network. Finally, with the Remote Data Protection Pack, an administrator can maximize business continuity by pushing backup snapshots to a remote volume on another SSM that can be physically located at another location.
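Replication trades capacity for availability, and the trade-off is simple arithmetic. The sketch below rests on assumptions that the review does not confirm: that N-way replication stores every block on N different SSMs, that all SSMs in the cluster have equal capacity, and that RAID overhead inside each SSM has already been subtracted. The function name and the capacity figures are hypothetical.

```python
def replicated_pool(ssm_capacities_gb, level):
    """Estimate usable capacity and failure tolerance for N-way replication.

    Assumes each block is stored on `level` different SSMs, so the pool
    holds raw / level of unique data and survives level - 1 SSM failures.
    """
    if not 1 <= level <= len(ssm_capacities_gb):
        raise ValueError("replication level must be between 1 and the SSM count")
    raw = sum(ssm_capacities_gb)
    return {
        "raw_gb": raw,
        "usable_gb": raw / level,
        "ssm_failures_tolerated": level - 1,
    }

# Three hypothetical SSMs of 3,000GB each.
for level in (1, 2, 3):
    print(level, replicated_pool([3000, 3000, 3000], level))
```

Under these assumptions, two-way replication halves the usable pool in exchange for surviving the loss of one unit, and three-way replication cuts it to a third in exchange for surviving two.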
While all of these advanced RAS features are important, the once-arcane construct of storage virtualization remains the heart and soul of SAN management services. That’s because computer operating systems were never designed to share storage devices. Every host system will assume exclusive ownership of any block-level device in the SAN that it discovers.
This would lead to total chaos were multiple systems in a SAN allowed to discover and mount the same volumes with read-and-write access. In that event, each system will have an incorrect view and file system log with respect to the disks’ data structure. In such a case, the systems have the potential to destroy each other’s data and corrupt the structure of the file system, rendering the device incapable of being mounted.
To avoid that situation, the Storage Server Console provides a means to virtualize the volumes presented to hosts via every access method supported by the DataFRAME system: Fibre Channel, iSCSI, and EBSD. The virtualization scheme in the Storage Server Console centers on the creation of Authorization Groups, which are defined by rules specific to each method of access (Fibre Channel, iSCSI, EBSD).
Creating virtualization rules is a relatively simple process; however, an administrator will need to use external utilities to gather all of the information necessary to formulate authorization rules for Fibre Channel and iSCSI access. The Storage Server Console GUI does not provide a means to browse for this information.
Formulating a rule for Fibre Channel access requires the worldwide port name (WWPN) of the port on the Fibre Channel HBA through which the logical disk volume will be accessed. This data is readily available using fabric and switch-monitoring utilities bundled with Fibre Channel switches. To create an iSCSI Authorization Group, administrators need to provide the ID of the hardware or software iSCSI initiator used by the host server or workstation. This data is reported by Microsoft’s iSNS (Internet Storage Name Service) server utility, which provides a list of IDs for both iSCSI targets (mountable storage volumes) and initiators (iSCSI-enabled workstations and servers) on the network.
For Linux systems, that iSCSI authorization scheme works well if a hardware initiator is in use. Unfortunately, finding a Linux system using a hardware iSCSI initiator is about as common as finding a first edition Gutenberg Bible in the attic. Most Linux systems access iSCSI volumes using a software initiator and a standard Ethernet NIC. That raises a problem for the virtualization scheme on the DataFRAME 420. Unlike Microsoft’s software initiator, the configuration scripts on a Linux system do not broadcast an initiator ID. This makes it impossible to virtualize iSCSI volumes for Linux hosts via the Storage Server Console.
There is a solution for this problem and it comes in the form of the DataFRAME’s EBSD protocol. MPC provides drivers for both Linux and Windows systems. Based on iSCSI, EBSD reportedly fixes a number of iSCSI problems, including issues with Multi-Path I/O (MPIO). More importantly, virtualization of EBSD volumes is done using the IP address of the host’s Ethernet NIC. Using static IP addresses on the hosts that will access EBSD storage, the virtualization of storage volumes is a very simple matter. In addition, the use of IP address ranges makes it very easy to provide read-only access to a number of systems in an Authorization Group.
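The three kinds of authorization rule can be summarized in a few lines of code. This is a sketch of the matching logic only, under the assumption that each rule pairs an access method with an identifier and an access level; the `Rule` class, the `allowed_access` function, and every WWPN, initiator name, and address shown are hypothetical and are not taken from the Storage Server Console.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    method: str  # "fc", "iscsi", or "ebsd"
    match: str   # WWPN, iSCSI initiator name, or IP address/network
    access: str  # "ro" (read-only) or "rw" (read-write)

def allowed_access(rules, method, host_id):
    """Return the access level of the first matching rule, or None."""
    for rule in rules:
        if rule.method != method:
            continue
        if method == "ebsd":
            # EBSD rules match on the host NIC's IP address; a network
            # such as 192.168.10.0/24 covers a whole range of hosts.
            if ipaddress.ip_address(host_id) in ipaddress.ip_network(rule.match):
                return rule.access
        elif rule.match.lower() == host_id.lower():
            # Fibre Channel matches on WWPN, iSCSI on the initiator name.
            return rule.access
    return None

rules = [
    Rule("fc", "21:00:00:e0:8b:05:05:04", "rw"),
    Rule("iscsi", "iqn.1991-05.com.microsoft:host1", "rw"),
    Rule("ebsd", "192.168.10.5", "rw"),       # one static address
    Rule("ebsd", "192.168.10.0/24", "ro"),    # read-only for the subnet
]

print(allowed_access(rules, "ebsd", "192.168.10.5"))   # rw
print(allowed_access(rules, "ebsd", "192.168.10.77"))  # ro
print(allowed_access(rules, "ebsd", "10.0.0.9"))       # None
```

Because the first matching rule wins in this sketch, the single-host read-write rule is listed ahead of the subnet-wide read-only rule.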
Both iSCSI and EBSD play well with the notion of extending the reach of a SAN out to the desktop. As with all consolidation projects, the burning issues for network edge devices are the efficient utilization of resources and business continuity. In particular, special attention is given to data security and recoverability. On all of these counts, desktop PCs along with special-purpose departmental servers have been very difficult for IT to encompass within consolidation projects.
That difficulty helps explain the lure of iSCSI. The simplicity of encapsulating SCSI commands and data in TCP/IP packets and transmitting them over Ethernet networks plays perfectly with the drive for greater IT efficiency. The minimal investment needed for the NICs, switches, and cables to set up a working Gigabit Ethernet fabric is a fraction of the cost associated with Fibre Channel components.
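To make the encapsulation concrete, the sketch below packs a SCSI READ(10) command into the 48-byte basic header segment of an iSCSI SCSI Command PDU, following the field layout in RFC 3720 as we understand it. It is a simplified illustration, not an initiator: it omits login, digests, and the TCP connection itself, it only handles LUNs below 256, and the function names and the values used in the example are our own.

```python
import struct

OPCODE_SCSI_COMMAND = 0x01  # iSCSI opcode for a SCSI Command PDU
FLAG_FINAL = 0x80           # F bit: last PDU of this command
FLAG_READ = 0x40            # R bit: data flows from target to initiator
ATTR_SIMPLE = 0x01          # simple task attribute
BLOCK_SIZE = 512            # assumed logical block size in bytes

def read10_cdb(lba, blocks):
    """Build a 10-byte SCSI READ(10) command descriptor block."""
    # opcode, flags, 4-byte LBA, group number, 2-byte length, control
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)

def scsi_command_pdu(lun, task_tag, cmd_sn, exp_stat_sn, cdb, expected_bytes):
    """Wrap a CDB in a 48-byte iSCSI basic header segment (no data segment)."""
    if not 0 <= lun < 256:
        raise ValueError("this sketch only encodes LUNs below 256")
    lun_field = bytes([0, lun]) + bytes(6)  # 8-byte LUN field
    header = struct.pack(
        ">BBHB3s8sIIII",
        OPCODE_SCSI_COMMAND,
        FLAG_FINAL | FLAG_READ | ATTR_SIMPLE,
        0,               # reserved
        0,               # TotalAHSLength: no additional header segments
        bytes(3),        # DataSegmentLength: a read carries no outbound data
        lun_field,
        task_tag,        # Initiator Task Tag
        expected_bytes,  # Expected Data Transfer Length
        cmd_sn,
        exp_stat_sn,
    )
    return header + cdb.ljust(16, b"\x00")  # CDB padded to 16 bytes

# An 8KB read (16 blocks of 512 bytes) starting at block 2048 on LUN 0.
cdb = read10_cdb(lba=2048, blocks=16)
pdu = scsi_command_pdu(0, task_tag=1, cmd_sn=1, exp_stat_sn=1,
                       cdb=cdb, expected_bytes=16 * BLOCK_SIZE)
print(len(pdu), pdu.hex())
```

The resulting bytes are what an initiator would write to an established TCP connection; everything below that point is ordinary Ethernet networking.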
What’s more, performance requirements are far less demanding from the desktop perspective. Single ATA drives dominate corporate desktops, while Serial ATA (SATA) is gaining momentum in high-end workstations and low-end servers. Moreover, Windows XP and Windows Server dominate the operating system environments for these systems. As a result, most I/O operations will involve small 8KB data blocks.
For desktop I/O, direct-attached ATA drives will deliver 8KB I/O requests at about 14MBps. This puts the throughput bar low enough to use Fast (100Mbps) Ethernet in conjunction with the DataFRAME 420, which we demonstrated by running our oblFilePerf benchmark on EBSD and iSCSI volumes.
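A quick calculation shows how these figures relate to the Fast Ethernet wire speed. The sketch assumes that 1MB means 1,000,000 bytes and 8KB means 8,192 bytes, and it ignores Ethernet, IP, and TCP framing overhead; the function names are our own.

```python
FAST_ETHERNET_BPS = 100_000_000  # 100Mbps line rate
IO_SIZE_BYTES = 8 * 1024         # typical 8KB Windows I/O request

def line_rate_mbytes(bits_per_second):
    """Raw line rate in MBps, before any protocol overhead."""
    return bits_per_second / 8 / 1_000_000

def requests_per_second(throughput_mbytes, io_size=IO_SIZE_BYTES):
    """I/O requests per second needed to sustain a given throughput."""
    return throughput_mbytes * 1_000_000 / io_size

wire = line_rate_mbytes(FAST_ETHERNET_BPS)
measured = 11.1     # MBps sustained on an EBSD volume in our test
desktop_ata = 14.0  # MBps for a direct-attached ATA drive at 8KB

print(f"Fast Ethernet ceiling: {wire:.1f}MBps")
print(f"Measured share of the wire: {measured / wire:.1%}")
print(f"8KB requests/s at {measured}MBps: {requests_per_second(measured):.0f}")
print(f"8KB requests/s at {desktop_ata}MBps: {requests_per_second(desktop_ata):.0f}")
```

By this arithmetic the raw Fast Ethernet ceiling is 12.5MBps, so the 11.1MBps we measured used roughly 89% of the wire and landed within about 20% of the 14MBps desktop drive figure.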
We next turned from Fast Ethernet to Gigabit Ethernet and Fibre Channel. Somewhat surprisingly, performance measured on a quad-processor HP DL580 G3 server running SuSE Linux Enterprise Server for EM64T systems exhibited little difference between iSCSI volumes mounted over Gigabit Ethernet and Fibre Channel volumes. Performance levels of iSCSI volumes were similar to the results measured using the VTrak 15200 Storage Array from Promise Technology (see “iSCSI vs. FC, Windows vs. Linux,” InfoStor, April 2005, p. 44).
We had again run into the performance limitations associated with using SATA drives to provide high-performance RAID throughput. A key problem with the use of older SATA drives is the absence of support for command queuing. Unlike SCSI drives, some SATA drives cannot re-order I/O commands to optimize movement of the actuator arm.
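The benefit of command queuing can be illustrated with a toy model. The sketch below compares total head travel when requests are serviced in arrival order with travel when they are re-ordered in a single up-then-down sweep (an elevator, or SCAN-style, schedule). It is a deliberate simplification: real drive firmware also accounts for rotational position, and the track numbers here are invented for the example.

```python
def seek_distance(start, requests):
    """Total head travel, in tracks, when servicing requests in the given order."""
    total, position = 0, start
    for track in requests:
        total += abs(track - position)
        position = track
    return total

def elevator_order(start, requests):
    """Re-order requests: sweep upward from the start position, then back down."""
    upward = sorted(t for t in requests if t >= start)
    downward = sorted((t for t in requests if t < start), reverse=True)
    return upward + downward

head = 50
queue = [95, 10, 60, 5, 80, 20]  # hypothetical track numbers, in arrival order

print(seek_distance(head, queue))                        # 370
print(seek_distance(head, elevator_order(head, queue)))  # 135
```

In this example, re-ordering cuts actuator travel by nearly two-thirds, which is the kind of gain a drive without command queuing cannot capture when many requests are outstanding.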
New higher-speed SATA II drives will help mitigate this performance problem for SATA-based arrays, which consistently come up short when compared to arrays that use Ultra320 SCSI or Fibre Channel drives. InfoStor Labs will be testing these drives in the near future in direct-attached and iSCSI SAN scenarios.
For high-performance SAN applications, Fibre Channel still holds an edge over Ethernet. Throughput is also heavily dependent on the underlying RAID infrastructure: Logical volumes created from arrays based on Fibre Channel or Ultra320 SCSI drives have a distinct advantage over logical volumes underpinned with SATA drives. Nonetheless, when it comes to storage consolidation, raw performance is not a driving issue.
The main issue in storage consolidation, as in server consolidation, is the efficient utilization of resources. For storage consolidation, SANs decouple storage resources from servers. That decoupling allows storage resources, which can be physically distributed, to be centrally managed as a virtual storage pool. As a result, SANs allow administrators to take advantage of more-robust RAS features for data protection and recovery, such as snapshots and replication. This critical functionality is entirely independent of SAN infrastructure, which principally affects performance.
Nonetheless, SAN infrastructure costs have historically presented a significant hurdle to SAN adoption and expansion. As a result, the benefits of SAN architectures have not spread beyond servers in computer centers. The key to changing this perception of SANs lies in Ethernet technology. With the functional capabilities of software like SAN/iQ from LeftHand Networks, which is at the heart of the MPC DataFRAME 420, SANs can now be easily pushed out to the desktop over existing Ethernet infrastructure. What’s more, low-cost SATA technology can be applied more efficiently in a SAN to provide equivalent performance at lower cost and with greater functionality.
Jack Fegreus is technology director at Strategic Communications (www.stratcomm.com). He can be reached at firstname.lastname@example.org.
InfoStor Labs scenario
Grid-oriented IP-SAN Storage Arrays
WHAT WE TESTED
Two MPC DataFRAME 420
- Linux OS
- Dual 3GHz Intel Xeon processors
- Up to 12GB ECC memory (cache)
- Dual Gigabit Ethernet IP-SAN ports
- Optional QLogic Fibre Channel HBA
- Fast Ethernet management port
- 16 Western Digital SATA drives
- Java-based Storage Server Console
- RAID Levels 0, 1/10, 5/50
HOW WE TESTED
- HP ProLiant DL580 G3 Server
- Four 3.3GHz Xeon EM64T CPUs
- SMP architecture with advanced memory protection
- 8GB 400MHz DDR-2 memory
- Online spare memory
- Hot-plug mirrored memory
- Hot-plug RAID memory
- SuSE Linux Enterprise Server 9 SP2
- AMD64/EM64T version
- Linux Kernel 2.6
- iSCSI support
- Gnu C 3.3
- Emulex LP 10000 Fibre Channel HBA
- PCI-X support for 133/100/64 MHz
- Full-duplex 2Gbps Fibre Channel
- Onboard hardware context cache for high-transaction performance
- nStor 4520 Storage System
- Two WahooXP RAID controllers
- 2Gb FC-AL interface
- 12 Seagate 15K FC Cheetah drives, 15,000rpm
- oblFilePerf v1.0
- oblWinDisk v3.0
- Maximum stripe size for RAID limited to 64KB
- Dynamic support for expanding and restructuring arrays
- No support for LUN virtualization
- Linux file throughput double that of Windows Server 2003 using a single four-drive RAID-0 array.