Using software for tasks ranging from RAID level processing to iSCSI traffic aggregation, Rasilient's RASTOR 7500 iSCSI disk array handles I/O streaming and database applications.
By Jack Fegreus
Rasilient Systems' RASTOR line of storage systems integrates custom-tuned Linux software and relies on that software to provide RAID, provision virtual disk volumes, and manage iSCSI connectivity. For enterprise-class networking, the software balances iSCSI traffic load and handles fail-over across multiple NICs, while simplifying client management by presenting a single IP address.
Rasilient's load-balancing software works at the iSCSI network layer using an iSCSI protocol redirect feature, which was created to handle target devices that were busy or undergoing maintenance. This technique is compatible with Microsoft's MPIO implementation on client initiators, for which Rasilient also provides support so that clients can establish multiple connections for fail-over redundancy.
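The redirect mechanism described above can be pictured with a small simulation. This is only a sketch under stated assumptions: the `Portal` and `RedirectingTarget` classes, the IP addresses, the IQNs, and the least-sessions selection rule are all hypothetical, since Rasilient's actual selection logic is not documented here. The status codes and the `TargetAddress` key are taken from the iSCSI login-redirection mechanism in RFC 3720.

```python
from dataclasses import dataclass, field


@dataclass
class Portal:
    """One data NIC on the array (addresses here are made up)."""
    address: str
    sessions: int = 0


@dataclass
class RedirectingTarget:
    """Hypothetical front end that answers every login on one public IP."""
    portals: list = field(default_factory=list)

    def login(self, initiator_iqn: str) -> dict:
        # Assumed policy: pick the data portal carrying the fewest sessions.
        portal = min(self.portals, key=lambda p: p.sessions)
        portal.sessions += 1
        # RFC 3720 login response: Status-Class 0x01 means "redirection",
        # Status-Detail 0x01 means "target moved temporarily"; the new
        # location travels in the TargetAddress text key.
        return {
            "initiator": initiator_iqn,
            "status_class": 0x01,
            "status_detail": 0x01,
            "TargetAddress": f"{portal.address}:3260",
        }


# Six data NICs behind one advertised address, twelve client logins.
target = RedirectingTarget([Portal(f"10.0.0.{n}") for n in range(11, 17)])
replies = [target.login(f"iqn.2008-01.example:host{i}") for i in range(12)]
per_portal = {p.address: p.sessions for p in target.portals}
print(per_portal)
```

In this model the client only ever needs to know one address; the redirect reply tells its initiator which physical NIC to use for the session.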
Relying on software allows Rasilient to easily mix-and-match hardware modules to tune system options. In particular, one of the RASTOR 7500 systems tested by openBench Labs used high rotational speed (15,000rpm) SAS disks to provide an optimal environment for applications oriented toward transaction processing, such as MS Exchange or Oracle. On the other hand, our second RASTOR 7500 came equipped with 24 700GB SATA drives for an optimal platform for digital content and Web 2.0 applications, for which large files and streaming I/O are defining characteristics.
SAN technology has long been the premier means of consolidating storage resources and streamlining management in data centers. For many mid-tier companies, the rapid adoption of Gigabit Ethernet has revived the notion of implementing a low-cost iSCSI SAN in their IT production environments. Meanwhile, many large enterprise sites are starting to employ iSCSI to extend provisioning and management savings to remote servers.
Also fueling growth in the iSCSI market is the adoption of server virtualization. In a VMware Virtual Infrastructure (VI) environment, storage virtualization is a much simpler proposition. The VMware ESX file system (VMFS) eliminates the issue of exclusive volume ownership. More importantly, an advanced VMware environment is dependent on shared storage. In particular, VMware can use shared iSCSI SAN storage to liberate IT operations from the limitations of backup windows via VMware Consolidated Backup (VCB) and to balance active virtual machine (VM) workloads among multiple ESX servers via VMware VMotion. This fundamental reliance of VI features on a shared storage infrastructure opens the door for iSCSI as the backbone of a cost-effective SAN.
In the lab
To assess the RASTOR 7500, openBench Labs set up two test scenarios that represent frequent storage issues at mid-tier businesses. In the first test, we focused on the I/O functionality and performance needed to support digital content creation, editing, and distribution. In the second test, we looked at a storage consolidation scenario underpinned by transaction-oriented database I/O.
In these test scenarios, openBench Labs utilized two RASTOR 7500 storage systems. Both storage servers featured a 2U single-controller design built on a dual-core AMD Athlon CPU. For higher availability and throughput, the RASTOR 7500 also supports a dual active-active controller configuration. In addition, each controller system had a secondary storage shelf with 12 more drives. Finally, for high-throughput SAN connectivity, each controller featured six Gigabit Ethernet NICs.
What distinguished the two arrays was the type of disk drive that populated each one. The first storage array had 24 Seagate SATA drives. The rotational speed of each SATA drive was 7,200 rpm and its unformatted capacity was just over 700GB. The other system housed 24 Seagate SAS drives. Each SAS drive had a rotational speed of 15,000 rpm and a capacity just over 146GB.
With software-based RAID and iSCSI load-balancing as core elements of the RASTOR value proposition, we also configured a 2Gbps Fibre Channel array that featured hardware-based RAID and 15,000rpm Fibre Channel drives as a baseline comparison. With 2Gbps devices the most prevalent infrastructure at sites with a Fibre Channel SAN in place, the nStor 4540 Storage Array served as an excellent base of comparison for an iSCSI array targeting mid-tier business sites with enterprise throughput requirements.
A system administrator configures a RASTOR 7500 via an intuitive Web-based interface. The first step in this process is to assign physical addresses to the disk arrays, as well as to each individual NIC that will be utilized in load-balanced iSCSI traffic.
More importantly, Rasilient's software load balances storage traffic using the iSCSI protocol rather than by teaming NICs in a Layer-2 networking scheme, which can suffer measurable overhead when reassembling out-of-order TCP segments. To avoid that overhead, many NIC teaming schemes put packets from the same TCP session on the same physical NIC, and that prevents "n" cards working as one logical card from providing "n" times the performance gain.
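The teaming limitation can be illustrated with a toy flow-hashing function. This is a minimal sketch, assuming a team that selects a NIC by hashing the TCP 4-tuple; the `team_member` helper, the CRC32 hash, and the addresses are illustrative choices and do not describe any particular vendor's teaming driver.

```python
import zlib


def team_member(src_ip, src_port, dst_ip, dst_port, nic_count):
    """Map a TCP 4-tuple to one NIC index, as a flow-hashing team might."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % nic_count


NICS = 6
GBPS_PER_NIC = 1.0

# One iSCSI connection is one TCP connection, hence one 4-tuple.
flow = ("192.168.1.20", 51000, "192.168.1.100", 3260)
chosen = {team_member(*flow, NICS) for _ in range(1000)}

# Every segment of the flow lands on the same NIC, so segment order is kept
# but the flow can never exceed one link's bandwidth.
single_flow_ceiling = len(chosen) * GBPS_PER_NIC
print(f"NICs used by one flow: {len(chosen)}; ceiling: {single_flow_ceiling} Gbps")
```

However many cards are in the team, a single session hashed this way sees the bandwidth of one card.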
On the other hand, Rasilient's explicitly storage-oriented approach to iSCSI load-balancing is highly focused on performance. Rasilient starts with support of jumbo frames for optimal network throughput. Nonetheless, it is the utilization of a storage protocol rather than TCP segments that sets the Rasilient load-balancing scheme apart from many competitors and makes it fully compatible with Microsoft's MPIO and Multiple Connections per Session (MCS).
In effect, Rasilient stripes iSCSI packets across all NICs for full load-balancing. In that way, multiple clients can utilize full gigabit-per-second throughput when connected to logical disks exported by the RASTOR 7500. What's more, by supporting active-active MPIO connections, Rasilient ensures high-availability iSCSI sessions for clients. As a result, enterprise-class clients can leverage advanced throughput and redundancy capabilities to maximize the benefits of an iSCSI SAN.
For this assessment, openBench Labs employed quad-core Dell PowerEdge 1900 servers running the 64-bit version of Windows Server 2003 R2 SP2 on the client side. In each test server, we also installed a dual-port QLogic 4052C iSCSI HBA to minimize overhead and maximize I/O throughput. With both a TCP and an iSCSI offload engine, the QLogic iSCSI HBA eliminated TCP packet processing and iSCSI protocol processing, which can be prodigious if enhanced data security is invoked on the iSCSI packets via header and data digest CRC calculations.
To enhance the resilience of our iSCSI SAN, we leveraged the RASTOR array's support of MPIO to invoke port fail-over on our iSCSI HBA. In particular, we used version 2.06 of the Microsoft iSCSI initiator in conjunction with the QLogic iSCSI HBA, which the Microsoft software initiator immediately recognized. With active-active connections and a round-robin fail-over policy—the default is an active-passive fail-over configuration—fail-over was instantaneous and we were not able to measure any degradation in throughput when a connection was physically disconnected.
From portable medical records to security surveillance, and even high-definition video, a torrent of new data sources continues to feed the burgeoning volume of data stored on disk. Video postproduction for standard-definition content has moved from tape (linear access) to disk (non-linear access) and in the process became a popular example of a Web 2.0 application. Moving video postproduction from tape to disk has fostered a growing market for non-linear editing (NLE) systems that need to support a number of key functions. Base NLE features include capturing digital content, editing—including special effects and graphics enhancement—and finished video rendering. As a result, any underlying storage server must be capable of supporting concurrent recording of broadcast material, modifying prerecorded data, and broadcasting presentations.
What's more, NLE systems have a natural affinity for a SAN. By handling media content as a digital file, an NLE system allows users to manage that content over its entire lifecycle. Moreover, any data lifecycle management process is enhanced by the presence of a data networking infrastructure. In particular, video operations stress traditional storage systems in terms of capacity and throughput: Two hours of uncompressed 1080i video will consume over a terabyte of disk storage and an NLE system will need to access data at a rate around 165MBps—greater than a single Gigabit Ethernet connection can deliver—to create it.
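The capacity and bandwidth figures quoted above follow from simple arithmetic, which the short calculation below reproduces. It assumes decimal units (1MB = 10^6 bytes, 1TB = 10^12 bytes) and treats a Gigabit Ethernet link as delivering its full 1Gbps line rate, which overstates real payload throughput.

```python
RATE_MB_PER_S = 165          # access rate quoted in the text
DURATION_S = 2 * 60 * 60     # two hours of video
GIGE_MB_PER_S = 1_000 / 8    # 1Gbps line rate in decimal MB/s, before overhead

total_tb = RATE_MB_PER_S * DURATION_S / 1_000_000
links_needed = -(-RATE_MB_PER_S // GIGE_MB_PER_S)   # ceiling division

print(f"{total_tb:.2f} TB for two hours; needs {links_needed:.0f} GbE links")
```

Two hours at 165MBps comes to roughly 1.19TB, and because 165MBps exceeds the 125MBps line rate of one link, at least two Gigabit Ethernet connections are required.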
For digital content that relies heavily on streaming large files, the RASTOR 7500 equipped with SATA drives is an excellent fit. Provisioning begins with selecting unused drives and placing them in a new RAID array. Once the disk group is created, an administrator can partition the disk group in order to create logical drives that will be presented to client hosts.
In addition, using the RASTOR 7500, administrators can control key storage characteristics, such as write and read-ahead caching policies, locally at the logical disk level rather than just globally for an entire array. As a result, two logical disks partitioned from the same RAID array can take on very different I/O throughput performance characteristics. With the RASTOR 7500, an administrator can take a much more fine-grained approach to storage optimization that can be application specific.
Once a virtual disk is created, the final step is to virtualize its ownership or, in the argot of the RASTOR GUI and of a growing number of other vendors' arrays, to "present disks to hosts." By default, a newly created logical disk is presented to every host, and each host is defined by the IQN of an iSCSI initiator on the iSCSI SAN. As a result, we needed to create two distinct host IDs for our Dell PowerEdge 1900 server: a separate identity was required for each of the two ports on the host's QLogic QLE4052 iSCSI HBA.
For an environment with physical clients running Windows, logical disk virtualization is vital as Windows desktop and server operating systems do not have a distributed file-locking scheme for sharing storage at the block level. In a VMware Virtual Infrastructure environment, however, host sharing of SAN volumes is vital to advanced functions such as VMotion and VMware Consolidated Backup.
In our initial benchmarking tests, openBench Labs concentrated on assessing the performance of the RASTOR 7500 in a digital content scenario, which now extends to video surveillance. In these tests, the primary issue for accelerating application throughput is the streaming of sequential reads and writes. To a lesser—but growing—degree, however, streaming media applications associated with Web 2.0 initiatives are also dependent on random data access to support such functions as non-linear editing (NLE).
We began by examining streaming read and write I/O performance to a single 25GB logical drive backed by a physical RAID-0 array. In this set of tests, the physical arrays were resident on RASTOR 7500 arrays with SAS and SATA drives, as well as an nStor 2Gbps Fibre Channel array, which utilizes hardware-based RAID.
With software-based RAID on the RASTOR 7500, I/O on logical disks exported by the array was much more characteristic of Linux, which the array was running, than Windows Server 2003, which our host was running. The key difference when streaming sequential I/O is that Linux attempts to bundle all small I/O requests into 128KB blocks. When streaming I/O for large files, that results in a rapid convergence of throughput to the maximum level sustainable by the I/O subsystem. This is particularly important for applications on Windows, which often make 8KB requests. In contrast, storage arrays that rely on hardware-based RAID, such as the Fibre Channel-based nStor 4540, pass the Windows I/O requests directly to the hardware for fast response without changing the characteristics of the I/O requests.
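The effect of bundling small sequential requests can be shown with a simplified model. The `coalesce` function below is a hypothetical illustration, not the Linux block layer's actual merging logic: it assumes requests arrive in order as (offset, length) pairs and merges contiguous ones up to the 128KB block size mentioned above.

```python
def coalesce(requests, max_bytes=128 * 1024):
    """Merge contiguous (offset, length) requests into blocks of at most max_bytes."""
    merged = []
    for offset, length in requests:
        if merged:
            last_offset, last_length = merged[-1]
            contiguous = last_offset + last_length == offset
            if contiguous and last_length + length <= max_bytes:
                merged[-1] = (last_offset, last_length + length)
                continue
        merged.append((offset, length))
    return merged


KB = 1024
# 1MB of sequential 8KB requests, as a Windows application might issue them.
stream = [(i * 8 * KB, 8 * KB) for i in range(128)]
blocks = coalesce(stream)
print(len(stream), "requests ->", len(blocks), "blocks of", blocks[0][1] // KB, "KB")
```

In this model, 128 sequential 8KB requests reach the disks as eight 128KB transfers, while non-contiguous requests pass through unmerged, which is why the benefit shows up in streaming tests rather than random-access tests.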
In our tests with the RASTOR 7500, openBench Labs measured both 8KB reads and writes in excess of 100MBps using both the SATA- and SAS-provisioned arrays. That performance level pegged streaming small-block I/O on the iSCSI array as having a 50% advantage over the 2Gbps Fibre Channel array. Even by forcing a conservative write-through caching policy, we also measured little difference in the performance of writes with differing RAID levels. More importantly, using a normal default for safe write-back caching obliterated any measurable differences.
We configured our Windows Server 2003 host for high throughput of iSCSI data and high availability of iSCSI sessions. For high throughput we enabled jumbo frame support on the RASTOR 7500. In addition to the storage array and client NIC, all switches between the two devices must also support jumbo frames. For high availability via the automatic fail-over of iSCSI sessions, we enabled MPIO support on the RASTOR array. MPIO allows clients to establish multiple active connections to logical disks exported by the storage array without the client OS interpreting each connection as an independent logical disk.
To leverage MPIO on the RASTOR 7500, we needed to provide MPIO support on our Dell PowerEdge 1900 server. To implement MPIO on our server, we utilized version 2.0.6 of the Microsoft iSCSI initiator in conjunction with the two ports on our QLogic QLE4052 iSCSI HBA. This configuration allowed us to set up dual active-active connections to each logical drive exported by the RASTOR 7500.
It's important to note that back-end MPIO support on the RASTOR array provides network load-balancing of connections for iSCSI sessions from client systems. With MPIO, a specific iSCSI session handles all of the I/O to a logical disk at any particular instant. As a result, the throughput for any particular benchmark instance was limited to the throughput of a single 1Gbps connection. Configuring active-active connections for iSCSI sessions ensures automatic fail-over will prevent iSCSI sessions from being interrupted when connections are interrupted.
To test the effect that the RASTOR's back-end NIC striping scheme has on throughput scalability for hosts, we needed to set up multiple iSCSI sessions to multiple logical disks. With each iSCSI session tied to a port on the server's iSCSI HBA, read or write throughput on each iSCSI volume was limited to 125MBps. As a result, scalability would only be evidenced in total throughput to multiple drives. True to form, I/O continued to exhibit Linux characteristics with multiple drives, as writes provided the best I/O scaling. Cumulative write throughput to two logical disks reached 230MBps with SAS drives. Furthermore, small 8KB I/O again provided nearly the same throughput as 32KB and 64KB accesses.
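The scaling result above can be put in context with a quick calculation. It assumes decimal units and compares the measured aggregate against the raw line rate of two 1Gbps ports, ignoring Ethernet, TCP, and iSCSI protocol overhead.

```python
LINK_GBPS = 1
per_session_ceiling = LINK_GBPS * 1_000 / 8      # 125 MB/s per 1Gbps port
sessions = 2
measured_total = 230                              # MB/s, SAS writes, from the text

aggregate_ceiling = per_session_ceiling * sessions
efficiency = measured_total / aggregate_ceiling

print(f"ceiling {aggregate_ceiling:.0f} MB/s, measured {measured_total} MB/s "
      f"({efficiency:.0%} of line rate)")
```

Against a 250MBps ceiling for two sessions, 230MBps of cumulative write throughput corresponds to about 92% of line rate.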
For streaming digital content applications, storage capacity and throughput go hand-in-hand as the primary concerns. On the other hand, applications built on Oracle or SQL Server typically generate large numbers of I/O operations that transfer data using small block sizes from a multitude of locations dispersed randomly across a logical disk. In such a scenario, the spotlight is on fast disk access to maximize processing large numbers of I/Os per second (IOPS). Applications that rely at least in part on transaction processing, such as SAP and Microsoft Exchange, put a premium on the minimization of I/O latency through data caching and high-speed disk rotation.
In many SMB transaction-processing applications, the number of processes involved in making transactions is often limited to a few proxies. Microsoft Exchange provides an excellent example of such a transaction-processing scheme. Exchange utilizes a JET b-tree database structure as the main mailbox repository. An Exchange store and retrieve process, dubbed the Extensible Storage Engine (ESE), takes transactions passed to it, creates indexes, and accesses records within the database.
To assess potential RASTOR 7500 performance in such SMB transaction-processing scenarios, we ran Intel's IOmeter benchmark. With IOmeter, we were able to control the number of worker processes making I/O read-or-write transaction requests and tune those processes to limit the number of outstanding I/O requests—the I/O queue length. In particular, we utilized one process and varied the I/O queue length from 1 to 30 outstanding requests. We then tested these conditions on various database sizes.
During each benchmark test we recorded the number of IOPS processed and the average response time for each I/O. Using small I/O request sizes (we utilized 8KB reads and writes in all of our tests), IOmeter stresses data access far more than it stresses data throughput. For comparison, we ran the IOmeter tests using volumes exported from the nStor 4540 Fibre Channel array and the RASTOR 7500 SAS array.
To analyze I/O performance, we plotted the average number of IOPS as a function of the outstanding I/O queue depth. In that context, archetypal IOPS performance follows a distinct pattern: As the number of outstanding I/O requests begins to increase, the IOPS completion rate increases by an order of magnitude. Continuing to increase the number of outstanding I/O requests, however, leads to an inflection point in IOPS completions. At that point, the scalability of the I/O subsystem breaks and additional outstanding I/O requests begin to overwhelm the I/O subsystem with overhead; the rate at which I/Os complete flattens; and the average response time for an I/O request begins to grow dramatically.
In particular, when openBench Labs tested a 1GB file on the nStor Fibre Channel array, we needed to allow the IOmeter worker process to have 15 outstanding I/O requests in order to reach a transaction completion rate of approximately 2,000 IOPS for read requests. On the other hand, the RASTOR array was able to process random reads from a 1GB file almost entirely from cache. With an outstanding I/O queue length of just five outstanding I/Os, the RASTOR 7500 delivered an IOPS completion rate of 20,000 IOPS. With 10 outstanding I/O requests, the RASTOR 7500 was completing an average of 28,000 IOPS.
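The relationship between queue depth, IOPS, and response time in these results can be estimated with Little's Law (outstanding requests = completion rate × response time). The sketch below applies it to the figures quoted above. It assumes the IOmeter worker keeps the queue full at all times, so the response times it prints are derived estimates, not measurements reported in this review.

```python
def mean_response_ms(outstanding, iops):
    """Little's Law: outstanding = iops * response_time, solved for time in ms."""
    return outstanding / iops * 1_000


# (array, queue depth, approximate IOPS) as reported for 1GB random reads.
samples = [
    ("nStor 4540", 15, 2_000),
    ("RASTOR 7500", 5, 20_000),
    ("RASTOR 7500", 10, 28_000),
]

for array, depth, iops in samples:
    print(f"{array:12} depth {depth:2}: ~{mean_response_ms(depth, iops):.2f} ms per I/O")
```

The same formula explains the inflection point: once IOPS stops rising, every additional outstanding request translates directly into a longer average response time.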
That extraordinary cache advantage on reads seemingly disappeared when we employed a mix of I/O read-and-write requests: 80% read and 20% write I/O transactions. In that test, I/O completion rates using a logical drive exported by the iSCSI RASTOR 7500 and a logical drive exported from the 2Gbps Fibre Channel nStor 4540 were statistically identical. Using both the iSCSI RASTOR and Fibre Channel nStor arrays, we approached 2,000 IOPS with an I/O queue depth of 15 outstanding requests. Nonetheless, caching on the RASTOR array was still playing a role as the average read response time was 20% lower on the RASTOR with an I/O queue depth of 15.
With the reliability of a Patek Philippe Grand Complications Chronograph, the RASTOR 7500 was able to continue to minimize I/O latency by maximizing cache hits even as we expanded the size of the target file well beyond that of the system's total cache size. With an I/O transaction mix of 80% read and 20% write requests, reads on a logical drive exported by the RASTOR 7500 remained measurably faster.
As a result, the rate at which all I/O requests were completed by the RASTOR array continued to increase well after I/O processing on the nStor array was saturated. In all of our IOmeter tests, the RASTOR 7500 was able to use its aggressive caching scheme to boost the processing of I/O read transactions, even in a mixed I/O environment. This performance profile makes the RASTOR 7500 especially valuable in the context of IT operations at an SMB site.
Jack Fegreus is CTO of openbench.com.
OpenBench Labs Scenario
iSCSI storage server
WHAT WE TESTED
Rasilient RASTOR 7500 Storage Server
- Logical volume management services
- Web-based GUI for storage provisioning
- Load-balancing based on the iSCSI redirect function
- Back-end support for clients implementing Microsoft
HOW WE TESTED
- Windows Server 2003 R2 SP2
- QLogic QLE4052 iSCSI HBA
- nStor 4540 disk array
- 3,500 IOPS benchmark throughput (8KB requests)
- 130MBps benchmark throughput per iSCSI session (1Gbps connections)
- MPIO Session Management for clients running MS Initiator v2.0.6