LTO loader provides impressive performance

Posted on February 01, 2006


Tandberg Data integrates its half-height LTO2 tape drive with clever robotics to solve SMB and remote-site backup woes.

By Jack Fegreus

Targeting tape automation for small to medium-sized business (SMB) sites and remote applications, Tandberg Data introduced the first 1U autoloader based on an LTO2 drive. To minimize rack space, the StorageLoader LTO2 uses Tandberg’s half-height 420LTO drive and features a simplified robotics and cartridge-loading mechanism. Dubbed a “loader” because it has a single drive, the StorageLoader LTO2 has all of the capabilities of a large tape library, including random access to tapes, bar-code identification of media, and a Web-based management interface.

To examine the potential performance of Tandberg’s StorageLoader LTO2 in an SMB environment, openBench Labs installed the tape loader, which features an Ultra160 SCSI interface, on a typical SMB midrange server: an HP ML350 G3 running both SuSE Linux and Windows Server 2003. In addition, we tested the performance of the system’s 420LTO drive against two closely competitive drives: Quantum’s half-height CL400H LTO2 drive and entry-level SDLT 320 “super drive.” Both of those drives are used in different models of Quantum’s 2U SuperLoader.

Since many library designs allow the interchange of a number of different drives, loaders and libraries are often distinguished by tape drive characteristics, such as transfer rate and cartridge capacity. In particular, drive performance within the SMB arena is distinctly stratified: DAT dominates the low end, DLT (now dubbed DLT-V) claims the midrange, and DLT-S (a.k.a. SDLT) and LTO dominate the high end. DLT and LTO technologies represent the lion’s share of drives used in libraries and loaders.

To determine the Tandberg 420LTO’s probable upper and lower bounds for throughput during backup-and-restore operations, we ran our oblTape benchmark on Linux to minimize system overhead and focus directly on drive performance. In particular, when benchmarking throughput parameters, we streamed data to tape in large 256KB blocks. The data written to these blocks was generated as either compressible (2:1) or non-compressible data. We then ran benchmarks in which the percentages of compressible and non-compressible data blocks were varied.
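The sketch below shows one way such a test pattern might be generated. oblTape’s actual data generator is not published, so the function names and the half-random/half-repeated construction used to approximate 2:1 compressibility are purely illustrative.

    import os

    BLOCK_SIZE = 256 * 1024  # 256KB blocks, matching the benchmark's record size

    def compressible_block() -> bytes:
        # Half random bytes, half a repeated byte: an LZ-style compressor
        # squeezes the repeated half to almost nothing, giving roughly 2:1.
        half = BLOCK_SIZE // 2
        return os.urandom(half) + b"\x00" * half

    def incompressible_block() -> bytes:
        # Pure random data defeats compression entirely.
        return os.urandom(BLOCK_SIZE)

    def test_stream(total_blocks: int, pct_compressible: int):
        # Yield a stream in which a given percentage of blocks is compressible.
        cutoff = total_blocks * pct_compressible // 100
        for i in range(total_blocks):
            yield compressible_block() if i < cutoff else incompressible_block()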


Plotting raw data throughput measured with our benchmark (left) gives a macro view of relative performance. Normalizing the data (right) reveals distinct differences in drive characteristics between the half-height LTO2 and SDLT 320 drives.

The difference between the upper and lower bounds on throughput can be about 3:1. That range is greater than the 2:1 compression range because compression schemes add metadata about compressibility to the original data. When the original data cannot be compressed, that metadata is pure overhead. As a result, more bits are written to tape than are contained in the original data, and the perceived throughput rate (calculated using the number of bits read rather than the number of bits written) is less than the native throughput specification.
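A back-of-the-envelope calculation shows how that expansion widens the range. The worst-case expansion factor below is a hypothetical value chosen only to illustrate how the spread can reach roughly 3:1; the native rate is the figure measured later in this article.

    # Illustrative arithmetic only; the expansion factor is hypothetical.
    native_rate = 21.3                 # MBps, measured native throughput (see below)
    compression_gain = 2.0             # fully compressible data roughly doubles throughput
    worst_case_expansion = 1.5         # hypothetical expansion on non-compressible data

    upper_bound = native_rate * compression_gain       # ~42.6 MBps
    lower_bound = native_rate / worst_case_expansion   # ~14.2 MBps
    print(round(upper_bound / lower_bound, 1))          # ~3.0, i.e., about 3:1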

That greater variation is also reflected in real-world backup performance. The throughput observed when writing data to tape is highly dependent on the characteristics of the data being sent to the drive and the ability of the drive’s electronics to handle fluctuations in data compressibility. During normal backup operations, differences in data compressibility from file to file make it more difficult to keep the drive’s buffer full, which in turn makes the drive prone to halting. When a drive halts, it must reposition the tape, which has been moving at speeds of up to 166 inches per second, before it can resume writing. Time lost while repositioning the tape can rapidly add up and dramatically lower the average throughput rate of a backup operation.

We verified the results of our synthetic benchmarks by running backup-and-restore tests on Windows and Linux with Computer Associates’ BrightStor ARCserve Backup r11.5, BakBone’s NetVault v7.1, and Symantec’s Backup Exec for Windows Servers v10.0. Using a 10GB data set, which was large enough to provide consistent, statistically valid results, openBench Labs ran a series of backup-and-restore operations. Our test data contained a mix of Microsoft Office files along with a mix of HTML and image files from Websites.

Along with compression, disk I/O is another key factor in keeping tape drives streaming. To ensure disk I/O throughput never impinges on a backup process, the source drive should be capable of sustaining an I/O rate that is at least 2x to 3x greater than the tape drive’s native throughput rate. To avoid any possibility of a disk bottleneck in our tests, we stored all backup-test data on a SAN-based RAID-0 array built using 15,000rpm FC drives from Seagate. Using our oblFileIO benchmark to read the same directory used for the backup tests, we pegged average file-based I/O throughput at about 80MBps.
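Sites that want to make the same check without a formal benchmark can get a usable estimate from a simple sequential read of the backup source. The routine below is a simplified stand-in for that measurement, not oblFileIO itself, and the path in the usage comment is illustrative.

    import os
    import time

    def directory_read_rate(path: str, chunk: int = 1024 * 1024) -> float:
        # Sequentially read every file under `path` and return throughput in MBps.
        total_bytes = 0
        start = time.time()
        for root, _dirs, files in os.walk(path):
            for name in files:
                with open(os.path.join(root, name), "rb") as f:
                    while True:
                        data = f.read(chunk)
                        if not data:
                            break
                        total_bytes += len(data)
        return total_bytes / (1024 * 1024) / (time.time() - start)

    # Rule of thumb: the source should sustain 2x to 3x the drive's native rate.
    # rate = directory_read_rate("/backup/source")
    # print("OK for streaming" if rate >= 2 * 21.3 else "Potential disk bottleneck")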

Tests with SATA-based RAID arrays, which are very prevalent at SMB sites, typically provided lower average throughput rates of about 50MBps. At that rate, any SMB site using SATA-based arrays should be able to take full advantage of the StorageLoader LTO2 and its half-height LTO2 drive. The I/O rate of SATA-based storage, however, precludes the possibility of achieving significantly greater benefits by using faster full-height LTO2 or LTO3 drives.


The size of the tape’s formatted data blocks directly affects native throughput. Using small blocks increases the number of interruptions in the flow of data and forces the drive to slow down. A number of backup packages that run on both Windows and Linux have a fixed block size of 64KB, the default maximum I/O size for Windows. A data block size of 128KB, however, can boost data throughput by 4% to 5%.

With the oblTape benchmark, we measured each drive’s throughput while it handled the full gamut of probable data, from completely non-compressible to highly compressible. We also plotted the mean throughput for our 10GB backup-and-restore tests along the benchmark’s throughput curve to determine the overall compression rate that each drive was able to maintain during backup operations.

Both of the half-height LTO2 drives provided the best level of compression performance (either higher data compression or less metadata expansion) over the full range of our benchmark data spectrum. The SDLT 320 drive began to expand the amount of data being written to tape as soon as the percentage of compressible data dropped to 30% of the test stream; Tandberg’s 420LTO drive, by contrast, only expanded the amount of data written to tape when compressible data represented less than 10% of the stream. This superior handling by the LTO2 drives is the result of technical features, including a variable-speed tape transport and the implementation of Adaptive Lossless Data Compression (ALDC), which uses two compression schemes.

Streaming tape at varying speeds counters buffer overflow and underflow conditions, which cause the drive to halt and reposition the tape. Without a fixed tape speed, the notion of a native data-transfer rate becomes at best a moving target and at worst a theoretical construct. With tape speed variable, uncompressed throughput becomes dependent on the flow of data, which makes the size of data blocks a significant factor.

To determine optimal effective native throughput, we ran oblTape with hardware compression turned off and data blocks formatted in a range from 8KB to 256KB. Those results pegged the highest effective native throughput at 21.3MBps for Tandberg’s 420LTO drive, which was 20% higher than what we measured on the Quantum half-height LTO2 drive. Uncompressed data throughput on both LTO2 drives varied from published theoretical specifications for native throughput by about 10%.
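A rough equivalent of that measurement can be made with nothing more than the Linux st driver. The sketch below assumes a no-rewind device at /dev/nst0 in variable-block mode, with hardware compression already switched off (for example, with mt -f /dev/nst0 compression 0); it is a simplified stand-in for oblTape, not the benchmark itself.

    import os
    import time

    TAPE = "/dev/nst0"  # no-rewind SCSI tape device; adjust for your system

    def write_rate(block_size: int, blocks: int) -> float:
        # Write fixed-size records to tape and return throughput in MBps.
        # In variable-block mode (mt setblk 0), each write() becomes one record.
        record = os.urandom(block_size)
        fd = os.open(TAPE, os.O_WRONLY)
        start = time.time()
        for _ in range(blocks):
            os.write(fd, record)
        os.close(fd)  # closing writes a filemark
        return block_size * blocks / (1024 * 1024) / (time.time() - start)

    # for kb in (8, 64, 128, 256):
    #     print(kb, "KB blocks:", round(write_rate(kb * 1024, 4000), 1), "MBps")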

For better compression performance, both LTO2 drives first run each data record through the LZ1 compression algorithm and place the result in a history buffer. The drive then compares the size of the compressed record in the history buffer to the original record; if the compressed record is larger, the original record is written to tape instead. This makes the drive’s native throughput the performance baseline. In comparison, throughput on the SDLT 320 degrades significantly when the data is already compressed and cannot be compressed further.
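The per-record decision can be sketched in a few lines of host-side code. zlib stands in here for the LZ1 engine, which in practice runs in the drive’s firmware, so the example only illustrates the logic, not the actual implementation.

    import zlib

    def encode_record(record: bytes) -> tuple[str, bytes]:
        # Scheme 1: try an LZ-style compression pass (zlib stands in for LZ1).
        compressed = zlib.compress(record)
        if len(compressed) < len(record):
            return "compressed", compressed
        # Scheme 2: the compressed copy grew, so pass the original through unchanged.
        return "pass-through", record

    # Random (non-compressible) records come back "pass-through", which is why
    # the worst case is roughly native throughput rather than data expansion.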

Compression efficiency is an important factor for real-world backup performance. Mail, database, and graphics applications frequently provide some level of data compression within the application to help manage disk space. As a result, data files associated with these critical sources of business information exhibit lower throughput rates during a backup and depress the overall throughput rate.

Our standard 10GB backup test set includes a mix of Microsoft Outlook, Access, JPEG, and GIF files. Running BakBone’s NetVault v7.1 with tapes formatted in 128KB data blocks, Tandberg’s 420LTO provided measurably better overall performance than Quantum’s SDLT 320. While the native throughput rate of the 420LTO is 36% greater than that of the SDLT 320, our tests pegged its average backup throughput at 60% greater. Thanks to intelligent data compression and variable-speed firmware features, that throughput advantage was about 67% larger than the advantage that would be expected if both drives compressed data with equal efficiency.
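The arithmetic behind that last figure is straightforward; the numbers below simply restate the measurements above.

    native_advantage = 0.36   # 420LTO native rate is 36% higher than the SDLT 320's
    backup_advantage = 0.60   # measured backup throughput was 60% higher
    print(round(backup_advantage / native_advantage, 2))  # ~1.67: the observed
    # advantage is roughly 67% larger than native specs alone would predict.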


During a backup, disk I/O paralleled the changes in tape speeds as drive firmware balanced disk reads with tape writes. When backing up e-mail folders compacted by Outlook, throughput slowed to about 35MBps (a compression ratio of about 1.5:1). Sparse matrix files from a database were compressed at better than 4:1; to keep the tape drive’s buffers full and prevent the 420LTO’s firmware from triggering the Automatic Variable Transfer Rate feature to slow tape speed, this data needed to be read from disk at about 88MBps.

While a library’s characteristics are often defined in terms of its drive’s features, its price is more reflective of the cost of the unit’s robotics. In the case of Tandberg’s 1U loader, the robotics mechanism is remarkably simple. The picker travels along a single axis running the length of the loader and rotates 90° to the left or right to access media loaded in two four-cartridge magazines positioned along both sides of the loader. The 420LTO drive sits at the end of the track at the back of the StorageLoader.

The comparison of libraries is further complicated by the tight coupling of features with software. Library functionality depends on software just as much as it does on the robotics hardware. For most sites, this software will come as embedded Web/Java utilities on the device along with library access modules that are integrated into the backup software package.

Tandberg’s StorageLoader features a Remote Management Web Interface (RMI) that enables an operator to configure, test, and manage the loader using a Web browser. Via the RMI, administrators can perform most of the operations that can be done through the loader’s front control panel.


Enterprise backup applications such as CA’s BrightStor ARCserve support the functionality of Tandberg’s StorageLoader LTO2. As a result, all of the scheduling capabilities of the backup application can be used to automate a full GFS backup scenario.

For most sites, however, the principal issue will be automating a reliable plan for performing unattended backups. Within the context of a typical grandfather-father-son (GFS) backup rotation scheme, complex storage solutions often present daunting problems for a small business with limited IT expertise.

A key factor for any site will be the amount of time needed to complete a backup process. The issue of a backup window becomes more important in 24x7 operations. There are numerous workable solutions, but many of them involve complex storage schemes and constructs such as snapshots and data mirrors. As a result, the ability to integrate a low-cost tape loader with an automated backup schedule created within a standard package becomes a marketable offering for IT consultants and solution providers with SMB practices. The same is true for any large IT site with remote offices or colocation sites to back up and service.

Easing that task, the library management modules of all of the major backup applications for Linux and Windows recognize the autoloader’s robotic capabilities. As a result, the scheduling facilities of an enterprise backup application can be used to automate a GFS backup routine using the StorageLoader LTO. From an operations perspective, such a scheme can be extended to the point where an administrator need only rotate media magazines once a week.
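To make the rotation concrete, the sketch below maps calendar days onto the eight slots of the loader’s two magazines for a simple GFS plan. The slot assignments and schedule are illustrative only; real media-pool management is handled inside the backup application, not by scripts like this.

    from datetime import date

    def gfs_slot(day: date) -> tuple[str, int]:
        # Slots 1-4: daily (son) tapes, Monday through Thursday, reused weekly.
        # Slots 5-7: weekly (father) full backups on the first three Fridays.
        # Slot 8:    monthly (grandfather) full backup on the last Friday.
        weekday = day.weekday()            # Monday == 0
        if weekday < 4:
            return "daily", weekday + 1
        if weekday == 4:
            week_of_month = (day.day - 1) // 7 + 1
            return ("monthly", 8) if week_of_month >= 4 else ("weekly", 4 + week_of_month)
        return "none", 0                   # no job scheduled on weekends in this sketch

    # Example: gfs_slot(date(2006, 2, 24)) -> ("monthly", 8)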

Jack Fegreus is technology director at Strategic Communications (www.stratcomm.com). He can be reached at jfegreus@stratcomm.com.


openBench Labs scenario

UNDER EXAMINATION

1U autoloader with half-height LTO2 drive

WHAT WE TESTED

Tandberg Data StorageLoader LTO2

  • Tandberg 420LTO tape drive
  • 1U form factor
  • Ultra160 SCSI interface
  • Two 4-cartridge LTO2 magazines for 3.2TB of compressed (2:1) data

HOW WE TESTED

  • HP ProLiant ML350 G3 server
  • LSI Logic 22320 PCI-X Ultra320 SCSI HBA
    • Supports 256KB data transfers
  • nStor 4520 Storage System
    • Seagate 15,000rpm Fibre Channel drives
    • 4-drive RAID-0 arrays
  • Quantum SDLT 320 tape drive
  • Quantum CL400H tape drive
  • CA BrightStor ARCserve Backup r11.5
  • BakBone NetVault v7.1
  • Symantec Backup Exec for Windows Servers v10.0
  • Microsoft Windows Server 2003
  • SuSE Linux Professional 10.0

BENCHMARKS

  • oblTape v2.5
  • oblFileLoad 1.0

KEY FINDINGS

  • Benchmarks pegged native throughput for Tandberg’s 420LTO at 21.3MBps.
  • Tape should be formatted with 128KB blocks for optimal throughput.
  • Full control of the StorageLoader LTO2’s robotics is provided through a Web interface.
  • Robotics recognized by all major backup applications.

