Basic Tips for Evaluating Tape Libraries
If you haven't undertaken the arduous task of evaluating large-scale tape libraries, here are some tips to get you started.
By Rod Wideman
Managers of information storage networks and departments are constantly making important decisions about how to handle critical company data. One such decision represents a sizable investment--the purchase of an automated tape library. This article explores evaluation criteria and suggests some features and functions to look for when selecting the system that's right for your enterprise.
The first step is to select software and drive technology. Fortunately, most backup, archive, and hierarchical storage management software already supports a broad range of tape libraries and drive types. On the other hand, most tape libraries support only one or two types of drive technology, thereby narrowing your options significantly.
Once you have selected the drive technology, you'll be able to put together a relatively short list of library candidates. But how do these libraries compare?
One of a library's more obvious features is its capacity (number of cartridges and number of drives). These two numbers play an important role in calculating a library's perceived value, efficiency, and performance.
The total number of cartridges equates to total data capacity, which is usually expressed in gigabytes or terabytes. By dividing the cost of the library by its total data capacity, you get the cost per MB. For example, a library that holds one hundred 10GB cartridges has a total capacity of 1TB, or 1,000,000MB. If the library costs $50,000, then the per MB cost is $0.05. But what does this $50,000 price tag actually represent? The cost of just the library? Or does it include drive, media, service, and support costs? The key is to use the same base cost when comparing libraries.
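The cost-per-MB arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, decimal units (1GB = 1,000MB) are assumed to match the example, and the $30,000 figure for drives and media is invented purely to show how a different base cost changes the result.

```python
def cost_per_mb(base_cost_usd, cartridges, gb_per_cartridge, mb_per_gb=1000):
    """Return dollars per MB of total library capacity (decimal units assumed)."""
    total_mb = cartridges * gb_per_cartridge * mb_per_gb
    return base_cost_usd / total_mb

# The article's example: one hundred 10GB cartridges in a $50,000 library.
library_only = cost_per_mb(50_000, 100, 10)
print(f"Library only: ${library_only:.2f} per MB")

# Hypothetical: the same library with $30,000 of drives and media added.
# The per-MB figure changes, so both quotes must use the same base cost.
all_in = cost_per_mb(50_000 + 30_000, 100, 10)
print(f"With drives and media (assumed): ${all_in:.2f} per MB")
```

Comparing a "library only" figure from one vendor with an "all in" figure from another would make the first look cheaper than it is.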
The number of cartridges is also used to indicate storage density, or how much data can be stored in a given physical space. Storage density is typically described in MB per square foot or per square meter. If floor space is a concern, this parameter may be important. Also important is the cartridge-to-drive ratio. An improper ratio of cartridges per drive can result in performance bottlenecks.
A library's drive capacity is often translated into a value that represents the aggregate data transfer rate. The more drives a library supports, the more data can be transferred simultaneously, provided connectivity capabilities do not impede the transfer. For example, if the drive technology provides a data transfer rate of 3MBps, a 10-drive library would boast an aggregate data transfer rate of 30MBps, or 108GB per hour. Of course, this rate assumes the drives are running in parallel on separate data paths under optimal conditions.
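As a rough sketch of the aggregate-rate calculation, assuming every drive streams at its full rated speed in parallel and using decimal units (1GB = 1,000MB), the function below (a hypothetical name) reproduces the 10-drive example:

```python
def aggregate_rate(drives, mbps_per_drive):
    """Return (aggregate MBps, aggregate GB per hour) for drives running in parallel."""
    mbps = drives * mbps_per_drive
    gb_per_hour = mbps * 3600 / 1000  # 3,600 seconds per hour, 1,000MB per GB
    return mbps, gb_per_hour

mbps, gb_per_hour = aggregate_rate(drives=10, mbps_per_drive=3)
print(f"{mbps} MBps, {gb_per_hour:.0f} GB per hour")
```

Treat the result as a ceiling: shared data paths or a slow host will pull the real figure below it.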
When evaluating capacity or performance, however, beware of data compression claims. Library vendors often state compressed (typically 2:1) capacities and transfer rates. These claims can be misleading, because not all compression algorithms are the same and user results typically vary from the stated average. To compare capacity and performance, use native (or uncompressed) specs. To use the previous example, if the 10GB cartridge capacity is a compressed (2:1 ratio) figure, each cartridge holds only 5GB of native data, and the actual cost per MB would be closer to $0.10/MB, not $0.05/MB. The bottom line: Always use a consistent baseline for comparisons.
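One way to put vendor figures on a common baseline is to divide any compressed specification by the stated compression ratio before comparing. The sketch below assumes the 2:1 ratio and the $50,000, 100-cartridge example used above; the helper name is hypothetical.

```python
def to_native(compressed_value, compression_ratio=2.0):
    """Convert a compressed capacity or transfer rate to its native equivalent."""
    return compressed_value / compression_ratio

# A "10GB" cartridge quoted at 2:1 compression holds 5GB of native data.
native_gb = to_native(10)

# Recompute cost per MB on native capacity (decimal units assumed).
total_native_mb = 100 * native_gb * 1000
native_cost_per_mb = 50_000 / total_native_mb
print(f"{native_gb:.0f}GB native per cartridge, ${native_cost_per_mb:.2f} per MB")
```

The same conversion applies to transfer rates, so a drive quoted at a compressed rate should be halved (at 2:1) before it goes into an aggregate-rate comparison.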
Another factor to consider is investment protection: Will the library you buy today withstand the test of time? This does not mean you should buy a library that is multiple times larger than your current needs. But it does mean that you may want to consider a solution that is "expandable" or "scalable"--one that allows you to add cartridge capacity and drives as your requirements change.
The ability to convert between different drive technologies is also important. Your current drive technology may be succeeded by a newer, higher-performing, higher-capacity technology. You might also decide that a completely different drive technology is better for your application, or that a second type of drive technology complements the first in certain situations.
One solution is a library that supports various drive and media technologies. A "mixed-media" library does not become obsolete when your drive technology does. Also, it allows you to take advantage of a wider spectrum of drive technologies by providing a means to use them together or to migrate from one type to another in the same library.
Another selection criterion for tape libraries is connectivity to the platform on which your application software is running. This is often not as simple as you may think. Some libraries use a proprietary command interface, or a "closed"-system architecture. Others use a standard command interface, or an "open"-system architecture. Open systems, such as libraries that support SCSI connections and command interfaces, offer the most flexibility and are easy to use. Closed systems, on the other hand, require special application software, which must keep pace with changes to the library's proprietary command interface. An open architecture allows you to change hardware platforms, application software, and other components without waiting for special updates or changes to support them.
Also, select libraries that have separate data and control paths. Separate paths spare users the headache of debugging complex system problems, while maximizing overall system performance. Libraries use very little bus bandwidth compared to tape drives. When the library control interface is separated from the data path, library commands are not held up behind drive data traffic, and they do not disturb requests being serviced by other drives in the library. A separate control interface also helps avoid crippling the entire library should one or more drives have an interface problem or bottleneck. And typically a more cost-effective interface can be used for the library itself, reserving higher-performance and more expensive interfaces for the data flow.
Another selection criterion is performance, which vendors define in different ways (e.g., mounts per hour, exchanges per hour, swaps per hour, or moves per hour). The common denominator is how fast the library moves a tape cartridge from one position to the next. Some libraries perform a single move in under five seconds, or some 720 moves per hour. A mount, exchange, or swap involves first removing a cartridge from the desired target location, then placing another cartridge there. This is two moves, so the same library performs about 360 mounts/exchanges/swaps per hour. When comparing library performance, be sure to use consistent rates.
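To put differently worded vendor specs on the same footing, it can help to reduce everything to seconds per move and convert from there. The following is a minimal sketch with hypothetical function names; it assumes, as described above, that a mount, exchange, or swap costs two moves.

```python
def moves_per_hour(seconds_per_move):
    """Single cartridge moves the robot can complete in one hour."""
    return 3600 / seconds_per_move

def mounts_per_hour(seconds_per_move, moves_per_mount=2):
    """Mounts, exchanges, or swaps per hour, each assumed to take two moves."""
    return moves_per_hour(seconds_per_move) / moves_per_mount

print(f"{moves_per_hour(5):.0f} moves per hour")
print(f"{mounts_per_hour(5):.0f} mounts per hour")
```

A vendor quoting 720 "moves" and another quoting 360 "exchanges" may therefore be describing the same robot speed.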
Directly related to performance is reliability. A library might be able to move tapes around in five-second clips, but for how long? The two most common ways of defining reliability are mean time between failures (MTBF) and mean cycles (swaps, exchanges, etc.) before failure (MCBF). The two are not the same, and neither one alone paints a complete picture of reliability.
MTBF is intended to reflect how long, on average, a library will operate before it fails. The problem with this description is the term "operate" and how it relates to real library usage. "Duty cycle" is sometimes used in conjunction with MTBF to indicate what percentage of the time the library is actually moving cartridges, and not just powered on. There is a big difference between an MTBF of 25,000 hours at a 20% duty cycle and at an 80% duty cycle. This is the shortcoming of MTBF ratings. Vendors sometimes give MTBF numbers with no indication of the duty cycle, so you have no idea how the library compares with other libraries. Most MTBF numbers are based on average library usage, which reflects the fact that though the libraries may be on 24 hours a day, they only mount cartridges for a few hours each day.
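One hedged way to see why the duty cycle matters is to estimate how many of the rated hours the robot actually spends moving cartridges. The sketch below assumes the MTBF figure counts powered-on hours and that the duty cycle is the fraction of those hours spent in motion; vendors may define both terms differently, and the function name is hypothetical.

```python
def active_hours(mtbf_hours, duty_cycle):
    """Hours of actual cartridge movement implied by an MTBF at a given duty cycle."""
    return mtbf_hours * duty_cycle

for duty in (0.20, 0.80):
    hours = active_hours(25_000, duty)
    print(f"25,000-hour MTBF at {duty:.0%} duty cycle: {hours:,.0f} hours moving cartridges")
```

Under these assumptions, the same 25,000-hour rating covers four times as much mechanical work at an 80% duty cycle as at 20%, which is why an MTBF quoted without a duty cycle is hard to compare.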
MCBF is used to express reliability in terms of what a library does--move cartridges. This rating estimates how many times a library mounts or exchanges cartridges before a failure occurs (e.g., 1,000,000 MCBF). As with MTBF, the MCBF often fails to indicate the expected duty cycle. If the duty cycle is high, a failure may occur in fewer cycles. Again, these values are an expression of a mean and do not account for all operating conditions. The key is to use similar reliability standards when comparing libraries.
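An MCBF rating becomes easier to compare once it is translated into calendar time for your own workload. The sketch below does that with a hypothetical helper; the daily mount counts are invented for illustration, and the result is a mean, not a guaranteed service life.

```python
def days_to_mcbf(mcbf_cycles, mounts_per_day):
    """Mean number of days of operation before a failure, at a steady daily workload."""
    return mcbf_cycles / mounts_per_day

# Hypothetical workloads against the 1,000,000 MCBF example.
for mounts_per_day in (200, 2_000):
    days = days_to_mcbf(1_000_000, mounts_per_day)
    print(f"{mounts_per_day:,} mounts per day: about {days:,.0f} days ({days / 365:.1f} years)")
```

Running the same calculation for each candidate library, with the same assumed workload, keeps the reliability comparison on a consistent basis.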
Other selection criteria include vendor support, warranties, and typical repair times, all of which contribute to the overall cost of ownership. All tape libraries come with warranty and service options, but terms and conditions vary widely.
The final area of differentiation among tape libraries is usability. A library should be easy to install and configure. Ideally, it should configure itself. While this type of installation is available for some libraries, you should at least look for minimal manual interaction, or libraries that are installed by trained personnel.
Another usability issue is the library's user interface. A good user interface should be intuitive and easy to use and should be rich in features for complete control. Features to look for include library status and command history information, setup options and utility functions, diagnostic and service routines, and help information.
Other details that enhance the usability of libraries are tracking and management features. Libraries that support bar-coded cartridges make inventory quick and automatic. Also, look for libraries that allow you to eject and insert cartridges during normal library operation. Such "insert/eject stations" accommodate tens of cartridges. In addition, make sure you can manually locate, retrieve, and mount cartridges. The library should also allow easy physical access to the tapes.
In short, begin your evaluation by choosing the application software and drive technology that meet your application requirements. Consider libraries that offer investment protection--ones that are scalable, support a mix of technologies, are easy to use, and have an open architecture--and look for the best performance and reliability available.
Key Cost of Ownership Factors
Major components of the total cost of a library include the initial purchase price, integration costs, and the cost of converting data to a new format, if necessary.
- Initial purchase price
The initial cost of a tape library often deters its acceptance. Adding the costs of the appropriate software, a full complement of media, and a host of options significantly boosts the apparent initial investment.
- Integration costs
Careful integration of the library subsystem and the host is necessary to ensure optimal performance. Hardware and software integration may require additional development work to accommodate specific system architectural or user needs.
- Data conversion
Converting existing data to a new medium for library automation can amount to one of the greatest costs of installing a library. Accurate and timely conversion of documents and files requires careful planning, which necessitates a thorough understanding of the user's information as well as system capabilities and limitations.
Key considerations during the conversion process include data security, indexing, and retrieval algorithms. The conversion process can be time-consuming, expensive, and error-prone--and often impedes market acceptance.
Source: Freeman Assoc.
Rod Wideman is the program manager of library engineering at EMASS Inc. in Englewood, Colorado.