Solution is based on a disk-tape hierarchical storage management system and AIT tape units.
BY THOMAS UHL
In a world of ever-increasing data growth, the question becomes: "Where can you store all of this data?" This was the dilemma we faced at Bosch Datacenter in Stuttgart-Schwieberdingen, Germany. Our primary storage systems were about to burst at the seams due to a deluge of data, so we decided to consolidate the storage subsystems to limit the steadily increasing costs and labor associated with data storage and management.
Bosch manufactures automotive equipment, including brakes, fuel-injection technology, and driver information systems, as well as products in the communications, electronics, appliance, and automation markets. The company consists of nearly 250 subsidiaries and associate companies in 48 countries. It operates 190 sites worldwide, 145 of which are outside Germany. In Germany, Bosch has approximately 98,000 employees in 56 branch operations, in addition to its headquarters in Gerlingen.
Data management at Bosch was based on the principle that each location should handle its own networks and servers as well as its own on-site technical support. The Datacenters, as they are called at Bosch, handle their own data processing, with the company's headquarters responsible for networking among the various locations.
However, this scenario will not last. Dispersed server and storage areas will increasingly reach their limit in terms of performance and maintenance costs, and sooner or later there will be no way to avoid consolidating server and storage subsystems. The flood of data has led to an annual storage growth rate of between 50% and 100%, which calls for a storage-consolidation strategy.
The Schwieberdingen Datacenter has taken a leading role: it is one of the few Bosch Datacenters that also manages remote sites, and it is one of the Bosch locations that has advanced the idea of storage consolidation. File serving of employees' application programs and the MS Exchange mail system had already been consolidated.
The Schwieberdingen Datacenter has eight Windows NT file servers and an EMC Symmetrix disk array with a capacity of approximately 4TB, serving nearly 8,400 employees. The system was originally installed with only 20GB of capacity and has continually been increased to handle the massive data growth.
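To put the growth figures in perspective, the following is a minimal sketch of compound growth. It assumes, purely for illustration, that the 50% to 100% annual growth rate cited earlier applies to the roughly 4TB array as a starting point; the function name and the three-year horizon are hypothetical and not taken from Bosch's own planning.

```python
def project_capacity(start_tb, annual_growth, years):
    """Return projected capacity in TB for year 0 through `years`.

    Assumes simple compound growth at a constant annual rate.
    """
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


# Assumed starting point: the roughly 4TB array described in the article.
low = project_capacity(4.0, 0.5, 3)    # 50% annual growth
high = project_capacity(4.0, 1.0, 3)   # 100% annual growth

for year, (low_tb, high_tb) in enumerate(zip(low, high)):
    print(f"year {year}: {low_tb:.1f}TB to {high_tb:.1f}TB")
```

Under these assumptions the array would need somewhere between 13.5TB and 32TB after three years, which illustrates why simply adding disk capacity was not seen as sustainable.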
The mail system, serving 14,000 users, faced similar rapid growth. Almost 500GB of data had accumulated on the MS Exchange server farm. Those in charge of the Datacenter realized that continually adding hard disk capacity was an unreasonable approach because of cost and performance issues. Therefore, the Memory Space Optimization project was started to get a handle on the problem through a bundled solution.
It was clear from the beginning that the best solution was a hierarchical storage management (HSM) system, which migrates older and less frequently accessed data from the hard disks to less-expensive tape media. To achieve this goal, special software was first developed for the mail system, the Exchange link server, and the library server. The software scans the mail servers once a day and automatically sends to tape any mail larger than 10KB that remains unopened for more than four weeks. Approximately 1 to 2GB of data are migrated daily, and the process is transparent to users. The migrated mail remains in users' Outlook folders, marked with a special symbol, and can be recovered at any time or permanently deleted if necessary.
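The mail migration rule can be sketched in a few lines. This is not the software developed at Bosch, only an illustration of the stated policy (larger than 10KB, unopened for more than four weeks). The `Mail` record, the function names, the sample mailbox, and the reading of 1KB as 1,024 bytes are all assumptions, and "unopened" is modeled here as a simple flag measured from the time of receipt.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_SIZE_BYTES = 10 * 1024         # "larger than 10KB" (assuming 1KB = 1,024 bytes)
MAX_IDLE = timedelta(weeks=4)      # "unopened for more than four weeks"


@dataclass
class Mail:
    """Hypothetical mail record; field names are illustrative only."""
    subject: str
    size_bytes: int
    received: datetime
    opened: bool = False


def should_migrate(mail: Mail, now: datetime) -> bool:
    """Apply the size and age thresholds to a single mail item."""
    too_old = now - mail.received > MAX_IDLE
    return mail.size_bytes > MIN_SIZE_BYTES and not mail.opened and too_old


def daily_scan(mailbox, now):
    """Return the mail items the once-a-day scan would send to tape."""
    return [mail for mail in mailbox if should_migrate(mail, now)]


now = datetime(2001, 6, 1)  # arbitrary example date
mailbox = [
    Mail("old report", 250_000, now - timedelta(weeks=6)),
    Mail("short note", 2_000, now - timedelta(weeks=6)),
    Mail("recent file", 250_000, now - timedelta(weeks=1)),
    Mail("read long ago", 250_000, now - timedelta(weeks=6), opened=True),
]
print([mail.subject for mail in daily_scan(mailbox, now)])
```

Only the large, old, unopened item qualifies in this sample. A production system would also leave a stub in the user's mailbox so the item can be recalled, which this sketch omits.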
Bosch chose the Infinistore Virtual Disk (IVD) system from Grau Data Storage. The IVD is an integrated automation system that enables storage of large quantities of data, scalable to 20TB. The system integrates tape drives, media, a robotics mechanism, a Windows NT server, RAID-5 hard disk storage, and management software.
The Infinistore system is available with Exabyte Mammoth-2 or Sony AIT-2 tape drives. The Schwieberdingen Datacenter chose Sony's AIT technology, which provides storage capacities up to 50GB (uncompressed) per cartridge.
Fast access, high reliability
We needed a system that combined fast access with very high reliability. AIT technology supports partitions on the tape, which saves time otherwise spent rewinding to a defined start or end point. Another advantage is the memory chip integrated into each cartridge, which stores user-defined information that helps locate data in a fraction of the time required with other tape drives. AIT's high reliability was an even more important criterion at Bosch. Tape abrasion is close to zero, which was a key issue in our tape technology decision.
Stress and performance tests recently conducted at Audi Ingolstadt confirm that AIT-2 technology is very efficient and reliable. Tapes and drives underwent a total of 110,000 tape runs and 5,500 mounts per drive with no read or write errors noted. Performance tests confirmed write rates of about 3.8MBps for uncompressed files and up to 10.2MBps for compressed files. The read rates were about 4.2MBps for uncompressed files and approximately 11MBps for compressed files. A tape takes about 10 seconds to load and become ready for use. The slowest access time noted with a loaded cartridge was 62 seconds.
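A quick back-of-the-envelope calculation shows what these write rates mean in practice. The sketch below assumes decimal units (1GB = 1,000MB), a single drive streaming at the measured uncompressed write rate, and the upper end of the 1 to 2GB daily mail migration mentioned earlier; the function and constant names are illustrative only.

```python
def transfer_seconds(size_gb: float, rate_mb_per_s: float) -> float:
    """Time to move `size_gb` at a sustained rate, assuming 1GB = 1,000MB."""
    return size_gb * 1000 / rate_mb_per_s


CARTRIDGE_GB = 50            # AIT-2 uncompressed cartridge capacity
WRITE_UNCOMPRESSED = 3.8     # MBps, measured uncompressed write rate
DAILY_MIGRATION_GB = 2       # upper end of the daily mail migration volume

fill_hours = transfer_seconds(CARTRIDGE_GB, WRITE_UNCOMPRESSED) / 3600
daily_minutes = transfer_seconds(DAILY_MIGRATION_GB, WRITE_UNCOMPRESSED) / 60

print(f"Filling one cartridge: about {fill_hours:.1f} hours")
print(f"Daily mail migration: about {daily_minutes:.1f} minutes")
```

On these assumptions, a single drive needs a few hours to fill a cartridge but only minutes to absorb a day's migrated mail, so the daily migration window is not the bottleneck.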
The IVD is built on a two-tiered storage architecture: a scalable RAID-5 hard disk system, with data migrated to tape media according to HSM rules. The archiving system integrates seamlessly into the NT file system and appears as an external logical drive, which users access the same way they access the local hard disk. If a Bosch Schwieberdingen employee moves files or folders to the virtual drive via Windows Explorer, for instance, the files are initially written to the 100GB RAID-5 cache. The HSM software migrates the files to tape daily at a predetermined time.
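The two-tier behavior can be illustrated with a toy model. This is a conceptual sketch, not the Infinistore implementation: the `VirtualDisk` class and its methods are hypothetical, both tiers are plain in-memory dictionaries, and the choice to free the cache after migration and to recall files into the cache on read is an assumption about how such a system typically behaves.

```python
class VirtualDisk:
    """Toy two-tier store: a disk cache in front of a tape tier."""

    def __init__(self):
        self.cache = {}   # stands in for the RAID-5 disk cache
        self.tape = {}    # stands in for the tape library

    def write(self, path, data):
        """New files always land on the disk cache first."""
        self.cache[path] = data

    def migrate(self):
        """Scheduled job: copy cached files to tape, then free the cache."""
        moved = sorted(self.cache)
        self.tape.update(self.cache)
        self.cache.clear()
        return moved

    def read(self, path):
        """Serve from cache if present; otherwise recall from tape."""
        if path in self.cache:
            return self.cache[path]
        data = self.tape[path]      # raises KeyError for unknown paths
        self.cache[path] = data     # recalled files are staged on disk again
        return data


disk = VirtualDisk()
disk.write("reports/q1.doc", b"quarterly figures")
disk.migrate()                       # the daily migration run
print(disk.read("reports/q1.doc"))   # the caller sees the same file either way
```

The point of the sketch is that the caller uses the same `read` call regardless of which tier holds the file, which is what makes the virtual drive look like an ordinary disk to users.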
Two IVD systems are in operation at Bosch Schwieberdingen, each with four AIT-2 drives and 7.5TB of capacity. The systems are installed at separate locations. One serves as the production system for the file service, the other as the production system for the mail server. The files are replicated to the other IVD each hour. The replication function influenced our decision to purchase the IVD because it was important to us to save labor. This can be achieved only if some of the steps taken in the first stage are eliminated in the second. The largest task in the first stage is data backup, which is eliminated in the secondary storage system because of the IVD system's ability to mirror data.
Thomas Uhl is a team leader at the Bosch Datacenter in Stuttgart-Schwieberdingen, Germany.
Case study at a glance
Storage problems: Rapidly escalating capacity growth and storage management costs
Goal: Storage consolidation and data replication
Solution: Grau Data Storage's Infinistore Virtual Disk (IVD) disk-tape system with hierarchical storage management (HSM) migration software and Sony AIT tape technology