Tuning RAID for DBMS
IT organizations must devote more time to storage, particularly database applications, and tune storage subsystems to optimize application performance.
By Sean Derrington
DBMS vendors are focusing development on warehouse query processing, support for complex data types, and Web integration. By 1999/2000, DBMS vendors will have begun integrating selected middleware and application functionality directly into their DBMSs.
Meanwhile, IT organizations continue to struggle to optimize database performance, particularly in very large database (VLDB) applications. Most organizations have come to terms with the inevitable, and often uncontrollable, growth of data (upwards of 100% per year for data warehouse installations) throughout the enterprise. Furthermore, organizations are confused by the message some vendors are telling users: Mirror the entire database. Mirroring is likely an unnecessary expense (the DBMS could perform better if tuned properly) and is possible now that DBMS vendors are relinquishing stringent control of file placement.
Through 1999/2000, as the size of DBMS applications continues to increase, the opportunity to place specific database objects (files) on specific devices will become more feasible. Through 2001/2002, Fibre Channel (and SSA within AIX environments) will have become widely deployed, and organizations will have the ability to consolidate multiple DBMS storage requirements. In turn, tuning DBMS applications for storage I/O will become a paramount task for all database administrators and storage managers.
Furthermore, as organizations adopt a federated DBMS architecture, understanding the object I/O characteristics of each DBMS and matching them to appropriate storage I/O characteristics will become valuable. Organizations should begin by knowing the five key attributes to examine when evaluating storage I/O requirements:
- I/O size
- Access pattern
- Access frequency
- Read/write ratios
- Availability requirements
Since each object possesses all five of these attributes, yet emphasizes different criteria, multiple RAID characteristics pairing are often yielded (e.g., high I/O access frequency may be more applicable to RAID-5 than large I/O block size). To achieve optimal database performance, it is imperative that organizations match the DBMS object characteristics to RAID characteristics.
However ideal, placing each object type on a separate device may not be feasible because of cost constraints, and compromises must be made. It is the tight teaming of database administrators and storage managers that will enable organizations to achieve highly tuned database applications.
Looking at the Oracle database as an example, there are four things that generate I/O: the database writer, log writer, shadow/server, and archive process. Within these four I/Os, three types of objects are generated: control files, redo logs, and tablespaces. In addition, multiple tablespaces exist and require separate consideration: system, temporary, rollback segment, and user tablespace. Therefore, a total of six Oracle object types exist, each with unique I/O characteristics as well as a possibility of seven different RAID levels. The following identifiers will help match each object to the appropriate RAID level:
- Redo logs: Sequential, large block writes during normal activity; sequential reads during log archival and recovery process; heavy access; reliability and availability are very important.
- Control files: I/O access only during major events (startup, shutdown, checkpoint execution); therefore, mostly write activity with low performance demands; small file size; reliability and availability are very important.
- System tablespaces: Primarily random read access with low I/O rates; large capacity requirements; reliability and availability are more important than performance.
- Temporary tablespaces: Typically, large sequential access with 50/50 read/write ratio; capacity requirements are extremely important; reliability and availability are not.
- Rollback segment tablespaces: Tricky to determine, depending on system cache; predominantly sequential writes, data rates get heavy moving to 50/50 read/write access with heavy concurrent use; performance is important; reliability and availability are very important.
- User tablespace: Application dependent; significantly different for data warehouse applications (heavily read oriented) versus OLTP (heavily write oriented). Binary large objects, or BLOBs (e.g., image or audio files), may also be within user tablespaces; read mostly; very large blocks.
Again, though desirable, it is not always plausible to place each object type on its own device. However, IT organizations must understand the golden transaction rule: One transaction generates index, table, temporary, and log activity. These four I/Os should be executed simultaneously, not sequentially.
Therefore, at a minimum, try to keep each object on separate physical disks.
Bottom line: For optimal database performance, IT organizations must intermix database administration rules with storage I/O knowledge to the process of physical mapping of objects (files) to storage devices.
The diagram is one example of how to tune a small portion of a storage subsystem, in this case MTI`s Gladiator 3200. This example examines two (control and redo logs) of six Oracle object types. The number of device interconnects (SCSI buses) must also be carefully examined so as not to create a single point of failure (SPOF). A SPOF is eliminated by using a single SCSI bus for each device within a given array, or each mirrored LUN. With the exception of RAID-0 or a single copy of a mirrored LUN, disks in the same array should never share SCSI buses.
For example, the mirrored (shadowed) LUNs must be on separate SCSI buses to eliminate a single point of failure.
Redo logs are sequential, write-intensive files that are created with each database transaction, except reads. A minimum of two redo logs are required, and they must be highly available. The two-drive stripe set improves sequential write performance. Its shadow copy ensures availability. The redo log is read-only when archiving or rebuilding the database.
Control files exhibit extremely low I/O activity. They are only written to during a major event, such as startup, close down, or when taking checkpoints. Performance is of no importance, but availability is very important. A two-drive mirror set fulfills this requirement, and the low activity rate enables it to share the same RAID controller with the redo logs.
Terms and definitions: L0, L1, L2, L3, L4, L5: logical disk partitions numbered 0 through 5. R0: RAID-0, or striping. R1: RAID-1, or mirroring. 2.1GB or 4.3GB: Raw capacity of each physical or logical partition. Note: Example is based on MTI?s Gladiator 3200 RAID array with 64MB cache.
Sean Derrington is a senior research analyst at the META Group consulting firm in Stamford, CT.