ASNP user case study: Chevron upgrades SAN with tiered storage

By Robb Dennis

October 5, 2005—Chevron has two US data centers—one in Houston and the other 1,600 miles away in San Ramon, CA. These data centers are the centralized service hubs for the business units located in their respective areas.

IT is handled by Chevron's Information Technology Company (ITC) unit, which provides IT services for the corporation's shared infrastructure. Chevron has more than 50,000 users in 186 countries. ITC is responsible for about 4,000 Compaq, Hewlett-Packard, and Sun servers worldwide running Unix, Windows, and Linux, and more than 50,000 HP desktops and IBM laptops, all running on a Cisco-based network. Each data center has its own metropolitan area network (MAN).

"At Chevron, IT is not allowed to drive the business," says Tony Jurgens, a storage technologist with ITC. "Instead, our customers—the various business units within the company—give us their objectives and we have to translate our IT actions into the realization of these business requirements." (Jurgens is a member of the Association of Storage Networking Professionals.)

In terms of storage, each data center has a mix of SAN, NAS, and direct-attached storage (DAS) from a range of vendors, including EMC, Hitachi Data Systems, and Network Appliance. The primary SAN resources are HDS 9960, 9980, and 9570 disk arrays, Brocade SilkWorm 3800, 2800, and 2900 Fibre Channel switches, and Emulex's Fibre Channel host bus adapters (HBAs).

Over time, storage and backup costs were rising steadily. ITC diagnosed the problem: Providing services for all data when much of that data was stagnant (e.g., unstructured data existing in spreadsheets, PowerPoint presentations, Word docs, JPEGs, etc., that had not been accessed in over a year). Out of 32TB of data in the Houston and San Ramon environments, a total of about 16TB turned out to be stagnant.
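A scan for this kind of stagnant data could be sketched as follows. This is a minimal illustration, not ITC's actual tooling: the function name `scan_for_stagnant_data` and the 365-day default are assumptions, and the approach relies on file-system last-access times, which some volumes do not update reliably.

```python
import os
import time

SECONDS_PER_DAY = 86400


def scan_for_stagnant_data(root, max_idle_days=365, now=None):
    """Walk root and report files not accessed within max_idle_days.

    Hypothetical helper for illustration only. Returns a tuple of
    (list of (path, size_in_bytes), stagnant_bytes, total_bytes).
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    stagnant = []
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total_bytes += info.st_size
            # st_atime is the last-access time reported by the file system
            if info.st_atime < cutoff:
                stagnant.append((path, info.st_size))
    stagnant_bytes = sum(size for _path, size in stagnant)
    return stagnant, stagnant_bytes, total_bytes
```

Dividing the stagnant bytes by the total bytes gives the kind of ratio ITC reported, about 16TB out of 32TB.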

"Storage and backup were way too expensive and time-consuming," says Jurgens. "We had about 50% stagnant data and that was really slowing us down. Plus we were adding an average of 2TB of data per day to our systems worldwide."

Having so much data in its production systems caused problems. In terms of recovery, for example, it took 8 to 10 hours to recover 200GB to 250GB of data. Further, the company didn't have any formal disaster-recovery processes in place.
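To put those recovery figures in perspective, the implied throughput can be worked out directly. The sketch below is illustrative arithmetic only; the function names and the 1TB example volume are assumptions, not figures from ITC.

```python
def throughput_range_gb_per_hour(gb_low, gb_high, hours_low, hours_high):
    """Return (slowest, fastest) throughput implied by the reported ranges."""
    slowest = gb_low / hours_high   # least data over the longest time
    fastest = gb_high / hours_low   # most data over the shortest time
    return slowest, fastest


def hours_to_recover(volume_gb, rate_gb_per_hour):
    """Hours needed to recover volume_gb at a constant rate."""
    return volume_gb / rate_gb_per_hour


# Reported figures: 200GB to 250GB recovered in 8 to 10 hours.
slowest, fastest = throughput_range_gb_per_hour(200, 250, 8, 10)
print(f"Implied recovery rate: {slowest:.2f} to {fastest:.2f} GB/hour")

# Hypothetical 1TB (1,000GB) recovery at those rates.
print(f"1TB recovery: {hours_to_recover(1000, fastest):.0f} to "
      f"{hours_to_recover(1000, slowest):.0f} hours")
```

The reported ranges work out to roughly 20 to 31GB per hour, so a hypothetical 1TB recovery would take between 32 and 50 hours at those rates.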

ITC was experiencing storage growth rates of 50% to 100% annually, well above its projections of 30% annual growth. ITC had to request additional capital and explain the higher-than-forecasted storage growth numbers. As a result, management demanded better storage management and reduced IT costs.
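The gap between forecast and actual growth compounds quickly. As a rough sketch, the snippet below projects capacity under the 30% forecast and the 50% and 100% observed rates. The 32TB starting point borrows the Houston and San Ramon figure cited above purely for illustration, and the three-year horizon and the function name are assumptions.

```python
def project_capacity(start_tb, annual_growth, years):
    """Capacity at year 0..years under compound annual growth.

    annual_growth is a fraction, e.g. 0.30 for 30% per year.
    """
    return [start_tb * (1 + annual_growth) ** year for year in range(years + 1)]


START_TB = 32   # illustrative baseline, not a company-wide total
YEARS = 3       # assumed planning horizon

for label, rate in [("forecast 30%", 0.30),
                    ("observed 50%", 0.50),
                    ("observed 100%", 1.00)]:
    final_tb = project_capacity(START_TB, rate, YEARS)[-1]
    print(f"{label}: {final_tb:.1f}TB after {YEARS} years")
```

Under these assumptions, 32TB grows to about 70TB at the forecast rate, but to 108TB at 50% and 256TB at 100%, which helps explain why ITC had to go back for additional capital.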

Jurgens analyzed the storage environment more closely, breaking it down into three key areas of storage management:

  • Capacity planning to forecast for added capacity;
  • Information/data protection; and
  • Data management, including what data should reside on which resources and how rapidly the data needed to be accessed.

"We needed one strategy to unify all three areas," says Jurgens. "That's why we opted for tiered storage."

Chevron adopted a strategy of moving dormant information to lower-cost storage resources. In this tiered storage environment, stagnant data is automatically moved from primary to secondary (nearline) storage devices via data-mover software.

To implement tiers of service for its users, ITC investigated newer technologies, such as disk-to-disk backup/recovery and snapshots, that it could integrate into its existing backup services. According to Jurgens, Chevron's tiers currently consist of the following:

  • Tier 1—HDS TagmaStore disk arrays to host databases, data warehouses, and other online transaction processing (OLTP) applications;
  • Tier 2—HDS 9200/9570 arrays and some NAS servers for file serving, seismic files, spreadsheets, and e-mail;
  • Tier 3—ATA-based disk arrays and tape for static data such as reports, images, and nearline archiving; and
  • Tier 4—Offline tape for inactive data required for legal retention.

The data-mover software used to migrate inactive data from Tier 2 to Tier 3 is EMC/Legato's DiskXtender 2000. Migrated data is not recalled to primary storage unless it is changed. Thus, when a user accesses an inactive file to look at the contents, it is read from the lower-cost storage device instead of being recalled to primary storage.
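The migrate, read-in-place, and recall-on-change behavior described here can be modeled in a few lines. This is a toy sketch of the policy only, not DiskXtender's API or implementation; the class, the function names, and the 365-day threshold are all hypothetical.

```python
from dataclasses import dataclass

PRIMARY = "tier2"    # assumed label for primary storage
NEARLINE = "tier3"   # assumed label for lower-cost nearline storage


@dataclass
class ManagedFile:
    name: str
    content: bytes
    days_since_access: int
    tier: str = PRIMARY


def migrate_if_inactive(f, max_idle_days=365):
    """Move an inactive file to nearline storage; return True if it moved."""
    if f.tier == PRIMARY and f.days_since_access > max_idle_days:
        f.tier = NEARLINE
        return True
    return False


def read_file(f):
    """Reads are served from whichever tier holds the file; no recall."""
    f.days_since_access = 0
    return f.content


def write_file(f, new_content):
    """A change recalls the file to primary storage, then updates it."""
    f.tier = PRIMARY
    f.content = new_content
    f.days_since_access = 0


report = ManagedFile("q3_report.xls", b"q3", days_since_access=400)
migrate_if_inactive(report)      # inactive for over a year: moves to nearline
read_file(report)                # read in place; file stays on nearline
tier_after_read = report.tier
write_file(report, b"q3-rev")    # modification recalls it to primary
tier_after_write = report.tier
```

Serving reads from the nearline tier keeps casual lookups from pulling old files back onto the more expensive primary storage.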

Network Appliance's NearStore R200 platforms make up the Tier 3 nearline storage, which is used for a wide variety of applications. For example, it stores Chevron's growing volume of unstructured data that hasn't been accessed in over a year. ITC estimates indicate that this project will have an internal rate of return (IRR) of 66% over three years.

The NearStore R200 disk arrays host several applications, including the following:

  • Oracle Recovery Manager (RMAN) to make disk backups of Oracle databases. RMAN provides the ability to back up the database to any media (e.g., tape or disk). The focus of the RMAN backup project currently is to back up the databases to disk. Once on disk, Legato Networker backs up the files as normal flat files.
  • Application hosting disaster recovery for mission-critical applications. This involves backing up databases to disk and then snap mirroring to another data center. This project uses NearStore R200s for mission-critical backups. The SnapMirror software that comes with the R200s copies data between data centers.
  • Exchange e-mail documents. This project, which is still in the prototype phase, stores e-mail documents (approximately 6TB) on NearStore R200s at the Houston and San Ramon facilities.

"By implementing a tiered architecture, ITC has achieved a 49% reduction in the storage rate charged to its business units," says Jurgens.

Prototyping hurdles
ITC conducted extensive prototyping on this project. One hurdle it had to overcome was that DiskXtender 2000's failback didn't work in an active-active Windows cluster. As a result, the supplier had to rewrite the DLL with help from Microsoft. From a migration standpoint, however, the software worked flawlessly and makes the archiving process transparent to users (e.g., archived documents look like documents in the production system). According to tests, the response time penalty for data retrieval is negligible.

ITC also had to do several equipment migrations to improve performance and reliability. For example, it switched from HDS 9200 to HDS 9570 arrays. The goal was to improve the reliability of the aging Tier 2 storage. Since then, ITC has experienced no unplanned outages.

In terms of storage management software, Chevron opted for HP's OpenView Storage Area Manager (OVSAM). The company is leveraging the software's ability to provide capacity planning reports, trending analysis, and asset reporting. This allows ITC to provide more accurate budget forecasting.

Another project dealt with backups for the company's NAS systems. This involved the implementation of LAN-less backup for Network Appliance's F9xx and R2xx arrays. This has already been implemented for the F9xx devices, with an overall improvement in backup speeds. Using LTO-1 tape drives, ITC has achieved 60GB per hour in backup-and-recovery speeds. When LTO-2 is implemented, it is expected to double the throughput.

ITC plans to continue the upgrade of its storage environment by implementing disk-to-disk backup, initially for e-mail. Jurgens explains the necessity of this move: "It took us three days to recover an e-mail system using tape backups, and we needed a more immediate method of recovery."

Chevron uses Zantaz software to archive Exchange documents to lower-cost storage.

Moving forward
ITC is currently reviewing its backup infrastructure, some of which is based on five-year-old technology. The organization needs to first understand the impact of integrating new technologies such as snapshots, disk-to-disk backup, and electronic vaulting. ITC is also reviewing the potential of putting its Oracle databases on Tier 2 NAS versus Tier 1 and Tier 2 SAN arrays.

In addition, ITC is reviewing the potential impact of server virtualization and how it would integrate with the overall storage infrastructure.

Robb Dennis is a freelance writer and a member of the Association of Storage Networking Professionals (ASNP).

The Association of Storage Networking Professionals (ASNP) is a worldwide member organization of storage networking end users. It provides an open forum for members to discuss real-world problems and solutions related to storage networking. Through its regional chapters and annual conference, ASNP offers educational training and networking opportunities. Its 2,000+ members have exclusive access to the association's online portal, which features case studies, white papers, and discussion forums. For more information, visit www.asnp.org.

This article was originally published on October 04, 2005