TCO and ROI are part of the justification for SANs, but for most organizations the real bottom line is measured in more-tangible benefits.
By Alan R. Earls
The U.S. Air Force's 45th Space Wing, headquartered at Patrick Air Force Base near Cocoa Beach, FL, exemplifies an almost literal "mission-critical" storage area network (SAN) customer. It supports and manages space launch programs for the Department of Defense, NASA, and commercial entities, and generates large amounts of launch-related data, all of which must be backed up or stored to protect against data loss.
"The threat of hurricanes and flooding on the Florida coast makes it absolutely critical that we back up our data," says Glenn Exline, manager of advanced technologies at the Space Wing. "With our previous storage system, backup took up to 14 hours every day. With our SANs, we've reduced that to only two hours a day."
The Wing's SAN success could also be considered an argument that storage networks can be justified by something other than a return on investment (ROI) study. The organization's SAN installation not only helped dramatically reduce the backup window, but it also helped consolidate its network from 52 servers to 35, simplifying server management. Furthermore, with fewer servers on its network, the Wing expects to spend less going forward to refresh its servers as part of the replacement cycle.
While some organizations have managed to put together ROI and TCO studies to support the argument for SANs, other organizations find that bean counting isn't worth the effort. Instead, they cite clear productivity gains, new opportunities, greater reliability, and other benefits that they say make SAN economics a no-brainer.
According to John Webster, founder and senior analyst at the Data Mobility Group consulting firm, when a SAN project is in the offing, "everyone starts showing up with an opinion—or a concern." Often, people are on the defensive, concerned that change of any kind could have a negative impact on how they do their job. "They come to the table saying, 'convince me.' Many of those people will be unmoved by any form of ROI argument. I tell vendors they need to help their customers allay the fears of internal customers, particularly corporate security officers and database administrators, if they are going to sell SANs successfully," Webster says.
An illustration of the need to sell internally—and the potential results—is provided by Bob Massengill, manager of technical services at the Wake Forest University Baptist Medical Center (WFUBMC), a medical center and teaching hospital in Winston-Salem, NC. Massengill's organization implemented its first SAN in 2000, primarily to address two problems: an e-mail system that kept suffering from disk failures, and a desire to spend prudently when the mainframe disk systems went off lease. "I was faced with the choice of staying with mainframe-only storage or going to a SAN and also needed to do something about managing the disks on hundreds of servers," Massengill explains.
Keeping everything up and running had become very labor-intensive. What's more, although many of the servers, particularly those running Windows NT, were configured similarly, "we kept running into the issue of getting halfway through a project on a particular server and running out of disk space." The only practical solution was to upgrade that server with more disk storage. Meanwhile, though, plenty of other servers had scarcely tapped their available storage—meaning that overall capacity utilization was very low.
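The stranded-capacity problem Massengill describes can be made concrete with a little arithmetic. The sketch below uses made-up server names and disk sizes (none of these figures come from WFUBMC) to show how one server can be nearly full while overall utilization stays low.

```python
# Hypothetical direct-attached storage figures, for illustration only.
servers = {
    "mail01": {"capacity_gb": 72, "used_gb": 70},
    "file01": {"capacity_gb": 72, "used_gb": 12},
    "app01": {"capacity_gb": 72, "used_gb": 9},
    "app02": {"capacity_gb": 72, "used_gb": 15},
}


def summarize(servers, full_threshold=0.9):
    """Return overall utilization, nearly full servers, and unused capacity."""
    total_capacity = sum(s["capacity_gb"] for s in servers.values())
    total_used = sum(s["used_gb"] for s in servers.values())
    nearly_full = sorted(
        name
        for name, s in servers.items()
        if s["used_gb"] / s["capacity_gb"] >= full_threshold
    )
    return {
        "overall_utilization": total_used / total_capacity,
        "nearly_full": nearly_full,
        "stranded_gb": total_capacity - total_used,
    }


summary = summarize(servers)
print(summary)
```

In this invented example, mail01 is out of room even though most of the combined capacity sits idle on other servers. A shared pool on a SAN lets that idle capacity be allocated wherever it is needed.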
But getting buy-in on the SAN from the various departments that felt they had ownership over particular servers wasn't easy, recalls Massengill. He sold the initial part of the implementation as a "proof of concept" built around an EMC Symmetrix array, which was originally set up to support just e-mail applications and the mainframe. "At that point we felt it would eventually grow tremendously, but from a funding standpoint we couldn't bank on it," says Massengill.
He needn't have worried. By late 2000, with the Symmetrix fully mirrored, disk failures stopped, mainframe I/O waits disappeared, and people retrieving old e-mails were "blown away" by the improved performance. "Word spread fast and pretty soon we had departments lining up and saying they wanted to come aboard," recalls Massengill. What he did have to worry about, in fact, was being able to grow fast enough. "We started putting more users on the Symmetrix and populating our 16-port McData switches, and within six months we had outgrown them," he recalls. That led him to upgrade to director-class switches from McData. "We put in a 64-port Intrepid and kept the two 16-port switches in place," he says. By then, storage on the SAN had rocketed from just over 2TB to about 8TB. It now stands at 45TB.
But the SAN had not yet solved all of Massengill's problems. For the hospital, delivering the highest level of patient care and medical education depends on safeguarding data and ensuring that hospital information systems are highly available. Additional urgency was provided by the expanding security and data retention requirements mandated by HIPAA.
"A key issue for us was data integrity," explains Massengill. "The Medical Center was experiencing repeated media failures and corrupted data on tapes. Because the hospital was leasing the tape drives, it decided to replace them with StorageTek T9940s when the leases expired early this year."
Concerns about capacity and centralization of backup operations prompted the IT department to search for a new solution. The Medical Center needed additional storage capacity for its Unix and NT servers. It also wanted to centralize backup operations in an environment that included an IBM zSeries 800 mainframe, a Hewlett-Packard NonStop Himalaya S7000 server, and a network of Unix and NT servers.
The mainframe houses the patient accounting system, while patient treatment records reside on the Himalaya. The Medical Center uses the Unix servers for PeopleSoft financial and human resources applications, Web, and intranet purposes. According to Massengill, the goals of the expanded SAN project were to increase security and integrity for medical records data, patient accounting records, and hospital administrative data; decrease the number of hours needed for daily backups; decrease the number of tapes needed for backups; and free the IT staff from tape mounting chores.
But, notes Massengill, the SAN wasn't intended to simply solve those immediate needs. As part of the new SAN, a StorageTek Virtual Storage Manager (VSM) system was added "to help us maneuver as the inevitable changes occur down the road," he says.
The expanded SAN was implemented early this year. "In October 2002, we brought the open systems servers over to StorageTek PowderHorn tape library silo, and in February 2003, we implemented StorageTek VSM and the mainframe piece going into the STK PowderHorn."
According to Massengill, incorporating the tape library in the SAN led to a decrease in tape count from about 30,000 to just 600. It also saved floor space and tape mount time. Before the project, MVS operators were mounting 1,000 to 1,200 tapes each day, he says. Mount times were approximately three minutes. "Today the staff has considerably fewer tapes to handle and mount times are in the sub-second range," he says. "That means IT staff has time for system monitoring, proactive maintenance, and training."
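A rough calculation shows the scale of those figures. The sketch below assumes every mount took the full three minutes and simply adds the mount times together, so the result is cumulative time across all drives and operators, not elapsed clock time.

```python
MOUNT_MINUTES = 3  # approximate mount time cited by Massengill


def daily_mount_hours(mounts_per_day, minutes_per_mount=MOUNT_MINUTES):
    """Cumulative hours per day spent on tape mounts."""
    return mounts_per_day * minutes_per_mount / 60


low = daily_mount_hours(1000)   # 50.0 hours
high = daily_mount_hours(1200)  # 60.0 hours

# Tape count fell from about 30,000 to 600.
tape_reduction = (30_000 - 600) / 30_000  # 0.98, or a 98% reduction

print(low, high, tape_reduction)
```

Under that assumption, manual mounting added up to roughly 50 to 60 hours of cumulative mount time each day before the project.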
Integrating the tape libraries and SAN also significantly decreased media failures and decreased backup time by three hours per day. "In the past," says Massengill, "batch cycles ran until 6 AM, so users were unable to access the system until those jobs were complete." However, hospitals begin admitting patients for day surgery at 6 AM, so if jobs ran beyond that hour it impacted the admissions and medical staff. "Today those batch jobs are finished by 3 AM, offering more system access to users," he says.
Performance is key
Becky Davidson, data manager for EDS/Sara Lee Bakery Group, in St. Louis, faced fewer complexities in justifying a SAN, but ultimately found similar proof points. She says the SAN was implemented to address several issues, particularly the need to increase the number of connections into the company's IBM Enterprise Storage Server ("Shark"). "We wanted to have dual paths and we wanted to attach a large number of machines," says Davidson. Minimizing the cabling hassles associated with SCSI was also a driving force behind the SAN. Yet another goal was increased performance.
"We didn't have much problem justifying the SAN either within the account or within EDS," says Davidson. In part that was because Sara Lee is growing rapidly, so it was easy to show that without a SAN the company would run out of attachable disk space. "Our customer was also very concerned about performance, and we could prove that multi-pathed Fibre Channel was faster than SSA or multi-pathed SCSI," says Davidson. The EDS team was also able to show greater savings per GB with a SAN.
Selecting a SAN vendor was also easy. "We run IBM servers connected to IBM Shark and IBM tape drives so we went with IBM," she says. Initially, they started out with three 16-port 1Gbps switches from Brocade—one for development and two for production. As they grew they added a fourth 16-port switch for backup traffic. Subsequently, they added three edge switches to connect a 21-drive tape library. Altogether, they now have two 32-port 2Gbps switches, three 8-port 1Gbps switches, and the four 16-port 1Gbps switches.
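Totaling the switch inventory Davidson lists gives a sense of the fabric's size. The sketch below only sums raw port counts from the figures above. It makes no assumption about how many ports are used for inter-switch links rather than hosts or storage, which the article does not state.

```python
# (switch count, ports per switch, link speed in Gbps), as listed in the text
switches = [
    (2, 32, 2),
    (3, 8, 1),
    (4, 16, 1),
]

total_ports = sum(n * ports for n, ports, _ in switches)

ports_by_speed = {}
for n, ports, gbps in switches:
    ports_by_speed[gbps] = ports_by_speed.get(gbps, 0) + n * ports

print(total_ports)     # 152
print(ports_by_speed)  # {2: 64, 1: 88}
```

The raw total is 152 ports: 64 at 2Gbps and 88 at 1Gbps.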
"The biggest benefit was flexibility," says Davidson. "If we want to add another system we can just plug it into a switch, zone it, and we're good to go."
On the downside, Davidson says that zoning can be complicated and the fabric concept can also be a little overwhelming. Fortunately, Brocade's management GUI lets Davidson check things out on the switches even though the data center is located somewhere else "so I can tell what is plugged in, what is active, what is having problems, and what isn't working."
Davidson says that the SAN solved a lot of problems, "but it does add more things to keep track of."
Even more upbeat in his SAN assessment is Gary Pilafas, a senior storage and systems architect at United Loyalty Services, a unit of UAL Corp. (the parent of United Airlines), in Elk Grove Township, IL. United Loyalty Services operates an OLTP engine that requires data replication.
Pilafas uses Hitachi's ShadowImage to do point-in-time copies from one data center to another. With the SAN, says Pilafas, it is possible to replicate and then run queries against one server and reports against another. "You can't do that in a DAS [direct-attached storage] environment without a lot of pain," he says. Replication over an IP network is accomplished with the Fibre Channel over IP (FCIP) protocol.
In Pilafas' case, justifying the SAN was easy because the company had been locked into an expensive hosting arrangement with a storage services provider, so a complete ROI analysis happened very quickly.
Between 2001 and 2003, United Loyalty Services' SAN grew from 4TB to 25TB.
Pilafas says the SAN has also proven its worth because it provides high availability, which can support business continuance and business resumption capabilities. And the SAN also supports the united.com revenue engine—the e-commerce front-end for UAL Corp.
Not so simple
But sometimes even positive results can be colored by doubts. Xcel Energy, in Denver, developed a plan to implement a SAN in 2000. At the time the project began, "we thought we needed more things than perhaps we did," says David Czech, a storage systems engineer at Xcel. Xcel's SAN is based on a Hitachi Data Systems Freedom 9960 array, McData switches, and a StorageTek tape library.
Xcel's SAN goals were to consolidate storage, simplify management, and increase capacity utilization. In addition, the company wanted to be able to easily replicate to another facility and to share tape devices.
"Some of these things happened and some did not," admits Czech. While the SAN did make it possible to consolidate to fewer storage devices, nothing got simpler. "I don't think it made anything easier to manage, and now there are a lot of extra steps involved in allocating and troubleshooting things that are attached to the SAN," he says.
Still, Czech characterizes the problems as mostly annoyances. And he is now looking at software tools to help with SAN management and storage resource management (SRM). Longer term, Czech says he is looking at the overall corporate storage infrastructure and the potential for virtualization. He is also looking at the potential for consolidating RAID arrays in the SAN.
Richard Boud, technical account manager with Technica UK Ltd., an independent storage consultancy in the UK, sees broad and steady movement toward SANs among his customers. He says the primary benefits that customers see as drivers for implementing SANs are the following: flexibility in storage allocation; adapting to changing business requirements; increased functionality, especially backup enhancements (e.g., snapshots) and disaster recovery (e.g., remote replication); improved data availability; centralization, especially for common functions such as backup; and simplified storage management.
Boud says that ROI and TCO measurements "are talked about a lot, although probably more by vendors than anyone else. Most customers take ROI and TCO claims with a large pinch of salt. They don't have the time, tools, or accurate data to evaluate potential or actual ROI. Instead, they prefer to be presented with credible justifications and to make use of common sense to tell them where technology can genuinely improve their infrastructure."
Alan R. Earls is a freelance writer in Franklin, MA.
Consolidating SAN 'islands'
By Robert Strechay
The growing need to consolidate isolated storage area networks (SANs) stems in part from the fact that SANs were configured in an ad hoc fashion to support particular applications. These SAN "islands" sufficed for a while, but then applications were extended and storage requirements expanded.
At the same time, IT professionals' mandate was to "do more with less." Even as the number of SAN islands swelled with more traffic and data, fewer dollars were available to hire experts to re-architect the SANs.
Consequently, companies are recognizing that the only way to grow while achieving return on investment (ROI) is to consolidate their data-center resources and applications, increasing the density and scalability of their storage networks. But this requires careful planning of future storage networks that will scale beyond the island concept.
Companies vary in how much emphasis they place on business processes that reduce downtime, management complexity, and cost. These factors are often not part of the SAN design process, yet they should be.
Regardless of application and business needs, most IT professionals are busier than ever, thanks to the economic downturn. Layoffs and downsizing have sent personnel scrambling to support more and more business units, yet applications continue to roll out and data accumulates hourly.
Due to a shortage of resources, many IT departments took shortcuts with their early SAN deployments. They counted on performance gains without additional management overhead and troubleshooting. When some SANs didn't perform well, or traffic grew too rapidly, departments often hooked up their own direct-attached storage (DAS). If the applications using DAS outperformed the SAN-attached applications, the business units would demand that corporate IT revert to supporting DAS for their applications.
The applications probably would have worked fine on the SAN if it had been properly designed. Unfortunately, few IT shops had the necessary expertise. And the way SANs were marketed didn't help either. SAN equipment was often sold through OEMs—usually the major system vendors. In this context, SAN equipment was a sliver of the overall purchase. With vendors and customers giving little thought to future growth of the storage network infrastructure, no wonder many early SAN installations disappointed.
Yet, many other OEM-supplied SANs worked fine. They met many customers' need to support application deployments and to inexpensively leverage existing storage arrays. The users also could leverage their OEM service contract to get help with deployment and ongoing SAN management.
Storage network maintenance—ranging from executing moves, adds, and changes to adding or combining logical unit numbers (LUNs)—often requires a service call and an on-site visit from an OEM technician. This OEM dependence reduces the IT department's visibility into storage network issues.
Outgrowing a SAN usually means adding another island, or growing the existing SAN with additional switches and inter-switch links (ISLs), which can be complex and lead to interoperability and performance problems. Unpredictable ISL performance has in part led to the deployment of isolated SAN islands.
In addition, SAN management is 5 to 10 years behind traditional LAN and WAN network management. Common LAN management functions, such as notification of congestion or other faults, are not widely available on SAN switches.
Although SAN islands have proven value, many companies are outgrowing them. As they add more and more SAN islands, the higher cost of management reduces the ROI of the islands already deployed. It's time to plan for strategic, long-term storage networks.
Getting everyone on the same SAN consolidation team is difficult due to staff shortages and political issues. Few talented storage professionals exist: There's perhaps one storage architect for every 10 LAN/WAN network architects. This reflects the fact that storage is not viewed as part of the network side of IT but, rather, as an outgrowth of the applications side.
Storage has transitioned into the server administrator's domain. But in most cases, server administrators are too busy to learn enough about storage networks. Compounding matters is the longstanding mistrust between server administrators and traditional network administrators (neither group thinks the other is responsive enough), which further complicates SAN consolidation projects.
Despite these difficulties, some organizations are beginning to respond to the fallout from SAN island growth by looking at their storage networks the same way they do their LANs and WANs. The political rationale is obvious: Business units are happy with the LAN and WAN services, so let's treat SANs the same way. IT organizations also appreciate the reliability, predictability, scalability, security, and visibility they gain by having a unified architecture, as opposed to an ad hoc collection of separate networks.
Put the "N" back in SANs
Today's SAN islands typically have fewer than 100 ports, although many IT shops anticipate growth into hundreds of ports. SAN consolidation refers to either connecting different SANs together or collapsing multiple islands into larger, better-coordinated SANs with shared resources.
Further stimulating the drive toward SAN consolidation are new disk array architectures that allow multiple applications to take advantage of multiple front-end ports. This is another reason organizations are looking to consolidate their data centers, servers, and storage. In doing so, they should consider re-architecting their storage networks for the future. This is especially true for organizations that are considering implementing "storage pooling" or "virtualization," which can complicate some of the issues currently present in SAN deployments.
By treating SANs more like LANs or WANs, storage network architects can operate with a higher internal profile and be able to consult with business units, management, and other IT units. This will help them do a better job of planning for SAN scalability, manageability, security, and availability.
Several switch vendors are addressing some of these issues with a new class of "intelligent" switches, characterized by better management, better network segmentation, high availability, and scalability. But whatever additional SAN gear you consider, think in terms of building a storage backbone network and deploying resources in a more controlled manner.
As corporations grow, so must their SANs. Until recently, SANs were not very flexible, and most were proprietary. But the industry is maturing and vendors are introducing more-flexible SAN products.
They are also working together in various associations, such as the Storage Networking Industry Association (SNIA), to develop standards. These are all good signs, but for SANs to consolidate and grow they need to be designed in a different manner than they have been so far.
Companies need to understand their application requirements and then explore a backbone network with a tiered architecture. By treating SANs more like LANs and WANs, they will not only produce more durable and expandable architectures, but also enlist more support and input from other IT groups and business units.
Robert Strechay is director of technical marketing and chief customer advocate at Sandial Systems (www.sandialsystems.com) in Portsmouth, NH.
Make the most of your HBAs
By Mike Smith
To some end users, a host bus adapter (HBA) provides only a standard physical and data connection from a server to a storage network. But storage administrators know that HBAs gather information about the status and activities of the entire storage area network (SAN), and that tapping that information can help them manage more storage, more reliably, at less cost.
A Fibre Channel HBA is like a computer on an add-in card, with its own processor and real-time operating system. Unlike a network card, which simply passes packets between the network and the server, an HBA handles much of the protocol processing that a network card would leave to the server CPU, and it keeps track of the state of every I/O transaction.
For example, when an application on the server issues a request to the storage network, the server operating system passes that request to the HBA driver, which packages the request, puts it in an I/O queue, and then transmits it to a storage device or switch.
Throughout this process, the HBA driver maintains an awareness of the health of all SAN components and links, which can play a role in reducing costs while increasing the reliability and performance of networked storage.
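The request flow described above can be pictured with a toy model. The class below is purely illustrative: the names ToyHBADriver and IORequest are invented for this sketch, and it does not reflect any vendor's actual driver code. It shows only the three steps named in the text (package, queue, transmit) plus per-request state tracking.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count


@dataclass
class IORequest:
    tag: int          # identifier used to track this transaction
    op: str           # "read" or "write"
    lba: int          # starting logical block address
    blocks: int       # number of blocks
    state: str = "queued"


class ToyHBADriver:
    """Hypothetical model of a driver that queues and tracks I/O requests."""

    def __init__(self):
        self._tags = count(1)
        self.queue = deque()     # requests waiting to be sent
        self.outstanding = {}    # requests sent but not yet completed

    def submit(self, op, lba, blocks):
        """Package a request from the operating system and queue it."""
        req = IORequest(next(self._tags), op, lba, blocks)
        self.queue.append(req)
        return req.tag

    def transmit_next(self):
        """Send the oldest queued request toward the storage device or switch."""
        if not self.queue:
            return None
        req = self.queue.popleft()
        req.state = "in_flight"
        self.outstanding[req.tag] = req
        return req

    def complete(self, tag):
        """Record that the target has finished the request."""
        req = self.outstanding.pop(tag)
        req.state = "done"
        return req
```

Because every request carries a tag and a state, the model always knows which transactions are waiting, in flight, or finished, which is the kind of bookkeeping the article attributes to the HBA.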
The up-front price of a SAN can be dwarfed by the ongoing costs of managing, troubleshooting, and upgrading the network. These costs rise each year as the amount of data increases and the storage network becomes more complicated as users centralize data. Yet with all these demands, companies are under pressure to cut their storage management budgets.
Here's how choosing the right HBA can help:
Simplified firmware updates—Updating firmware for a single HBA is not a complicated process, but making sure the firmware for every HBA in a SAN is at the right revision level can be a daunting task—especially in SANs with hundreds of servers. Centralized management methods can monitor firmware revisions on both local and remote HBAs and can allow remote and simultaneous firmware updates on as many HBAs as needed. By eliminating the need to upgrade firmware server-by-server and allowing administrators to use a single tool to manage devices across the SAN, companies can cut management costs and increase SAN uptime.
Driver compatibility across product generations—Compatibility of drivers across multiple generations of HBAs enables storage administrators to install the drivers needed to keep the SAN running (or boost its capabilities) without going to the expense and effort of replacing HBAs. It also allows for a gradual introduction of next-generation adapters with minimal disruption. This simplifies management, protects hardware investments, and reduces the downtime needed to replace HBAs.
Reduced complexity—No storage administrator wants to scan multiple consoles to learn what's wrong in a storage network, much less physically track down failed devices. Applications that provide management of multiple types of SAN hardware allow administrators to reduce the number of applications required to manage the SAN, which reduces complexity and administration time. For example, an administrator could use the same management application to rezone a SAN and update HBA firmware. The benefits are increased administrator productivity and SAN availability and performance.
Centralized storage management—Intelligent HBA firmware can allow managers to reset or re-initialize ports on all the HBAs in a SAN, test data paths from a single point, and remotely upgrade firmware. This reduces management costs and planned downtime.
Problem detection/prevention—The real-time monitoring done by HBAs allows them to instantly detect physical failures of SAN links. It may take a server operating system tens of seconds to recognize such a failure. But the HBA can detect a problem within a second and instantly notify software agents running on the server to redirect the request through a functioning link.
This reduces downtime by speeding the fail-over process from a failed component to a working counterpart.
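The fail-over behavior in the last item can be sketched in a few lines. This is a simplified illustration, not any multipathing product's logic: the path names are invented, and link status is assumed to arrive as a plain dictionary that an HBA-aware agent keeps up to date.

```python
def pick_path(paths, link_up):
    """Return the first path whose link is reported up, or None if all are down."""
    for path in paths:
        if link_up.get(path, False):
            return path
    return None


# Hypothetical dual-path configuration, in order of preference.
paths = ["hba0-switchA", "hba1-switchB"]
link_up = {"hba0-switchA": True, "hba1-switchB": True}

before = pick_path(paths, link_up)

# The HBA reports a physical link failure on the preferred path.
link_up["hba0-switchA"] = False
after = pick_path(paths, link_up)

print(before, after)
```

The faster the link status is updated, the sooner requests move to the surviving path, which is why sub-second detection by the HBA matters.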
Most HBA vendors support the industry-standard Common HBA API, which provides information such as model numbers, firmware revisions, and topology for HBAs, but not other SAN components. However, HBA vendors are partnering with storage subsystem, switch, and management software vendors to use the information gathered by HBAs to provide SAN-wide monitoring and management.
Most HBA vendors also support management standards such as the Fabric Device Management Interface and security standards such as the Fibre Channel Authentication Protocol (FCAP) and Authenticated CT (Common Transport).
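As an illustration of what centrally collected firmware revisions make possible, the sketch below flags adapters that are behind a required revision. The inventory, model names, and version strings are invented, and the code assumes dotted numeric firmware versions. It does not call the Common HBA API or any vendor tool.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.82.0' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def find_outdated(inventory, required):
    """List (host, model, firmware) for adapters below the required revision."""
    outdated = []
    for host, adapters in sorted(inventory.items()):
        for model, firmware in adapters:
            if parse_version(firmware) < parse_version(required[model]):
                outdated.append((host, model, firmware))
    return outdated


# Hypothetical inventory: host name -> list of (HBA model, firmware revision)
inventory = {
    "db01": [("model-a", "3.90.1"), ("model-a", "3.82.0")],
    "web01": [("model-b", "1.10.0")],
}
required = {"model-a": "3.90.1", "model-b": "1.9.0"}

outdated = find_outdated(inventory, required)
print(outdated)
```

Comparing versions as tuples of integers rather than as strings avoids treating 1.10.0 as older than 1.9.0.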
The more pressure you're under to deliver low-cost, highly available storage, the more attention you should pay to your HBAs. By making full use of the information that HBAs gather about SANs, administrators can cut costs and improve reliability.
Mike Smith is executive vice president of worldwide marketing at Emulex (www.emulex.com) in Costa Mesa, CA.