Storage virtualization helps Purdue University manage storage complexity, facilitate change, and reduce costs.
By Dave Vellante and David Floyer
Wikibon was initially skeptical when EMC cited Purdue University as a key reference for its Invista 2.0 virtualization engine. Several Wikibon members indicated that a university environment was not the classic reference model for mission-critical business applications. While this is generally true, we found the environment at Purdue to represent the type of diversity and complexity that virtualization engines are designed to address. We feel the situation at Purdue represents a typical for-profit company in industries such as wholesale distribution and retail. Moreover, while Invista 2.0 is in the early production phase at the university, Purdue IT management is astute and in our view is applying several best practices in storage. This was encouraging, given the well-publicized difficulties EMC was having getting earlier versions of Invista adopted.
Purdue University has three IT support organizations: research, teaching/learning, and business applications.
IT support for research is decentralized, with only shared networks centralized. The IT support for teaching and business applications is centralized, and there is close alignment between IT and users. Business applications supported include typical university applications providing services to students and faculty.
Eighteen months ago, storage management was "out of control", according to Jon Miller, storage administrator at Purdue. Utilization was very low, and most of the SAN-based storage was not appropriately tiered. Moreover, the application staff was too heavily involved in the management of storage. There were 14 separate SAN switches and a lack of continuity in storage management, which led to a fragmented design and inefficient administration of resources.
The SAN fabric was of a mesh design, and switch ports were consumed in multiple interswitch links (ISLs). To bring this under control, the SAN fabric was redesigned into a core/edge topology, with two Brocade director-class switches implemented at the core. Appropriate tiering was implemented on mainly EMC DMX and Clariion arrays. File services were centralized on EMC Celerra NAS systems, and archiving was managed on EMC Centera platforms.
While this strategy helped to improve efficiency, the IT department still found it difficult to optimize the allocation and management of storage on the SAN. Once storage was allocated, it was extremely time-consuming to add capacity and move it. These tasks required significant planning and the help and cooperation of the platform administration groups to achieve any change.
The Purdue IT department decided that the next stage of storage infrastructure improvement was to virtualize the storage environment. This would enable Purdue to centralize storage management and to allow storage to be migrated seamlessly without impacting users. Importantly, it would also allow lower-cost storage arrays and storage management software to be deployed.
The Purdue IT department looked at virtualization offerings from EMC, Hitachi Data Systems, and IBM and chose EMC's Invista as the least disruptive and lowest-risk option. The Invista platform will eventually support all SAN arrays except those serving Exchange e-mail and backup. Invista is at an early stage of production deployment and will be fully deployed over the next six months.
The SAN consists of approximately 200TB of storage. Tier-1 storage is on two EMC DMX-3 arrays (36TB) that support the ERP and VISTA/WebCT applications in particular. The DMX-3 arrays are part of the project, and data from the DMX-2 arrays is being migrated using Invista. Other infrastructure includes multiple Clariion arrays (135TB total), one Sun array (25TB), and two Centera arrays (9TB). The servers attached to the SAN include about 30 Windows and 70 Unix systems, supporting native machines and about 200 VMware machines. The two Brocade director switches and 14 Brocade fabric switches support 640 ports. There is no storage chargeback system in place.
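As a quick check on the inventory above, the capacities quoted for each array family can be tallied to see how the "approximately 200TB" figure breaks down. The sketch below is illustrative only: the dictionary labels and the function name are our own, and it assumes the four quoted capacities account for the whole environment.

```python
# Capacities in TB as quoted in the case study. The labels and the
# grouping are illustrative assumptions, not Purdue's own inventory.
ARRAY_CAPACITY_TB = {
    "EMC DMX-3 (Tier 1, two arrays)": 36,
    "EMC Clariion (multiple arrays)": 135,
    "Sun (one array)": 25,
    "EMC Centera (two arrays)": 9,
}


def capacity_shares(capacities):
    """Return total capacity and each entry's fraction of that total."""
    total = sum(capacities.values())
    shares = {name: tb / total for name, tb in capacities.items()}
    return total, shares


total_tb, shares = capacity_shares(ARRAY_CAPACITY_TB)
print(f"Total quoted capacity: {total_tb} TB")
for name, share in shares.items():
    print(f"  {name}: {share:.0%}")
```

The quoted figures sum to 205TB, consistent with the approximate total, and the Clariion arrays already hold about two-thirds of that capacity.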
The applications supported by the SAN include e-mail, financial, educational applications, HR, housing and food services, physical facilities, student support systems, and other applications. The main database systems supported on the SAN are Oracle 9i and Oracle 10g. The main ERP system is SAP. Additional details can be found at www.itap.purdue.edu/business/.
Key storage challenges Purdue cited include the following:
- The cost of EMC DMX storage is significantly higher than that of EMC's modular arrays;
- Migration of storage from array to array takes planning and involvement of platform and applications groups to execute. It is very difficult to achieve and requires an extensive amount of time (months) to complete;
- Although there has been improvement since the SAN reorganization, there is still low storage utilization and a user perception of inflexibility in IT storage allocation; and
- SRDF on the Symmetrix arrays requires multiple copies of data and expensive software. It is overkill for many of the applications that use it, and many of the databases and systems now have lower-cost alternatives for remote replication.
After evaluating several products (see table), Purdue chose Invista for its storage virtualization strategy. The organization acquired two Invista appliances providing mutual backup for control and metadata regarding where the data is stored. All SAN block-based storage (with the exception of the Exchange e-mail system and the backup systems) is planned to be positioned behind Invista, including Tier-1 DMX-3 arrays.
Purdue is investigating upgrading the two Brocade core switches with a special blade to enhance performance if needed.
Most of the current storage management software from the DMX and Clariion arrays is being retained, allowing the least change to existing application and storage management procedures.
Business continuance procedures that use EMC's Symmetrix-based SRDF are being modified to use either native Oracle 9i or 10g database support for remote replication or Invista-based remote replication. (See table, below.)
Moving responsibility for storage allocation, performance, and reliability from the platform and applications groups to the IT storage group could take a considerable amount of time to implement and would require senior IT management support. As Purdue's Miller says, "It will take time for the concept of virtualized storage to take hold. Platform groups have been 'array-centric' and will not be comfortable with the idea of storage administration moving things around."
Purdue's storage group must establish a track record of efficiency, responsiveness, and competency so that platform, application, and user groups are confident that storage provisioned meets requirements. Similarly, the storage group must ensure user satisfaction with business continuance after any move from SRDF to Invista-based remote replication or other solutions.
Finally, if Purdue chooses to add blades in the Brocade switches, further testing and integration will be required. Notably, this upgrade will require Purdue to migrate from Invista 2.0 to Invista 2.1, and it is unclear how EMC plans to support migration non-disruptively.
A major benefit of the project is that Purdue plans to reduce expenditures by using more lower-cost Clariion Tier-2 arrays. Purdue also plans to migrate most open systems usage of SRDF to the remote replication capabilities of Invista and to cut both software licensing and maintenance costs and the amount of storage required to support business continuance.
The project has also begun to reduce dramatically the number of people who are aware of changes to storage, especially where migrations are involved. Thus far, Purdue has seen advantages in terms of the storage group's being more responsive and less of an obstacle.
In addition, Purdue plans to achieve much higher utilization of its SAN-connected storage assets at a significantly reduced cost and use this success to reduce the amount of storage consumed. Purdue also hopes to avoid increased staff to manage platforms and storage as a result of the project.
Based on this project, Purdue advises the following to peer organizations:
- Determine the best solution for your business and then fit it economically;
- Be certain the environment is ready for storage virtualization; watch out for "weak links" in the infrastructure because you're putting all eggs in a virtualization basket. This includes having a SAN that is robust enough as well as adequate switch port capacity. Also, ensure firmware and patches are up to date;
- Develop a migration strategy and get buy-in from administration and clients; and
- Get the best price.
Wikibon draws the following conclusions from this case study:
While Purdue was happy with the performance and reliability of Tier-1 Symmetrix arrays, it wanted to use lower-cost Clariion arrays where possible. The university has confidence in EMC as its supplier and wanted to leverage existing relationships. EMC created a financially attractive offering by packaging additional Symmetrix boxes with Invista together with deep educational discounts. EMC successfully sold the potential scalability of Invista using the split-path architecture as a competitive differentiator.
Because Purdue's applications do not all require SRDF resilience, recovery can now be satisfied with less expensive methods. The organization is reducing its software costs by using remote replication with Invista, for example.
Virtualization is now ready for general adoption, with EMC joining other established players and start-ups to create significant market momentum. One of the biggest benefits of virtualization for Purdue will be the ability to non-disruptively migrate storage with no application downtime and no involvement of platform administrators.
However, Wikibon believes that there are three weak points in the strategy:
- Purdue could make significant additional savings by reducing the number of separate but similar storage management capabilities. This could be done by tightly defining storage tier functionality and rationalizing overlapping functionality (e.g., copy services on Symmetrix, Clariion, and Invista);
- Purdue does not have a strong dictate in place to stop users and application administrators from using higher-performance/function storage rather than using the storage allocated by the storage group. It may take a long time after virtualization is implemented for decisions on allocation of storage tiers to be based on actual user requirements rather than user preference. Senior IT management and Purdue University user management will need to be proactive in driving change; and
- The additional complexity of a fourth component (SAN switch blades) to coordinate firmware changes (along with SAN switch, servers, and storage arrays) is likely to reduce the choice and timeliness of storage, SAN, and server upgrade options.
On balance, however, Wikibon concludes Purdue University made a sound business decision in electing to implement Invista from its established supplier. All the proposed vendor solutions were viable, but the EMC solution carried the lowest risk because it required the least change to existing processes and procedures and because Purdue is familiar and comfortable with how EMC operates. Purdue is implementing a strategy that, if well-executed, will put in place a significantly more cost-effective storage infrastructure. It should improve storage utilization and flexibility and should enable storage to be handled more efficiently by storage administrators, rather than by system administrators.
Dave Vellante and David Floyer are co-founders of The Wikibon Project, an open community of practitioners, consultants, and researchers dedicated to improving technology adoption. The authors can be contacted at firstname.lastname@example.org or email@example.com.