By Sean Derrington
IT organizations (ITOs) increasingly view storage as a strategic differentiator for existing and new applications. Many ITOs recognize the critical role information plays in the business, and leveraging that information requires an unassailable storage infrastructure foundation.
We believe forward-looking ITOs have decoupled storage and server selection, and storage solutions are now viewed as much more than an afterthought for new applications and solutions. For most organizations, we believe this is a best practice, and the trend will continue as the storage market matures in technical capabilities and more IT professionals become well versed in the arts and values of storage. Even though current capital expenditures are under extreme scrutiny, leading ITOs are acquiring storage solutions on a tactical basis--yet they are planning and selecting vendors on a strategic basis.
Storage is moving to a services model, comparable to that of IP networking and (future) application infrastructure. Storage services will be tiered in function, price, and complexity, and delivered to applications and lines of business. For ITOs to deliver on this direction, storage (all aspects of it--disk, tape, disaster recovery, capacity planning, etc.) must be perceived and treated as an infrastructure component. Infrastructure components are common and reusable across as many consumers (e.g., applications and servers) as possible; storage should be no different. Consequently, this requires a storage infrastructure to be an initial consideration (not an afterthought) to drive business priorities. Moreover, infrastructure design and planning teams must have seats at the business project table and translate technology into business value.
Typically, IT architects create architectural principles and guidelines addressing three- to five-year planning cycles, while infrastructure teams design and plan component technologies (vendor- and product-specific), targeting current through 36-month requirements. The challenge going forward will be how ITOs handle the transition from storage infrastructure planning and design over to storage operations. This is a critical success factor for IT in general, but especially for storage management, because capabilities are rapidly maturing and vendors are rapidly innovating.
We believe that, through 2004, storage infrastructure and storage operations may be organizationally the same group--but, longer term (2005/2006), we believe these responsibilities will diverge and become distinct groups, mirroring current systems, network, and application infrastructure and operations teams. The objective of infrastructure planning is "to determine the scope, scale, and design of infrastructure necessary to provide application service levels required by the business in the short, medium, and long term." Furthermore, "the primary design goal for information systems must be to enable rapid change in business processes and in the applications and technical infrastructure that enable them!"
As we will discuss in greater detail in this article, the strategic storage plan contains numerous key priorities that will reduce storage infrastructure management cost and complexity. Implementing automated networked storage (storage area network [SAN] or network-attached storage [NAS])--which rationalizes (reduces the number of) storage management applications and optimally integrates these functions--will require leveraging tiered storage software, hardware, and professional services, as well as measuring (both internally and for the business) what constitutes a successful storage infrastructure. Automated networked storage will give businesses agility, dramatically increase productivity, and enable ITOs to manage terabytes and petabytes successfully and cost effectively.
Storage and application service-level agreements (SLAs)
Because all applications and servers are not created equal (as determined by business requirements), prioritization for storage services (from premium enterprise-class to internal workgroup storage) must be evaluated. By 2004/2005, more than 60% of the servers in the data center will be connected to networked storage, and we expect more than 70% of storage capacity to be networked in this time frame.
Although there can be modest increases in procurement costs to network storage, the resulting benefits in management productivity, increased utilization, agility, infrastructure and operations personnel savings, and scalability are substantial.
Servers sharing external storage resources can also aid in holistic capacity planning and increase overall capacity utilization (i.e., GB/TB, Fibre Channel [FC] fabrics, external subsystems--impacting data-center floor space, etc.). For applications that are more volatile, storage can be reassigned to the servers as the capacity threshold (60% to 80% utilization) is reached. This is an area where enterprise storage procurement and capacity planning are important (compared to internal storage). We generally recommend a six- to nine-month buying cycle for enterprise storage and a competitive storage environment. Overall, ITOs can plan on an approximate 35% per year decrease in hardware prices (8% to 10% per quarter) and, given the state of the current global economy, should proceed cautiously on long-term acquisitions (12+ months).
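The quarterly and annual price-decline figures above are related by simple compounding; a quick sketch (the 8% to 10% quarterly rates are the article's planning assumptions, not vendor quotes):

```python
def annual_decline(quarterly_decline: float) -> float:
    """Compound a quarterly price decline into the equivalent annual decline."""
    return 1 - (1 - quarterly_decline) ** 4

# 8% to 10% per quarter compounds to roughly 28% to 34% per year,
# consistent with the ~35% annual planning figure cited above.
for q in (0.08, 0.10):
    print(f"{q:.0%}/quarter -> {annual_decline(q):.1%}/year")
```

This is why a six- to nine-month buying cycle matters: capacity bought 12+ months ahead of need is paid for at today's prices but consumed at tomorrow's.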
Enterprise storage management will likely surface as the most significant contributor to ITO operational savings, because human resources remain the highest data-center cost. It will become increasingly important as more servers participate in networked storage (the disk and tape SAN). ITOs must continue to decouple storage management from server operating systems. Many operational economies of scale, and much flexibility, will be gained by employing such an approach (e.g., capacity managed per administrator). Certainly, storage management will still need to interface with the other operational disciplines (e.g., systems management, network management, and database administration), but a center of storage management excellence is the design objective.
The goal is to streamline and automate many of the daily tasks that consume staff resources. For example, during the past 24 months, FC fabric management has moved to a core function under the purview of many enterprise storage vendors. We believe additional tasks, such as storage provisioning/allocation (reducing the number of individuals involved and the time required from hours or days to minutes), resource management (which users/applications are using particular capacities), and topology mapping (which resources--physical or logical--are allocated to applications), will be among the first tasks automated within multi-vendor storage administration.
The key is to examine how and which operational tasks can be automated and leveraged (across server operating systems and multi-vendor storage hardware), enabling storage capacity to increase while storage infrastructure and storage operations staff remain constant--or, hopefully, decrease over time.
For automation efficiencies, storage operations should identify solutions that are seamlessly integrated--that is, a single application that can incorporate various functions (possibly multi-vendor) and not a loosely coupled launching pad (or iconic launch) for disassociated islands of automation. Solutions leveraging a central repository (persistent database) serving as the "storage management data warehouse" would enable ITO efficiencies beyond those of the individual "management components." Moreover, harnessing this repository and "feeding" information (analogous to the real-time data warehousing feedback loop) to other management elements or ISV applications (e.g., for chargeback/billing) will be of significant value. Without this next level of integration/consolidation, users will duplicate efforts, fail to scale their environments (e.g., in performance and capacity), and fail to control storage staff costs.
We believe storage provisioning/allocation will be one of the initial fully automated areas, yet this will require organizational (at minimum, functional) changes, because currently this task typically transcends individual and possibly departmental responsibilities (e.g., server, database, and storage). Simplifying storage allocation (which requires a multitude of individual tasks) is not a simple problem to solve. It typically involves the following:
--Assigning/configuring the appropriate LUNs from the physical storage subsystems;
--Identifying and allocating the number and location of host interconnects on the array;
--Masking the LUNs to the particular server(s);
--Zoning the FC switched fabric;
--Incorporating and modifying any necessary FC security policies (if available);
--Ensuring the volume manager/file system and the application and/or database on the host(s) are updated, and incorporating the desired changes; and
--Ensuring other storage management applications (e.g., backup/recovery, remote replication services) comprehend the changes and the implications of such changes in storage policies.
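The steps above can be sketched as a single automated workflow. The following is a minimal, illustrative sketch only--the class and method names are hypothetical, not any vendor's API--showing how LUN assignment, masking, and zoning collapse into one operation once automated:

```python
# Hypothetical provisioning workflow; all names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class ProvisioningRequest:
    server: str        # host to receive the storage
    capacity_gb: int   # requested capacity
    array: str         # physical subsystem supplying the LUN

@dataclass
class StorageFabric:
    zones: dict = field(default_factory=dict)      # server -> arrays visible to it
    lun_masks: dict = field(default_factory=dict)  # lun_id -> servers allowed access

    def provision(self, req: ProvisioningRequest) -> str:
        # 1. Assign/configure the LUN on the subsystem (simulated here).
        lun_id = f"{req.array}-lun-{req.capacity_gb}gb-{req.server}"
        # 2. Mask the LUN so only the requesting server can access it.
        self.lun_masks[lun_id] = {req.server}
        # 3. Zone the FC fabric so the server sees the array's ports.
        self.zones.setdefault(req.server, set()).add(req.array)
        # 4. Host-side steps (volume manager/file system updates, backup and
        #    replication policy changes) would follow here in a real workflow.
        return lun_id
```

The design point is that one request drives every step in order, which is what reduces provisioning from a multi-person, multi-day task to minutes.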
Center-of-excellence (COE) best practices dictate investigating the benefits/returns of consolidation and integration tasks, the most significant of which are those that have the greatest, most immediate impact on the costs that govern operations. Those gains will be seen as an annuity to the business, versus one-time savings. This annuity could be significant not just in dollars, but also in business agility and infrastructure flexibility.
Application/DBMS recoverability is one of the most compelling reasons for SAN participation. Based on the mean time to recovery (the time it takes the application to begin accepting transactions again--not simply restoring information from tape to server) discussed with the business, numerous technology options can assist in meeting those requirements. Snapshot or volume-based replication (either local or remote) can dramatically reduce the recovery times to minutes, preventing the need for a tape restore (but not completely eliminating the need for tape backup/recovery).
Although restoration is a significant and necessary capability, the "state" of the information/data can be much more important than the existence of the information. Recovering data in an inconsistent state--or data from one database that reflects a point in time that is significantly different from another reliant database--is often useless. Consistently managing replication and recovery options (possibly from multiple vendors) will prove to be operationally beneficial, as will rationalizing backup/recovery software and hardware across as many platforms as possible (often excluding the mainframe). Consistently managing various storage vendors' snapshot/volume replication can provide further operational savings, striving for policy-based automation. Moreover, just as all applications are not created equal, storage replication software is no different. Integration with operating systems, clustering software, applications, and databases, as well as quiescent time (time required for a consistent view of the application), will often differentiate offerings, given that replication software has become much more prominent during the past 36 months.
Data and media center of excellence
There are three high-level layers of the data and media COE: conceptual (policy), physical (infrastructure), and operations (operational). Each of these three is a superset comprising multiple functional and subfunctional elements. The functions within each category and associated job titles (suggested) are detailed as follows and serve as a guideline for functional and organizational structure.
It is also important to note that, although many of these functional categories exist in the mainframe environment, simply having mainframe personnel responsible for all systems is not necessarily appropriate. There must be a balance of people, processes, and technology to create an effective COE, and the technologies are often quite different between mainframe and non-mainframe environments. Nonetheless, COEs should be viewed holistically and include the following:
Policies offer the highest-level view of storage management, dictate the required infrastructure to meet policy mandates, and should be governed by a storage policy director (SPD). These policies, which are interdependent and should be viewed holistically, include the following:
--Service portfolios and service-level agreements--SPDs should determine what service packages they will offer to business units, as well as the associated cost/service tradeoffs, based on the needs of the business units.
--Data protection strategies--Not all data is created equal, and the data protection mechanisms should be aligned with the service portfolio.
--Vendor selection strategies--ITOs should determine, as a matter of policy, whether to pursue a best-of-breed multi-vendor strategy (albeit rationalized and not a vendor "du jour" strategy), or a "one-throat-to-choke" single-vendor strategy. This will vary from one category to another, but should be done with an eye toward strategic objectives, which are not always mutually exclusive. Either way, rigorous interoperability mandates should be a prerequisite for consideration to enable maximum deployment flexibility.
After the overall storage policies have been determined, a storage infrastructure architect should be responsible for designing and implementing the components needed to achieve the service goals. This will include device and network management (at the functional level) as well as a multitude of subfunctions, including the following:
--Physical topology design--In most organizations, it will be necessary to use and integrate SAN, NAS, and direct-attached storage, whether for optimum performance or for security and data protection requirements. Vendors' ability not only to function/co-exist in, but also to support (via customer and professional services expertise), these advanced heterogeneous environments (multi-protocol, multi-vendor, multi-function, cross-organizational, and cross-regional) will determine the most appropriate hardware and software platforms on which to base the infrastructure, preserving the flexibility to adapt as sub-elements of technology evolve over time.
--Performance modeling and monitoring--The storage infrastructure architect should be responsible not only for designing sufficient resources, but also for ensuring service levels are met. This includes storage components (e.g., disk arrays) as well as storage networks (e.g., FC fabrics) and working with IP networking personnel, in the case of NAS.
--Security--Security is an under-addressed area in storage management (due mostly to the current lack of capability), but should be made the responsibility of the storage infrastructure architect working in conjunction with the enterprise security architect.
Other subfunctional tasks may include physical subsystem and tape library design, fault and performance management, virtualization, SAN management, and enterprise planning/design/procurement.
Operational procedures will not vary greatly from current practices, but should be managed by dedicated storage administrators. Functionally, traditional systems management should remain within the systems management group, and the key is to begin separating the storage operations from systems, database, and application management. To what degree will certainly be dependent on each organization, but the three main functional disciplines are
--Performance management--This consists of performance modeling and management (including FC-based networks), asset management, chargeback, and billing by application or business unit.
--Resource management--This involves resource allocation or provisioning--the actual assignment of storage (e.g., LUNs, data zones, and volumes) according to architectural/application requirements, as well as quota management, usage management (e.g., by application or individual user), metering, and capacity planning.
--Availability management--This deals with backup/recovery operations. Backup/recovery should be included with storage management, not systems management. This will include backup/recovery scheduling, media management, and problem resolution. High-availability server clustering management (in conjunction with systems management), hierarchical storage management (where applicable), and replication (e.g., remote replication for business continuity) should also be considered.
Just as handoffs must be balanced among enterprise architects, infrastructure, operations, and lines of business, so must they be within the data and media COE. As can be seen, some functions and subfunctions will transcend the policy, infrastructure, and operations categories. Communication among groups is a key element of a storage infrastructure that is successful, adaptable to change, and measurable.
ITOs should undertake the following initiatives:
--Rationalizing storage hardware and storage software--This should encompass all aspects of storage across the server and application portfolio. Certainly, storage life cycles must be considered, yet a rationalization (reducing variations--both vendor and configuration) strategy will provide significant strategic value, even in tactical times.
--Creating a storage infrastructure--ITOs should begin networking storage resources (SAN, NAS, and backup/recovery architectures), leverage tiered storage offerings (i.e., internal storage, midrange, and enterprise) and functional software (i.e., replication, server cluster integration, and backup/recovery), and look to common components (i.e., FC switches and host bus adapters) where possible. They should also seek new elements that adhere to intensive interoperability standards and procedures to ensure maximum configuration flexibility.
--Optimizing storage operations--This includes rationalizing and consolidating management tools and personnel responsibilities; consistent storage management of multi-vendor environments is beginning to emerge and is a significant strategic directive.
--Creating a data and media center of excellence--ITOs should employ a center of excellence using the guidelines outlined in the previous section.
Currently, but more so during 2003 to 2006 (as ITOs adopt a storage service delivery model resembling that of traditional outsourcers), accurately measuring storage services and metrics will be a critical success factor to ITOs not only gaining credibility, but also justifying (and making an informed decision about) whether a storage and storage management sourcing strategy should be employed. ITOs can use three categories when determining measurable metrics. The initial set comprises alternatives to the traditional ROI/total cost of ownership senior management mandate and is primarily an internal benchmark of (current and future) capabilities that quantify how the strategy is providing value.
The second and third sets (technical and operational) are more strategic because some of these measurements are not possible with some storage vendors. However, ITOs should keep the measurements simple, limiting the metrics to those that are the most important to the business. Often, using network-based SLAs (e.g., Web hosting) as a framework and mapping them to storage services provides a good starting point. The three categories available to ITOs are:
--Internal benchmarking--This includes storage managed per administrator, percentage of storage utilization, days storage in inventory, data life-cycle multiplier, data availability, and mean time to recovery.
--Technical measures--These include storage availability, storage latency, data migration/exit clause, diagnostic, capacity utilization, performance, resource utilization, mean time and maximum time to recover/resolve.
--Operational measures--These include maximum time to notify, moves/adds/changes, project-specific SLAs, and vendor responsiveness.
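A few of the internal benchmarks above reduce to simple ratios that can be tracked quarter over quarter. A minimal sketch, with illustrative field names and sample figures that are assumptions rather than industry data:

```python
# Sketch of two "internal benchmarking" metrics; schema and numbers are
# illustrative assumptions, not a standard or measured data.
from dataclasses import dataclass

@dataclass
class StorageEnvironment:
    raw_tb: float        # total raw capacity deployed
    allocated_tb: float  # capacity allocated to applications
    administrators: int  # storage staff managing it

    def tb_per_admin(self) -> float:
        """Storage managed per administrator."""
        return self.raw_tb / self.administrators

    def utilization(self) -> float:
        """Percentage of deployed capacity actually allocated."""
        return self.allocated_tb / self.raw_tb

env = StorageEnvironment(raw_tb=120.0, allocated_tb=84.0, administrators=4)
print(f"{env.tb_per_admin():.0f} TB/admin, {env.utilization():.0%} utilized")
```

Tracking these over time is what shows whether capacity can grow while the storage staff stays constant, as argued earlier in the article.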
ITOs are increasingly realizing the importance of storage, particularly among mid-sized companies that have historically not had requirements for enterprise storage capabilities. Through 2005/2006, most organizations will have organized around and created a storage infrastructure and operations team (the data and media center of excellence). Consequently, it is imperative that ITOs begin to view storage with regard to the strategic importance it will have--even in difficult economic environments that often, and many times incorrectly, result in solely tactical decisions. ITOs should undertake a strategic evaluation, examining how daily functions can be leveraged across (and automated by) multi-vendor capabilities, and how tasks currently consuming significant resources can be dramatically reduced (e.g., recovery, provisioning, and procurement).
Moreover, as storage technologies continue to mature and more servers participate in networked storage (SAN and NAS), ITOs will be forced to measure delivered capabilities as storage becomes a service delivered to the business in business terms (e.g., performance, availability, and flexibility to change), with the cost tradeoffs of the selected tier of service clearly understood.
Sean Derrington is senior program director at the META Group (www.metagroup.com) in Stamford, CT.