Applications-centric path to managing storage

Applications-managed storage ensures that applications requiring real-time management attention get it, automatically.

By John Webster

A popular practice of storage vendors is to use statistical data to hype the implementation of an architectural model such as a storage area network (SAN) or the acquisition of a particular product such as a storage management application. The plot line runs something like this: Prior to implementation, enterprise X could manage only 2TB of data with its existing staff. After implementation, enterprise X could manage five times that amount without having to increase its management headcount.

Before-and-after graphics are commonly used to show how many more terabytes a single administrator can manage if a storage network were to replace a direct-attached storage (DAS) environment or if the latest and greatest in storage management software were to be installed. It's a story that fits the economic climate well and, more often than not, it's one that is supported by data generated by the analyst firms.

Here's the problem: Using capacity as a metric for management efficiency (that is, measuring efficiency by the number of terabytes a single administrator can manage) is at best misleading. At worst, it can lead IT administrators to believe their organizations can effectively manage more data than is humanly possible with their existing staff.

We know that not all applications demand additional capacity at the same rate. Some are stable and predictable in terms of their capacity requirements, while others can mushroom out of control if not throttled. Still others can be cyclical in their capacity usage patterns. Consider, for example, an application that tracks inventory for a retailer whose peak season begins in October and ends after Christmas.

Another commonly used metric is dollars. If you've moved around in storage circles for more than a few months, you've undoubtedly heard about a study that concludes that managing storage is x times the cost of the hardware itself. Depending on the author of the study, the x factor varies from 3.5 to 10. What is an IT administrator to do with these data points?

Users don't manage storage in a vacuum. They manage applications. Managing the storage associated with those applications is only part of the job, albeit an increasingly time-consuming and stressful part. They know from experience that some applications are lean and clean, while others are management hogs. And that's really the point. Users are helped most when they can:

  • Identify the hogs, and
  • Build a storage infrastructure that eases the burden of managing the hogs.
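One way to picture "identifying the hogs" is to rank applications by management effort per terabyte rather than by raw capacity. The sketch below is only an illustration: the `AppLoad` record, the `rank_hogs` helper, and every figure in it are hypothetical and do not come from any study or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppLoad:
    name: str
    capacity_tb: float        # storage the application currently uses
    admin_hours_month: float  # hands-on storage management time per month


def rank_hogs(apps):
    """Sort applications by admin hours per terabyte, heaviest first."""
    return sorted(
        apps,
        key=lambda a: a.admin_hours_month / a.capacity_tb,
        reverse=True,
    )


# Hypothetical figures, for illustration only.
apps = [
    AppLoad("inventory", capacity_tb=1.0, admin_hours_month=30.0),
    AppLoad("archive", capacity_tb=6.0, admin_hours_month=6.0),
    AppLoad("email", capacity_tb=3.0, admin_hours_month=24.0),
]

for app in rank_hogs(apps):
    ratio = app.admin_hours_month / app.capacity_tb
    print(f"{app.name}: {ratio:.1f} admin hours per TB")
```

In this made-up data set the smallest application is the biggest hog, which is exactly what a terabytes-per-administrator metric would hide.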

The dual goals of applications-managed storage are, first, to identify applications requiring real-time management attention and, second, to automate real-time responses to the needs of those applications in ways that IT administrators have pre-defined.

Applications-managed storage: What is it?

Applications-managed storage is the process of linking the requirements of the application directly to the storage environment, driving the needs of the application down through the storage stack, and then automating a response to those needs (see figure). Applications-managed storage is hardware and automated management software working together in the context of a user application to ease the storage management burden. It asks the application to be storage-aware and, by the same token, asks storage to be application-aware.

The concept of applications-managed storage may sound a bit absurd if you're used to thinking of storage as sitting on the bottom of the processing totem pole, with the application on top and everything else in between. Yet, users inherently know that not all applications are created equal. The perceived differences ripple down through the processing stack from the application to the storage environment such that users now typically define different classes of storage for different applications. Mission-critical applications get the mirrored, fault-resilient, high-performance, and consequently high-priced storage. Less mission-critical applications typically get less, at least in terms of value-added storage functionality.

This process has a shiny new buzzword. It's called policy management.

In truth, many administrators are already managing storage according to policies dictated by the requirements of their applications. One user, for example, may define four classes of storage. Policies related directly to applications—from mission-critical to auxiliary—are defined for each class and are used to provision the entire I/O path from the host down through the storage network cloud, to the array, and finally, to the drives within the array. The difference now is that, with the advent of intelligent storage fabrics, this process can be automated.
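A class-of-storage scheme like the one just described can be thought of as a lookup table from application to policy. The following is a minimal sketch under stated assumptions: the four class names, the `StoragePolicy` fields, and the application names are all hypothetical, chosen only to show the shape of such a table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    mirrored: bool        # keep a mirrored copy of the volume
    redundant_paths: int  # independent I/O paths from host to array
    drive_tier: str       # hypothetical label for the class of drive


# Four hypothetical storage classes, from mission-critical to auxiliary.
POLICIES = {
    "mission-critical": StoragePolicy(True, 2, "high-performance"),
    "business-critical": StoragePolicy(True, 2, "standard"),
    "departmental": StoragePolicy(False, 1, "standard"),
    "auxiliary": StoragePolicy(False, 1, "low-cost"),
}

# Hypothetical assignment of applications to classes.
APP_CLASS = {"order-entry": "mission-critical", "test-lab": "auxiliary"}


def policy_for(app_name):
    """Return the policy for an application; unknown apps get the lowest class."""
    return POLICIES[APP_CLASS.get(app_name, "auxiliary")]


print(policy_for("order-entry"))
```

Defaulting unknown applications to the lowest class is a design choice made for this sketch; an administrator could just as reasonably require every application to be classified explicitly.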

Today, a variety of storage management software products are both application-aware and able to automate the delivery of storage resources (e.g., disk capacity and I/O paths) to applications that need them, as they need them. Examples include AppIQ's AppIQ, ProvisionSoft's DynamicIT, BMC's PATROL Storage Automation, and IBM/Tivoli's TSRM for Database.

Picture the cockpit of an advanced jetliner. The jetliner flies on autopilot, while instrumentation gives the pilot a way to monitor and manually control, when necessary, all systems responsible for flying the plane.

Similarly, the management consoles of these products show applications and storage side by side, so administrators can match requirements with available resources. Administrators can also monitor applications, storage, and processing environments at the same time, either directly or through Common Information Model (CIM) interfaces to more comprehensive enterprise management products or frameworks.

Applications-aware storage management suites generally perform the following processes to reach their goals:

  1. Communicate with an entity directly associated with a user application, such as a file system or volume manager, to extract metadata. This process, in essence, uses metadata to "discover" the ongoing requirements of the application in real time;
  2. Use network-based auto-discovery processes to map I/O paths from server to storage, including any and all associated physical entities such as host bus adapter (HBA) and switch ports, and processes such as fail-over or load-balancing mechanisms;
  3. Associate the results of the metadata discovery process (1) with those of the network discovery process (2) such that an entire I/O path can be identified as belonging to a particular application;
  4. Automate the management of the I/O path based on the real-time interpretation of user application metadata and storage administrator-defined policies;
  5. Generate data based on the application-to-I/O-path associations that can be used to charge back departmental groups for storage resource usage on a per-application or per-server basis; and
  6. Integrate views of the server environment with those of the storage environment so users can perform "root cause" analysis of performance problems from a single console.
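The association and charge-back steps in the list above (3 and 5) can be sketched as a simple join between two discovery results. Everything here is an assumption made for illustration: the record layouts, the names, and the rate are hypothetical, and a real suite would obtain this data from a volume manager and from SAN discovery rather than from literals.

```python
from collections import defaultdict

# Assumed result of step 1: metadata extracted from a volume manager.
volume_metadata = [
    {"volume": "vol01", "application": "inventory", "used_gb": 400},
    {"volume": "vol02", "application": "email", "used_gb": 250},
    {"volume": "vol03", "application": "inventory", "used_gb": 100},
]

# Assumed result of step 2: the discovered I/O path for each volume.
io_paths = {
    "vol01": ["srv-a:hba0", "switch1:port4", "array1:lun7"],
    "vol02": ["srv-b:hba0", "switch1:port9", "array1:lun2"],
    "vol03": ["srv-a:hba1", "switch2:port4", "array2:lun5"],
}


def paths_by_application(metadata, paths):
    """Step 3: tag each discovered I/O path with the application that owns it."""
    result = defaultdict(list)
    for record in metadata:
        result[record["application"]].append(paths[record["volume"]])
    return dict(result)


def chargeback(metadata, rate_per_gb):
    """Step 5: total the storage charge per application."""
    totals = defaultdict(float)
    for record in metadata:
        totals[record["application"]] += record["used_gb"] * rate_per_gb
    return dict(totals)


print(paths_by_application(volume_metadata, io_paths))
print(chargeback(volume_metadata, rate_per_gb=0.5))
```

The point of the join is that once a volume is tied to both an application and a path, every later step (policy automation, charge-back, root-cause views) can be keyed on the application rather than on the device.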

Two examples of recently introduced applications-aware storage management suites will help to illustrate how applications-managed storage works (see "Two vendor examples").

The CIM/WBEM way

Applications-managed storage based on open interfaces, rather than vendor-specific APIs, is inching closer to reality. The Storage Networking Industry Association (SNIA) is making great strides in its efforts to incorporate the Distributed Management Task Force's (DMTF) CIM into a storage management framework, and it has chosen the DMTF Web-Based Enterprise Management (WBEM) as a commonly accepted and understood interface for managing the storage domain.

Together, CIM and WBEM provide a single door for applications-based storage management development. The question is, will applications developers walk through that door?

Applied to storage management, CIM/WBEM makes it possible for application developers to look down into the storage domain and see an openly published set of management interfaces, which can then be used as hooks into the storage fabric. CIM/WBEM presents control points that applications can use to monitor the functioning of the storage fabric and then automate responses to certain conditions.

For application developers, the value proposition of CIM/WBEM is the ability to differentiate products on the basis of manageability. CIM/WBEM is robust enough to both monitor and control, something that is difficult to achieve in practice with SNMP MIBs. It is this two-way monitor-and-control capability that opens the storage domain to a wide range of management possibilities.
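The monitor-then-control pattern can be sketched in a few lines. Note what this is not: it does not use real CIM class names or a real WBEM client. `FakeFabric` is a hypothetical in-memory stand-in for whatever management interface a fabric exposes, and the port names and status values are invented.

```python
class FakeFabric:
    """Hypothetical in-memory stand-in for a fabric management interface."""

    def __init__(self):
        self._ports = {
            "port1": {"status": "ok", "enabled": True},
            "port2": {"status": "degraded", "enabled": True},
        }

    def ports(self):
        return list(self._ports)

    def read_status(self, port):
        # Monitoring side: read a property of a managed object.
        return self._ports[port]["status"]

    def is_enabled(self, port):
        return self._ports[port]["enabled"]

    def set_enabled(self, port, enabled):
        # Control side: change the state of a managed object.
        self._ports[port]["enabled"] = enabled


def respond_to_degraded_ports(fabric):
    """Disable every port reporting 'degraded'; return the ports disabled."""
    disabled = []
    for port in fabric.ports():
        if fabric.read_status(port) == "degraded":
            fabric.set_enabled(port, False)
            disabled.append(port)
    return disabled


fabric = FakeFabric()
print(respond_to_degraded_ports(fabric))
```

The response function depends only on the four methods of the stand-in, which is the appeal of a single published interface: the same logic could run against any fabric that exposes equivalent operations.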

For application developers, CIM-enabled storage greatly simplifies the development task by enabling them to write to one set of open interfaces rather than to many different ones.

CIM came to life as a means to facilitate the management of computing networks in general—not just storage networks. By the same token, many storage management applications are emanating from more-comprehensive enterprise management suites, once called frameworks.

Users will have a choice between stand-alone storage management solutions and integrated enterprise management platforms that cover all computing resources (storage, network, and processor) from a single console.

John Webster is senior analyst and founder of the Data Mobility Group, a storage market research and analysis firm, in Nashua, NH.

Two vendor examples

ProvisionSoft's DynamicIT

DynamicIT monitors I/O capacity at the volume level via Veritas Volume Manager and responds to events affecting performance and capacity requirements. DynamicIT communicates through the volume manager interfaces (such as CLIs and APIs) to extract metadata pertaining to certain events associated with each volume. The metadata is used to make decisions about application requirements.

With DynamicIT, there is no communication with the application itself. Rather, DynamicIT assumes that volumes are directly associated with specific applications. However, when a volume is associated with more than one application and one of them is critical, the policies for provisioning storage to the critical application take precedence over those of less-critical applications that may also want access to that volume.
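The precedence rule for shared volumes amounts to picking the most critical application attached to a volume and letting its policy govern. This is a sketch of that rule only, not of DynamicIT's actual logic; the tier names and their ordering are assumptions.

```python
# Hypothetical criticality tiers; a lower number means higher priority.
CRITICALITY = {"critical": 0, "standard": 1, "low": 2}


def governing_application(volume_apps):
    """Return the application whose policy governs a shared volume."""
    return min(volume_apps, key=lambda app: CRITICALITY[app["tier"]])


# Hypothetical applications sharing one volume.
shared_volume_apps = [
    {"name": "reporting", "tier": "low"},
    {"name": "order-entry", "tier": "critical"},
    {"name": "intranet", "tier": "standard"},
]

print(governing_application(shared_volume_apps)["name"])  # order-entry
```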

To understand how DynamicIT gets its application awareness to automate provisioning policies, consider the dynamic addition of storage capacity in a storage area network (SAN) environment. When an application adds enough data to a volume to trip a pre-set, user-defined threshold that warns of an impending out-of-storage-capacity condition, DynamicIT automatically goes out into the storage environment and brings the needed capacity online without operator intervention.
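The trigger itself is a simple comparison of used capacity against a user-defined threshold. A minimal sketch, assuming the threshold is expressed as a fraction of volume size (the 85% default and the volume figures are hypothetical):

```python
def volumes_needing_capacity(volumes, threshold=0.85):
    """Return names of volumes whose usage has reached the threshold.

    `volumes` maps a volume name to a (used_gb, size_gb) pair.
    """
    return [
        name
        for name, (used_gb, size_gb) in volumes.items()
        if used_gb / size_gb >= threshold
    ]


# Hypothetical usage figures.
volumes = {"vol01": (450, 500), "vol02": (120, 500)}

print(volumes_needing_capacity(volumes))  # ['vol01']
```

Detecting the condition is the easy part; the work lies in what has to happen once the threshold trips.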

On the surface, this would appear to be a fairly simple, straightforward operation. However, automated capacity provisioning in a SAN is exceedingly complex, requiring the management application to acquire knowledge of a SAN's physical topology and logical constructs (e.g., volumes and LUNs).

First, the management application has to know the associations of volumes to particular application servers and storage devices, and all of the related interconnections in between (including host bus adapters [HBAs] and switch ports). Next, it must find additional unused disk capacity somewhere in the SAN, mount this capacity to the application through the volume manager, and inform all other supporting functions (such as cluster fail-over mechanisms, load balancers, and LUN masking agents) that additional capacity is now online.
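Reduced to its skeleton, that sequence is: find unused capacity, attach it to the volume, and tell every dependent function. The sketch below models only that skeleton with an in-memory `San` record. The data layout, the smallest-fit selection rule, and the notification list are assumptions for illustration, not a description of any vendor's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class San:
    free_luns: dict  # LUN name -> size in GB, not yet assigned
    volumes: dict    # volume name -> list of LUN names backing it
    notifications: list = field(default_factory=list)


# Supporting functions that must learn about new capacity.
SUPPORTING_FUNCTIONS = ("cluster fail-over", "load balancer", "LUN masking")


def add_capacity(san, volume, needed_gb):
    """Attach the smallest free LUN that fits, then notify dependents."""
    candidates = [
        (size, lun) for lun, size in san.free_luns.items() if size >= needed_gb
    ]
    if not candidates:
        raise RuntimeError("no unused LUN is large enough")
    _, lun = min(candidates)          # smallest LUN that satisfies the request
    del san.free_luns[lun]            # the LUN is no longer unassigned
    san.volumes[volume].append(lun)   # grow the volume
    for function in SUPPORTING_FUNCTIONS:
        san.notifications.append((function, volume, lun))
    return lun


san = San(free_luns={"lun8": 200, "lun9": 100}, volumes={"vol01": ["lun7"]})
print(add_capacity(san, "vol01", needed_gb=80))
print(san.volumes["vol01"])
```

Each line of `add_capacity` corresponds to a step an operator would otherwise perform by hand, which is where the opportunity for error comes from.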

Done manually, the seemingly straightforward job of adding capacity to a SAN becomes a complex, multi-step operation prone to operator error. When automated, the operation requires minimal operator intervention, which greatly reduces operator error and results in far less disruption to the production environment.

One of the interesting aspects of DynamicIT is that it addresses both the processing and storage requirements of an application needing resources. In other words, it discovers and monitors both processing and storage capacity available to an application and provisions the required capacity based on resource constraints that are monitored within the processing and storage environments.

AppIQ's AppIQ

In contrast to DynamicIT, AppIQ communicates with the application's own data store (currently Oracle databases and Microsoft Exchange) rather than a volume manager to make decisions about the real-time requirements of the application.

In the case of an Oracle database, AppIQ uses listener ports, Oracle $ tables, and Oracle Transaction Control Language (TCL) to glean the requisite metadata from the database. Using Oracle listener ports, AppIQ automatically discovers each Oracle instance running on a SAN or on direct-attached hosts. For each Oracle instance, it generates standard SQL statements to query Oracle's $ tables to derive metadata about the file system or raw volume object mounted to each Oracle instance. It then uses Oracle TCL to map database objects to storage objects. (The term "storage objects" here refers to anything in the I/O path from the HBA, through switch ports, down through an array controller, and further down to a disk spindle.)

In short, AppIQ builds a database of storage objects using its own automated discovery processes and then matches storage objects to Oracle database objects it has discovered through its query of each Oracle instance. AppIQ then draws maps of the physical I/O paths, from the server to the disk spindle, and of the logical I/O paths, from each Oracle instance down to the individual LUNs.
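The matching step can be illustrated as a join between database objects and storage objects on the file system mount point. This is a sketch under loud assumptions: the row layout, file paths, and path elements are hypothetical, the rows are literals rather than the result of querying a real Oracle instance, and the mount-point prefix match is a stand-in technique rather than AppIQ's actual method.

```python
# Hypothetical rows describing where each tablespace's datafile lives.
datafiles = [
    {"instance": "ORCL1", "tablespace": "USERS", "file": "/u01/oradata/users01.dbf"},
    {"instance": "ORCL1", "tablespace": "SALES", "file": "/u02/oradata/sales01.dbf"},
]

# Hypothetical storage objects from discovery: mount point -> I/O path.
storage_objects = {
    "/u01": ["hba0", "switch1:port4", "array1:ctrl0", "lun7"],
    "/u02": ["hba1", "switch2:port4", "array1:ctrl1", "lun3"],
}


def map_tablespaces(datafiles, storage_objects):
    """Map (instance, tablespace) to the I/O path backing its datafile.

    Assumes one datafile per tablespace, to keep the sketch short.
    """
    mapping = {}
    for row in datafiles:
        # Candidate mount points are those that prefix the datafile path.
        mounts = [m for m in storage_objects if row["file"].startswith(m + "/")]
        if mounts:
            mount = max(mounts, key=len)  # the most specific mount point wins
            mapping[(row["instance"], row["tablespace"])] = storage_objects[mount]
    return mapping


for key, path in map_tablespaces(datafiles, storage_objects).items():
    print(key, "->", " / ".join(path))
```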

Once the object associations are created, user-defined provisioning and performance policies can be automated along the entire I/O chain from server to storage. For example, in the case of automatically provisioning disk capacity, AppIQ can allocate raw capacity from the disk spindle up through the SAN cloud to an Oracle tablespace, informing all of the associated processes along the way.

When the Oracle instance is associated with a critical application, AppIQ derives the needed capacity from the appropriate pooled storage. AppIQ can also leverage other CIM-based services that monitor and manage the usage of processing resources, and then combine views of the server environment with those of the storage environment.

This article was originally published on November 01, 2002