Building blocks for information life-cycle management

There is no silver bullet for information life-cycle management yet, but users can begin to build an ILM foundation with existing products.

By Jehoshua Bruck

Storage management challenges continue to escalate. Users expect instant, uninterrupted data access; administrators face increased scalability and performance requirements—and restricted budgets; and C-level executives need to ensure their information is protected, accessible, and retained according to the latest federal and state regulations.

In the face of this dilemma, companies need a new way to manage storage and many are looking to the concept of information life-cycle management (ILM). ILM offers IT organizations a better way to manage a wide variety of content, including dynamic Web content, traditional files, structured data, and digital media. The promise of ILM is to match this corporate content with the appropriate storage media to improve access, performance, utilization, and costs.

Unfortunately, many vendors are touting yesterday's backup-and-recovery tools as tomorrow's ILM solution. IT professionals therefore need to understand the core building blocks required for a successful ILM strategy and begin to lay the foundation for efficient management of information throughout its life cycle.

Benefits of ILM

Companies have a variety of storage options for their data. Matching the right storage resource to a particular data set can be based on cost, performance, capacity, and other factors. ILM is a process that helps a company manage and streamline those decisions throughout the information life cycle, matching the value of the data to the appropriate storage options at any given time and moving the data as needed.

An ILM solution enables storage administrators to meet the demands of users and develop an efficient storage environment. ILM is a strategic process that leverages the existing infrastructure to maximize a company's storage investments; it delivers cost-effective movement of data that results in better storage utilization and minimizes over-provisioning of storage hardware.

Building an ILM solution

Enterprise storage is a dynamic environment, changing in the short term as different data must be accessed, and evolving in the long term as a company's needs change. Data must be moved around, based on these changes, so that an organization can gain the most productive use from its storage resources. ILM enables administrators to manage storage more effectively by matching data to storage devices, according to the information life cycle, and moving that data around without impacting users.

When implementing an ILM solution, there are two primary user requirements that must be met: 24x7 data access and heterogeneous support.

Delivering 24x7 data access
Much of the promise of ILM is delivered by the automated movement of data throughout an environment based on the value of data at any point in time. An ILM solution that is not able to provide continuous and transparent end-user data access runs the risk of causing significant business disruptions.

Transparency is also key because end users do not want to think about how, where, and when their data is stored; they just want it to be there when they want it.

In the past, users may have been more tolerant of scheduled downtimes for upgrades, maintenance, or data migration, but in today's 24x7 global economy there is no convenient time to bar data access. Any downtime, even if it is scheduled and minimized, could result in lost transactions, customers, and productivity.

Heterogeneous support
ILM should also include support for heterogeneous storage environments. Even if all of an organization's storage subsystems are provided by a single vendor, there are compelling reasons to pursue an ILM strategy that includes heterogeneous support.

Technologies that offer superior price/performance can provide compelling advantages for a portion of the data's life cycle. Serial ATA arrays are one example of an emerging technology that can lead to significant cost savings.

With the convergence of storage area network (SAN) and network-attached storage (NAS), and the proliferation of diverse storage technologies, most storage environments are heterogeneous, consisting of numerous components from a variety of vendors.

ILM building blocks

Implementing a successful ILM approach requires deploying several key functions or capabilities that provide value in their own right but also lay the foundation for ILM. Together these building blocks allow administrators to gain an understanding of the company's data and leverage this information to effectively manage the data while maintaining access to the data through each stage of its life cycle.

Transparent data movement
One of the building blocks of ILM, and a key to cost-effective data management, is transparent data movement. Without it, an ILM solution cannot automate data movement or enforce policies without disrupting users and applications.

Today, most data movement is not transparent. It is a painstaking task that requires interrupting data access, which is a major disruption to users and applications. Administrators must do substantial work to coordinate data migration while minimizing the impact on end users.

Global namespace
Another feature that enables ILM to maintain transparency is a global namespace, which is an abstraction layer that separates the physical location of a file from the logical name of the file that the user recognizes. This abstraction layer provides administrators with the freedom to change the location of files and directories without having to update users' access or notify them of the new location.
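
The idea can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the class name, the method names, and the sample device paths are all hypothetical, and a real global namespace would sit in the file-access path rather than in an in-memory table.

```python
class GlobalNamespace:
    """Toy abstraction layer: logical path (what users see) -> physical location."""

    def __init__(self):
        self._locations = {}

    def register(self, logical_path, physical_location):
        # Publish a file under the name users will keep using.
        self._locations[logical_path] = physical_location

    def resolve(self, logical_path):
        # Users and applications only ever ask for the logical path.
        return self._locations[logical_path]

    def relocate(self, logical_path, new_physical_location):
        # The administrator moves the data; the logical name is unchanged.
        if logical_path not in self._locations:
            raise KeyError(logical_path)
        self._locations[logical_path] = new_physical_location


ns = GlobalNamespace()
logical = "/projects/q1/report.doc"  # hypothetical logical name
ns.register(logical, "nas-01:/vol3/report.doc")  # hypothetical device path
before = ns.resolve(logical)
ns.relocate(logical, "sata-array-02:/archive/report.doc")
after = ns.resolve(logical)
```

The user keeps asking for the same logical name before and after the move; only the table entry behind it changes.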

Policy engine
Once data movement is transparent, policies can be established to automate the movement of data across storage devices to increase resource efficiency. These ILM policies determine when and where data should be moved. Integration and open communication between management tools—such as system management and storage resource management (SRM) software—are essential so that various factors within the environment can influence data movement. Integration methods should also support emerging management standards such as SMI-S, as well as SNMP and command-line interfaces to allow for custom integration for specific environments.

ILM policies automate the system to minimize the need for administrator intervention, making it more efficient and cost-effective. This automation can reduce the hours spent managing storage and raise productivity by freeing staff for other tasks.
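
As a rough sketch of how such policies might be expressed, the Python below evaluates an ordered list of rules against file metadata and produces a list of planned moves. The tier names ("fc", "sata", "tape"), the 90-day and 365-day thresholds, and all class and function names are assumptions made for illustration; a real policy engine would also take input from SRM and system management tools.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FileInfo:
    path: str
    days_since_access: int
    current_tier: str


@dataclass
class Policy:
    name: str
    condition: Callable[[FileInfo], bool]  # when the policy applies
    target_tier: str                       # where matching data should live


def plan_moves(files: List[FileInfo], policies: List[Policy]) -> List[Tuple[str, str, str]]:
    """Return (path, from_tier, to_tier) using the first policy that matches each file."""
    moves = []
    for f in files:
        for p in policies:
            if p.condition(f):
                if f.current_tier != p.target_tier:
                    moves.append((f.path, f.current_tier, p.target_tier))
                break  # first matching policy wins
    return moves


# Hypothetical policies, ordered from most to least specific.
policies = [
    Policy("archive-stale", lambda f: f.days_since_access > 365, "tape"),
    Policy("demote-cold", lambda f: f.days_since_access > 90, "sata"),
]
files = [
    FileInfo("/db/orders.dat", 1, "fc"),
    FileInfo("/home/al/old.ppt", 120, "fc"),
    FileInfo("/legal/2001.pdf", 800, "sata"),
]
moves = plan_moves(files, policies)
```

In this example the active database file stays where it is, while the two older files are scheduled for cheaper tiers. The plan only says when and where data should go; carrying it out without disrupting users is the job of transparent data movement.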

Discovery and classification
Policies also require accurate knowledge of the location and status of particular data elements. Most storage administrators and users do not have an adequate handle on the location and profiles of their data files. An effective ILM approach therefore depends on data discovery and classification.

ILM requires the ability to capture metadata relating to files, their location, and usage. Classification capabilities should also allow conceptual grouping of files based on attributes such as projects, applications, or frequency of data use.

The conceptual grouping requires a relational structure that maps between the user's conceptual view and the actual location within the hierarchical file system. In this way, ILM can associate files with each application and user, ensuring appropriate files are accessible when needed.
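
A simple form of discovery and classification can be sketched in Python using only the standard library. The function names, the record fields, and the 30-day cutoff for "active" data are illustrative assumptions, and last-access times are only as reliable as the underlying file system makes them.

```python
import os
import time
from collections import defaultdict


def discover(root):
    """Walk a directory tree and capture basic metadata for every file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # skip files that vanish or cannot be read
            records.append({
                "path": full,
                "size_bytes": st.st_size,
                "last_access": st.st_atime,
                "extension": os.path.splitext(name)[1].lower(),
            })
    return records


def classify_by_age(records, now, hot_days=30):
    """Group file paths into 'active' and 'inactive' by last access time."""
    groups = defaultdict(list)
    cutoff = now - hot_days * 86400  # seconds in a day
    for r in records:
        label = "active" if r["last_access"] >= cutoff else "inactive"
        groups[label].append(r["path"])
    return dict(groups)


if __name__ == "__main__":
    # Example: summarize the current directory.
    for label, paths in classify_by_age(discover("."), time.time()).items():
        print(label, len(paths))
```

The same records could be grouped by project, application, or owner instead of access frequency; the point is that the grouping is kept separate from where the files physically sit in the directory tree.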

Evolution of the ILM process

ILM is currently not a product you can buy off the shelf; it is a concept that must be based on the building blocks outlined above.

Even though a full ILM solution has not been rolled out by any single vendor, the building blocks can serve as stand-alone solutions that provide significant benefits. A company can gain advantages right away by starting to implement the building blocks that are available, such as transparent data movement.

Finding the right technology building blocks for ILM will be a critical responsibility for storage administrators. There is already much confusion in the market around ILM. With many vendors poised to launch their ILM strategies, the confusion will continue.

For example, there are solutions on the market that already offer transparent data movement functions, but only within their own proprietary storage environment. This means that a company would need to adopt this single-vendor environment and move all data onto that vendor's systems to gain any benefit.

Backup-and-recovery solutions are also a potential source of confusion. In a sense, backup and recovery is a precursor to ILM. These applications move data between operational and archival storage. However, they do not address access, transparency, or the other benefits derived from managing the information life cycle.

ILM is a process that helps a company manage and streamline decisions throughout the data life cycle, matching the value of the data to the appropriate storage options at any given time and moving data as needed.

Deploying building blocks that provide continuous end-user access and that support heterogeneous environments will deliver value today and will help support a successful ILM strategy for tomorrow.

Jehoshua (Shuki) Bruck is the co-founder and chairman of Rainfinity (www.rainfinity.com) in San Jose, CA.

This article was originally published on March 01, 2004