Information Lifecycle Management - March 2004
Myths and Realities

ILM - What's It All About?
Have you ever felt like you've created a monster? One year ago ESG published an article defining Information Lifecycle Management (ILM). The intent of the article was to educate users on ILM, which we defined then as "a combination of technology and methodology that helps users manage data from the moment it is created to the time at which it is no longer needed." From ESG's perspective, this concept has not changed. However, we believe there is a difference between information and data, and the distinction is important when discussing enabling solutions. We believe ILM is a process, not a product. Implementing the process requires a number of steps, from assessment to classification of information to the actual automation of the processes. ILM implementations will differ for every organization, and there is no one-size-fits-all "ILM solution".

The intent of this paper is to provide further detail about what ILM truly is and is not, particularly as it relates to the current messaging from the vendor community. We believe implementing (some form of) ILM/DLM processes can help organizations improve resource optimization, address corporate governance and compliance issues, increase performance and reduce costs. Plenty of solutions exist today to help organizations reach those goals; our intent is to help users understand the myths and realities of ILM.

ILM the Next "BIG" Thing
Over the course of the past year, ILM has become THE most popular buzzword in the storage industry. Offering "ILM" has been the catalyst for storage companies to step outside of their traditional boundaries and start to market their value (and acquire companies) in the application management, compliance, content and document management markets. The ILM discussion is no longer just about effectively utilizing storage resources and protecting information; instead it is being positioned as a strategic business practice. Future implementations of ILM are a prerequisite for delivering storage as a utility and for realizing the promise of grid/autonomic computing.

All of the hype and messaging around ILM has left the end user community confused and cautious. ILM looks complicated and, worse yet, sounds expensive and disruptive to users' existing environments and processes.

Yet the reality is that ILM is a combination of technologies and processes, and most users are doing some form of "ILM" today, even if it is very rudimentary. The value of information will inevitably change over time, and as it becomes less important to the business, it should be treated accordingly. This could mean certain information is placed on lower cost arrays, sent to tape, archived in a warehouse and/or eventually destroyed. ILM is the process of first making those value determinations, and then setting up protection, movement and retention policies based on those relative valuations. "ILM solutions" should then help implement and automate those policies.
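To make the idea of a movement policy concrete, here is a minimal Python sketch. It assumes, purely for illustration, that the value of information can be approximated by how long ago it was last needed; in practice the valuation is a business decision with many more inputs. The tier names, the day thresholds and the `placement_for` function are hypothetical and do not come from any product.

```python
from datetime import date

# Hypothetical policy: (maximum age in days, placement). The thresholds and
# tier names are illustrative assumptions, not recommendations.
PLACEMENT_POLICY = [
    (90, "primary array"),
    (365, "lower-cost array"),
    (7 * 365, "tape archive"),
]


def placement_for(last_needed: date, today: date) -> str:
    """Return the placement for information last needed on `last_needed`."""
    age_days = (today - last_needed).days
    for max_age_days, placement in PLACEMENT_POLICY:
        if age_days <= max_age_days:
            return placement
    # Older than every threshold: candidate for destruction.
    return "destroy"


if __name__ == "__main__":
    today = date(2004, 3, 10)
    for last_needed in (date(2004, 3, 1), date(2003, 9, 1), date(1995, 1, 1)):
        print(last_needed, "->", placement_for(last_needed, today))
```

Keeping the policy in a table separate from the code that applies it reflects the split described above: people make the value determinations, and software only enforces them.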

ILM vs. DLM - Is There a Difference?
To make matters a bit more confusing, a number of vendors believe the term ILM is a misnomer. The argument is that although most vendors paint a vision of Information Lifecycle Management, most of the solutions available today primarily address Data Lifecycle Management (DLM). This may seem like an argument over semantics, but at ESG we believe it is an important distinction. Many line of business administrators and application vendors do perceive a difference between information and data. Applications leverage information, whereas storage systems store data. Webster's dictionary of course confirms that there is a difference between information and data: information is the communication or reception of intelligence, while data is the digital representation of information. By using the term ILM so broadly, storage vendors (and analysts...) could be confusing customers by blurring the line between information and data, making it more difficult for customers to understand where "ILM" solutions fit in their business processes.

We believe that illustrating the relationship between ILM and DLM and how these processes address business requirements will help users better understand how implementing ILM/DLM will help them move to an automated, on-demand environment.

The long term goal of the "utility data center" (or autonomic/on-demand/grid computing) is to provide a fully automated IT infrastructure that will be able to provide compute, storage and network resources to applications on the fly. This concept is not some utopian dream; technological innovation is bringing us closer to this realization each day. However, a great deal of "groundwork" must be done before the utility can become a reality. By effectively implementing and automating ILM processes, users will lay the groundwork for the utility vision.

However, this is not a one-step process. Users should take a layered approach, automating processes at each level of their IT infrastructure (storage, server and application layers). ILM vision presentations promise solutions that can automatically understand the relationship between applications and their associated information sets, such that the ILM solution can automatically assign valuations to that information. Once those valuations are set, policies will then be enacted that migrate, protect, retain and eventually discard that information according to business requirements.

The reality is that there are no solutions today that can understand the application/information relationship and automatically set valuations that determine where the data should reside. This is primarily a manual process aided by reporting solutions. Today, almost all of the solutions that are being pitched as ILM solutions are actually focused on data migration, retention and protection, which is truly Data Lifecycle Management.

Don't get us wrong: we believe in the long term promise and vision of ILM. However, ESG believes that the majority of solutions available today are truly DLM solutions. As these solutions evolve, they will become more aware of the application/information/data association, and be able to automatically set classifications and migrate data according to the real time requirements of database, content and retention management applications. This is the future promise of ILM, but the industry is far from that realization.

The ILM Process
With all of that said, ILM has in a short period become the common descriptor for the process of managing data throughout its lifecycle. While DLM is required to implement ILM, ILM is the umbrella term. The entire process of ILM involves multiple steps as we outlined in our initial paper. Those steps are:

  • Assessment
  • Socialization
  • Classification
  • Automation
  • Review

Currently, the first three steps, while aided by storage resource management solutions, are primarily manual. The later steps of automating the processes are implemented using DLM solutions.

Regardless, ILM/DLM is really about changing the way an organization thinks about its information/data assets, and changing the way it stores those assets. End users should look for solutions that enable them to effectively assess and categorize both the storage and information assets within their environment, set values on information sets, set and enforce policies according to those values, and migrate and protect data automatically according to those values.

Value-Based Lifecycle Management
At ESG we believe that today the ILM/DLM discussion is about migrating information/data across the storage infrastructure for three key purposes:

  • resource optimization
  • effective data protection
  • ensuring application performance

ILM can mean many things to many users. There is no single definition of an ILM solution. To one user it can simply mean archiving for compliance reasons; for another it is migrating aged data from higher-cost Fibre Channel SCSI-based arrays to ATA-based arrays. A number of business drivers and information characteristics will drive companies to implement varying ILM processes.

These characteristics and their relative importance vary from business to business, and even within different departments of the same company. A brief list of these characteristics and considerations includes:

  • Retention cycle - Does the information need to be retained for a specific period for a corporate governance or regulatory purpose?
  • Disposition cycle - Once the retention cycle is complete, should the information be disposed of completely or archived to lower-cost media? Does the information need to be electronically shredded after the retention cycle has expired?
  • Archival cycle - Does the information need to be archived for long periods? If so, does this archival need to be stored separately from the original?
  • Access frequency - How frequently or infrequently is the information accessed once created? Will it be "Write once / Read many" or "Write once / Read rarely" or will it have a more active access frequency?
  • Read / Write performance cycle - Based on the access frequency of the data what is the required performance for both read and write operations? What technologies are appropriate for these requirements?
  • Read / Write permissions - Does the information need to be stored on non-erasable, non-rewritable media?
  • Recovery performance cycle - How quickly does the information need to be recovered?
  • Security issues - How will the compromise of this information at different points in its lifecycle affect the business?

Ultimately, by applying certain filters and definitions to an organization's repositories of information it is possible to assign relative "values" to this information. The value of a given piece of information is not a one-dimensional metric, but the product of analyzing a variety of interrelated data points. Assessing information according to the above criteria, and setting lifecycle policies according to those criteria are crucial aspects of the ILM process. Only when these are complete can organizations then move ahead to implementing automated data migration, protection and retention schemas as part of their ILM process.
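The following Python sketch illustrates how several of the characteristics listed above might be combined into a lifecycle policy rather than a single score. It is only an illustration under stated assumptions: the `InfoProfile` and `LifecyclePolicy` types, the field names and every threshold are hypothetical, and a real classification would be agreed between business managers and IT administrators.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InfoProfile:
    """Answers to a few of the assessment questions for one information set."""
    retention_years: int         # 0 means no mandated retention period
    reads_per_month: int         # access frequency
    max_recovery_hours: float    # how quickly it must be recoverable
    must_be_immutable: bool      # needs non-erasable, non-rewritable media
    compromise_impact: str       # "low", "medium" or "high"


@dataclass(frozen=True)
class LifecyclePolicy:
    storage_tier: str
    media: str
    encrypt: bool
    retain_years: int


def derive_policy(profile: InfoProfile) -> LifecyclePolicy:
    """Map a profile to a policy using hypothetical thresholds."""
    if profile.reads_per_month >= 100 or profile.max_recovery_hours <= 1:
        tier = "high-performance"
    elif profile.reads_per_month >= 1:
        tier = "lower-cost"
    else:
        tier = "archive"
    media = ("non-erasable, non-rewritable"
             if profile.must_be_immutable else "rewritable")
    return LifecyclePolicy(
        storage_tier=tier,
        media=media,
        encrypt=profile.compromise_impact == "high",
        retain_years=profile.retention_years,
    )


if __name__ == "__main__":
    archived_email = InfoProfile(7, 0, 48, True, "high")
    order_database = InfoProfile(0, 5000, 0.5, False, "medium")
    print(derive_policy(archived_email))
    print(derive_policy(order_database))
```

Returning a structured policy instead of one number keeps the result multi-dimensional: two information sets with the same access frequency can still end up on different media or with different protection.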

What "ILM" Solutions Are Available Today?
Again, ILM is a process, not a product. One cannot buy ILM solutions, only solutions that enable ILM processes. Vendors may choose to refer to their products as ILM or DLM solutions, but the reality is that these products can only address a portion of the overall process. It is important to realize that an ILM process does not have to be all-encompassing, addressing the lifecycle of information associated with every application in the organization. The solutions that organizations will use to implement their ILM processes will differ depending on their environment and data retention and protection requirements. The point is, just as there is no one-size-fits-all ILM process, there is no one-size-fits-all ILM solution.

The five-step process that we outlined above will require various solutions to address the requirements in each phase. The assessment and classification tasks include:

  • Assess current storage resource utilization
  • Determine type/application association of information stored on resources
  • Determine value of information according to business requirements
  • Set policies that determine where data should reside throughout its lifecycle
  • Set policies that determine how data should be secured and protected throughout its lifecycle

Again, the actual process of determining which data is most important to the business is a manual task. Users will set the policies that reflect the valuations, and the implementation of those policies can be automated. Solutions do exist today that can aid in this process. A partial list of vendors that provide solutions to address these tasks today includes: CA, EMC, HP, HDS, IBM, VERITAS, Arkivio, Commvault, Softek.

Professional services and security considerations are extremely important in the overall ILM process. Professional service organizations can help users assess their environment and, more importantly, help them understand the relationship between the information and applications. (ESG highly recommends using professional services and/or consulting services during the initial ILM phases.)

Automation: Multiple processes could be automated to provide resource optimization, data protection and enhanced application performance. A few of those processes are listed here.

  • Migration of data onto tiered resources
  • Automated remote replication of data
  • Automated database archiving
  • Automated email archiving
  • Automated retention management
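As one illustration of what automating a task from this list could look like, the Python sketch below covers the decision at the core of retention management: whether an item is still inside its retention period and, if not, how it should be disposed of. The function names and the action strings are hypothetical, and the sketch assumes retention is measured in whole years from the creation date.

```python
from datetime import date


def expiry_date(created: date, retain_years: int) -> date:
    """Return the date on which the retention period ends."""
    try:
        return created.replace(year=created.year + retain_years)
    except ValueError:
        # Created on February 29 and the target year is not a leap year.
        return created.replace(year=created.year + retain_years, day=28)


def retention_action(created: date, retain_years: int, today: date,
                     shred_on_expiry: bool) -> str:
    """Decide what to do with an item under a simple retention policy."""
    if today < expiry_date(created, retain_years):
        return "retain"
    if shred_on_expiry:
        return "shred"
    return "archive to lower-cost media"


if __name__ == "__main__":
    created = date(2000, 2, 29)
    print(retention_action(created, 7, date(2004, 3, 10), True))
    print(retention_action(created, 7, date(2007, 3, 1), True))
    print(retention_action(created, 7, date(2007, 3, 1), False))
```

The retention period and the disposition choice are inputs rather than constants, because they are set by people as part of the classification work and only enforced by the automation.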

A partial list of vendors that provide solutions to address these tasks today: CA, EMC, HP, HDS, IBM, NetApp, STK, VERITAS, Arkivio, Commvault, FalconStor, Invio, KVS, KOM, Outerbay, Princeton Softech, Signiant, Softek, Zantaz and more.

Again, security is a major concern when assessing and implementing software to automate processes; only authorized personnel should be able to change policies. Of course if an assessment uncovers that certain information is critical to the business, that information should be adequately secured. Solutions from companies like Decru and Kasten Chase can address these security concerns by automatically encrypting data.

Note: ESG will continue to publish ILM-focused papers. Look for future articles that drill down into varying ILM processes and the associated solutions that enable those processes.

The Bottom Line
Despite all the hype and confusion, ESG believes implementing ILM processes is both beneficial and necessary. Organizations will lower their overall administrative and operational TCO while ensuring that administrative and operational information is adequately protected and resources are efficiently utilized. In addition, implementing effective ILM processes will bring organizations closer to being able to realize the promise of utility computing.

ESG does not believe implementing end-to-end ILM processes is an easy task, nor do we believe this can be accomplished today. However, organizations can begin to build ILM processes into their business practices. As long as the business managers and IT administrators work closely together to determine how the effective management of information assets will meet business requirements, organizations can begin to reap benefits from automating ILM processes today. There are plenty of "ILM enabling" solutions on the market; we suggest users overcome any reluctance they may have to move towards "ILM" and put these solutions to good use.

Authors: Nancy Marrone-Hurley, Peter A Gerr, Steve Kenniston

This article was originally published on March 10, 2004