It's all about putting data at the right place, at the right time.
By Bill Terrell
Information life-cycle management (ILM), also referred to as data life-cycle management (DLM), is one of the most promising, most loosely defined, and most hyped terms of the year. Software and hardware vendors are using the term to describe both old and new products, with the result that many users are unsure what is actually new. Despite all the excitement around ILM, the uncertainty around definitions and applications is causing confusion within the end-user community, much like the confusion over the term "virtualization" a few years ago.
Looking past the marketing hype, ILM has the potential to help IT administrators better manage their data and storage. As new regulatory requirements and tight budgets increase the pressures on administrators, a renewed need to optimize storage resources and leverage existing infrastructure has emerged. The result has been increasingly disparate storage devices, a proliferation in the number and types of tools needed to manage those resources, and an increase in the difficulty of storage management. Users are demanding that vendors address these issues and relieve the storage management headache. ILM products, if delivered as promised, may be the needed cure.
Moving data through its life cycle
ILM involves moving data from place to place according to its usefulness. In other words, ILM uses the right mix of hardware resources to optimize performance and protection levels for different kinds of data at each stage of their life cycle, according to their relevance to the company. By putting data where it best belongs (crucial databases and data on the fastest arrays, important data on the most reliable storage devices, and the least important or least accessed data on inexpensive storage subsystems) and eventually moving obsolete data to tape, companies can make optimum use of their storage infrastructure.
Essentially, ILM puts data where it makes the most sense depending on how it is used, how often it's accessed, performance and reliability requirements, and organizational needs. ILM is also policy-driven; that is, data is automatically migrated based on business needs and requirements, according to policies set by administrators, eliminating manual movement of data.
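The placement rule described above can be sketched as a small decision function. This is only an illustration: the tier names, the `DataSet` fields, and the order in which the checks are applied are assumptions made for the example, not part of any vendor's ILM product.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """Hypothetical storage tiers, named after the classes in the text."""
    FAST_ARRAY = "fastest arrays"
    RELIABLE_ARRAY = "most reliable storage devices"
    LOW_COST_DISK = "inexpensive storage subsystems"
    TAPE = "tape"


@dataclass
class DataSet:
    name: str
    crucial: bool    # e.g. a production database
    important: bool  # must be well protected against hardware failure
    obsolete: bool   # no longer needed online


def place(ds: DataSet) -> Tier:
    """Return the tier where a data set 'best belongs'.

    The precedence (obsolete first, then crucial, then important)
    is an assumption chosen for this sketch.
    """
    if ds.obsolete:
        return Tier.TAPE
    if ds.crucial:
        return Tier.FAST_ARRAY
    if ds.important:
        return Tier.RELIABLE_ARRAY
    return Tier.LOW_COST_DISK
```

In a real deployment the inputs would come from measured access patterns and administrator-defined policies rather than hand-set flags.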
Data's value differs by industry, age, and relevance. The value of data clearly changes with how long it has been stored, but how it changes may differ dramatically by industry. For example, in the medical information field, the value of patient records remains fairly constant over time. A physician generally wants access to a patient's entire medical history, not just the most recent checkup record. Therefore, medical data may need to be stored on the most fault-tolerant storage and will need to be readily accessible throughout its life cycle. In addition, there are regulatory requirements regarding patient data and how the data needs to be stored and backed up.
However, the value of information can be dramatically different for other industries. For example, an engineering organization has a very different data usage profile than the medical field. The value of engineering data changes depending on the product development cycle. An engineering group designing a new chip needs access to a current image of any design. Thus, the value of the working data is extremely high. However, that group might also have many temporary files such as simulations and test runs that have high immediate value but are of no use the next day or the next week. In contrast to medical information, the value of engineering data decreases with time, with stored files from three or six months ago having almost no use.
ILM focuses on moving data through tiered storage devices according to its value. An example is migrating file data as it ages from high-speed fault-tolerant arrays to slower, less-expensive ATA disk arrays, and eventually to tape when the data no longer needs to be accessed as frequently. The tools that execute the migration policy must respect the retention requirements for that data and the rules about what should and should not be migrated. Ideally, ILM should also be very granular, allowing not only entire files but also portions of files and parts of databases to be moved across a network.
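The age-based migration described above might look like the following minimal sketch. The idle-time thresholds (30 and 180 days), the tier names, and the `online_until` retention field are hypothetical values chosen for illustration; the article does not specify any of them.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical thresholds: days since last access before data moves down a tier.
ATA_AFTER_DAYS = 30
TAPE_AFTER_DAYS = 180


@dataclass
class FileRecord:
    path: str
    last_access: date
    tier: str = "fault_tolerant_array"
    # Assumed retention rule: data must stay on disk until this date.
    online_until: Optional[date] = None


def target_tier(rec: FileRecord, today: date) -> str:
    """Pick a tier from idle time, then apply the retention rule."""
    idle_days = (today - rec.last_access).days
    if idle_days >= TAPE_AFTER_DAYS:
        wanted = "tape"
    elif idle_days >= ATA_AFTER_DAYS:
        wanted = "ata_array"
    else:
        wanted = "fault_tolerant_array"
    # Retention overrides age: do not send data to tape while it must stay online.
    if wanted == "tape" and rec.online_until is not None and today <= rec.online_until:
        wanted = "ata_array"
    return wanted


def plan_migrations(records: List[FileRecord], today: date) -> List[Tuple[str, str, str]]:
    """Return (path, current_tier, target_tier) for every record that should move."""
    plan = []
    for rec in records:
        wanted = target_tier(rec, today)
        if wanted != rec.tier:
            plan.append((rec.path, rec.tier, wanted))
    return plan
```

The sketch works on whole files only. The finer granularity the text calls for (portions of files, parts of databases) would need block-level or record-level tracking that is not modeled here.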
Riding the hype curve
ILM is currently characterized by excessive hype, with a variety of vendors plugging different aspects of the ILM puzzle. Some vendors are dusting off old hierarchical storage management (HSM) products, claiming they are now ILM products. The result is that not everyone agrees on what ILM is—or isn't. One area of confusion is whether ILM is in fact HSM. In reality, HSM is only one component of an overall ILM strategy. Similarly, some vendors have re-labeled their storage area network (SAN) management packages as ILM software. SAN management is also a crucial part of ILM, but not the entire solution.
The market's attempts to define ILM remain fairly confusing; however, several major points seem to be emerging. It seems clear that ILM:
- Is network-centric and relies on consolidated and networked storage;
- Encompasses both file-based and block-based data;
- Includes both hardware and software components; and
- Has a strong automation and policy component.
One of the common themes in the ILM movement is a network-centric focus, because ILM relies in part on the benefits of a networked storage environment. ILM solutions, for the most part, also rely on having a consolidated storage environment, such as a SAN with multiple classes of hardware resources. They also assume the use of techniques such as virtualization and storage pooling, disk-based backup, automated migration to tape, and other network-based storage management technologies.
ILM tools address more than file-based migration (which HSM tools have addressed for years); they also address block-based data storage. Some ILM tools are very granular and are able to move and migrate not just single files, but also small portions of files and specific non-file database tables or even single rows from these tables. This goes well beyond the traditional file-based solutions.
It's also clear that ILM is not just a software movement, but also involves hardware. Specialized data movers and hardware are necessary to make ILM a reality, both to provide the high-performance data migration and movement that ILM policies require and to supply the information and statistics that ILM needs to monitor data access patterns.
Finally, ILM has a strong automation and policy component. ILM involves the automation of data movement based on policies set at many levels. The policies automate the decision-making process that administrators currently deal with manually. Policy, which exists at all levels of software, means that administrators broadly define their goals and outline their requirements within an ILM solution. For example, policy may specify the exact number of days that data must remain accessible; what kind of data requires the highest performance storage; and how well-protected specific kinds of data need to be.
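As a rough illustration of such a policy, the sketch below records the three requirements mentioned above (days of accessibility, performance, and protection) as data and picks the cheapest storage class that satisfies them. The class names, the numeric rating scales, the cost ranking, and the rule that performance stops mattering once the accessibility window has passed are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    data_kind: str
    accessible_days: int   # days the data must remain online
    min_performance: int   # hypothetical scale: 0 (lowest) to 3 (highest)
    min_protection: int    # hypothetical scale: 0 (lowest) to 3 (highest)


@dataclass(frozen=True)
class StorageClass:
    name: str
    performance: int
    protection: int
    online: bool
    cost_rank: int         # lower means cheaper


def cheapest_match(policy: Policy, age_days: int,
                   classes: List[StorageClass]) -> Optional[StorageClass]:
    """Return the cheapest storage class that meets the policy, or None."""
    must_be_online = age_days < policy.accessible_days
    candidates = [
        c for c in classes
        if c.protection >= policy.min_protection
        and (not must_be_online
             or (c.online and c.performance >= policy.min_performance))
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.cost_rank)
```

Keeping the policy as data rather than code reflects the point made above: administrators state goals and requirements broadly, and the software makes the placement decision.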
The promise of ILM
Despite the hype and confusion, ILM promises to help enterprises streamline and optimize their use of hardware. The pressures on IT administrators are not going to abate soon, and the techniques and concepts behind ILM will help alleviate them. Vendors clearly see a need to develop the right technology for delivering on the ILM promise: automated and appropriate placement of data depending on its importance to users, at a high level of granularity. However, most ILM techniques and tools are still in their infancy.
Although there is a lot of confusion in the market as vendors try to define their role in the ILM ecosystem, a focus on solutions is helping to clarify what users really want. Expect an evolution of ILM solutions as vendors determine users' true needs and turn the promise of ILM into reality.
Bill Terrell is the chief executive officer and co-founder of Troika Networks (www.troikanetworks.com) in Westlake Village, CA.