BY NOEMI GREYZDORF
Faced with tighter budgets and sparser resources, most storage administrators cannot afford waste. Yet aligning appropriate storage resources with application demands is still a challenge. Storage tiering has been available for some time, yet it still means different things to different people. There are two ways to apply storage tiers:
- Define classes of applications based on parameters, such as performance, and align each class with a tier of storage that delivers on those parameters.
- Identify data whose access pattern has changed, yet which is still consuming high-end expensive storage resources, and migrate the data to a lower tier of storage.
Over the next five years, data will continue to grow at more than 50% per year. A number of drivers are affecting this growth:
- The increased rate of digitizing records and content;
- Collaboration and communication across organizations, both within enterprises and across partners, suppliers, and customers;
- Longer retention of data, driven by regulatory compliance, governance, litigation support, and best practices;
- Perpetuation of data for data protection and disaster recovery.
As a result, storage managers are trying to match application requirements with storage resources. "One size fits all" is no longer good enough in the majority of data centers. Business units demand the characteristics that enable growth, productivity, and competitiveness. Additionally, storage managers have been asked to reduce the overall cost of storage and its impact on the data center.
Mapping resources to applications
To achieve the goal of better aligning storage resources with the demands of the applications, storage managers seek a tier-based strategy where appropriate storage resources can be mapped to applications. There are two situations in which storage tiers can be applied, and they are not necessarily exclusive of each other:
- Aligning the demands of an application with an appropriate storage tier. Each organization must define its own parameters for each tier of storage. Parameters may include IOPS, redundancy, throughput, reliability, data protection, recoverability, and support.
- Moving data to a lower tier as it ages. As data ages, its access patterns change, and so do its storage requirements. However, the data was created by an application and stored in one location, and moving it to another tier of storage is not as simple as it may seem.
If the application requires performance, allocating slower storage resources would negatively impact productivity and may result in revenue loss. Conversely, allocating high-end storage to an application that doesn't have stringent performance requirements may result in higher than necessary costs for the storage.
Aligning application demands with storage tiers starts with defining the requirements of the application. These requirements may include performance, reliability, recoverability, availability, and other parameters. Once a class has been assigned to each application, storage tiers can be created for each class. Storage tiers may vary based on access (FC, iSCSI, NFS), drive type (FC, SAS, SATA), reliability (RAID 1, 10, 5, 6), availability and recoverability (multi-pathing, redundant controllers), and other attributes.
The final step is simple: Determine which class of application belongs on which tier of storage. When a new application is brought online, first determine its class of application; assigning a storage tier is simple afterwards.
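The class-then-tier mapping described above can be sketched in a few lines of Python. This is only an illustration: the class names, the IOPS and downtime thresholds, and the tier catalog are assumptions made for the example, since each organization must define its own parameters.

```python
from dataclasses import dataclass

# Hypothetical tier catalog. The attributes follow the kinds of parameters
# named in the text (drive type, RAID level, access); the values are assumed.
TIERS = {
    "tier1": {"drive": "15K FC", "raid": "RAID 10", "access": "FC"},
    "tier2": {"drive": "SAS", "raid": "RAID 5", "access": "iSCSI"},
    "tier3": {"drive": "SATA", "raid": "RAID 6", "access": "NFS"},
}

# Hypothetical mapping from application class to storage tier.
CLASS_TO_TIER = {
    "mission-critical": "tier1",
    "business-important": "tier2",
    "general": "tier3",
}


@dataclass
class AppRequirements:
    name: str
    min_iops: int              # sustained I/O operations per second required
    max_downtime_minutes: int  # tolerated unplanned downtime per incident


def classify(app: AppRequirements) -> str:
    """Step 1: derive the application class (thresholds are assumptions)."""
    if app.min_iops >= 10_000 or app.max_downtime_minutes <= 5:
        return "mission-critical"
    if app.min_iops >= 1_000 or app.max_downtime_minutes <= 60:
        return "business-important"
    return "general"


def assign_tier(app: AppRequirements) -> str:
    """Step 2: once the class is known, the tier is a simple lookup."""
    return CLASS_TO_TIER[classify(app)]
```

A new application is handled the same way: describe its requirements, call `assign_tier`, and look up the resulting tier in `TIERS`. The only judgment involved is in setting the classification thresholds.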
Tiers within a tier
Suppose it has been determined that application G is of a class that requires Tier-1 storage; Tier-1 storage is defined as having 15,000rpm FC drives in a RAID-10 configuration with dual controllers and redundant paths to the host. Application G has been creating and storing data on Tier-1 storage for a while, and a significant percentage of the data created is not being accessed very much or at all. It turns out that 120 days after data is created, it is rarely if ever accessed, but it can't be deleted.
This is not an uncommon scenario. The result is that a large percentage of data, often referred to as static but valuable, consumes expensive storage resources. The challenge is enabling storage tiers within a tier.
There are a number of approaches to achieving tiers within a tier:
- Hierarchical storage management (HSM) software moves files to different storage tiers based on age or last-time-accessed parameters. To do that without impacting users or applications, a stub file is left behind to redirect the requester to the new location of the files.
- Storage tiers can be achieved within a volume through block-level storage virtualization. In this scenario, a single volume consists of multiple storage tiers, and blocks are migrated across those tiers based on access patterns.
- Storage tiers are managed by a single file system. In this scenario, an intelligent file system can write to multiple storage pools, each pool being a tier. The file system migrates files across tiers without impacting users or applications. No stub files are required.
In all three cases, tiers within a tier can be created for more granular alignment of applications and storage resources.
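Whichever approach is used, the policy behind it is the same: find data that has not been touched for some period. As a rough sketch of a last-accessed policy such as the 120-day rule in the example above, the Python below walks a directory tree and lists files whose last access time is older than a cutoff. The function name and parameters are hypothetical, the code only selects candidates (it does not move files or create stubs), and it assumes the file system records access times, which is not the case on volumes mounted with options such as `noatime`.

```python
import os
import time

SECONDS_PER_DAY = 86_400


def find_migration_candidates(root, max_idle_days=120, now=None):
    """Return (path, size_in_bytes) for files not accessed in max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_atime < cutoff:  # last access is older than the cutoff
                candidates.append((path, info.st_size))
    return candidates
```

Summing the sizes in the returned list gives a first estimate of how much Tier-1 capacity is occupied by static data.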
There are two main wins with storage tiers. First, more appropriate storage tiers keep applications humming. Second, more appropriate storage reduces costs. Assume that 60% of data is no longer being accessed regularly, and that lower-tier storage costs 25% less than primary storage. Moving that static data to lower-tier storage can reduce storage costs by 15% (60% of the data at a 25% discount).
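The savings arithmetic is easy to check. The sketch below assumes cost scales linearly with capacity and that migration itself is free; the function name and parameters are illustrative. Note that the 15% figure corresponds to a lower tier priced 25% below primary storage. If the lower tier instead costs one quarter of the primary price, the same formula gives 45%.

```python
def tiering_savings(static_fraction, lower_tier_relative_cost):
    """Fraction of total storage cost saved by moving static data down a tier.

    static_fraction: share of data that can move (0.0 to 1.0).
    lower_tier_relative_cost: lower-tier price per unit of capacity,
        expressed as a fraction of the primary-tier price (0.0 to 1.0).
    """
    if not 0.0 <= static_fraction <= 1.0:
        raise ValueError("static_fraction must be between 0 and 1")
    if not 0.0 <= lower_tier_relative_cost <= 1.0:
        raise ValueError("lower_tier_relative_cost must be between 0 and 1")
    # Each unit of migrated data saves the price difference between tiers.
    return static_fraction * (1.0 - lower_tier_relative_cost)


# Lower tier priced 25% below primary (relative cost 0.75).
print(f"{tiering_savings(0.60, 0.75):.0%}")  # prints 15%
# Lower tier priced at a quarter of primary (relative cost 0.25).
print(f"{tiering_savings(0.60, 0.25):.0%}")  # prints 45%
```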
Not all tiering is the same
Defining application workload tiers and creating storage tiers for efficiency have both financial and productivity implications. Unfortunately, not all tiering approaches are the same. Some approaches may negatively impact other aspects of the environment.
Storage tiering can be achieved at the block or file level. At the block level, there is no impact to applications or users, but it requires an advanced level of virtualization where blocks are moved across storage tiers seamlessly.
File system-level tiering can be achieved in two ways: leaving a stub behind or moving the file to another system completely.
File system-level tiering may present some challenges in other areas of the infrastructure:
- If a file is moved to a completely different, lower-tier system, users and applications experience changes. For example, users or applications must be remapped to the data's new location, which can be presented through a separate interface or as a different drive. Once data is moved to another system, that system becomes the primary keeper of that data. Every time new data is moved to this system, the system must be backed up so that the data can be restored after a failure, deletion, or corruption. The challenge is adding a new job to the backup system and modifying how applications and users access the data.
- If a stub is left on the primary file-based storage system to redirect users or applications to the new physical location of the file, then there are three main considerations:
  - A typical stub file is 4KB; if the file being replaced with a stub is no bigger than 4KB, no capacity is gained.
  - The stub file may be significantly smaller than the file it replaces, but the number of files on the primary system stays the same. This can have an unexpectedly negative impact on backup and restore: backing up many small files takes a lot more time than backing up a few large files, so if the file system holds many files, tiering may not shorten backups at all.
  - When you perform a backup of the primary file-based storage system, the backup is of a stub, not the actual data. To ensure data availability and recoverability, the file-based system where the actual data now resides must also be backed up. It is also important to keep a metadata view of the system at the time of backup so that, if a restore is required, it is known where the actual data resides.
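The first two stub considerations can be put into numbers with a short sketch. It assumes the 4KB stub size quoted above and a policy that replaces every file larger than the stub; the function name and the shape of the result are illustrative only.

```python
STUB_SIZE = 4 * 1024  # bytes; the "typical" 4KB stub cited in the text


def stub_tiering_effect(file_sizes, stub_size=STUB_SIZE):
    """Estimate the effect of replacing files with stubs on the primary system.

    file_sizes: iterable of file sizes in bytes.
    Files no larger than the stub are left in place, since stubbing them
    would free no capacity.
    """
    sizes = list(file_sizes)
    migrated = [size for size in sizes if size > stub_size]
    return {
        # Capacity reclaimed: each migrated file shrinks to one stub.
        "bytes_freed": sum(size - stub_size for size in migrated),
        "files_migrated": len(migrated),
        # Every migrated file leaves a stub, so the file count is unchanged.
        "files_on_primary": len(sizes),
    }


effect = stub_tiering_effect([2_048, 4_096, 1_048_576])
print(effect)
```

In this example only the 1MB file is worth migrating, and the primary system still holds three files afterwards, which is why a backup whose duration depends on file count may not get any faster.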
From a storage utilization and optimization perspective, storage tiering may seem to be a perfect solution; however, beware of how the tiering strategy impacts the storage infrastructure as a whole.
A must-have solution
The push to deliver greater efficiencies has sparked innovation in the storage tiering market. Delivering storage tiers at the block level remains challenging, but there have been some breakthroughs at the file system level. A number of file system vendors are designing tiering capabilities that allow data to be moved across classes of block storage without impacting applications, users, or the broader storage infrastructure environment.
Over the next few years, there will be more solutions that deliver tiering capabilities; in time, this will be a required capability, not a "nice to have" feature. End users today are asking for storage tiering in general, and the recent shift toward efficiency is bound to raise the demand for more cost-effective ways to store file-based data without creating additional complexity.
Storage vendors would benefit in the long term from building seamless storage tiering functionality into their file-based systems. End users would benefit from asking the right questions and evaluating each solution not only on its immediate impact on the storage system but also on its impact on the overall storage infrastructure.
NOEMI GREYZDORF is a research manager at IDC (www.idc.com).