Entropy is the scientific principle that, left alone, organized systems drift toward disorder. In plain terms, everything eventually falls apart unless it’s periodically maintained. Any homeowner who has put off painting the house or weeding the garden has seen the principle in action, and IT is no exception.
In the IT environment, entropy could be defined as the degradation of a system under constant change (in workloads, resources, administration and so on) without the corresponding maintenance and upgrades required to keep it in tune. For the storage manager, it shows up as the quiet deterioration in performance or available capacity that seems to occur over time. This could be capacity that’s not consumed but is still unavailable, thanks to the day-to-day cycle of allocating, modifying, copying and deleting data. The wasted space could also be the result of inefficient provisioning processes, or simply duplicate data objects that are never deleted.
Storage entropy isn’t so much the result of poor technology as a function of human nature and the environment we work in. We forget things, we have too much to do and, given the tools at hand, it’s difficult to keep track of complex, virtual resources like storage, especially with turnover in personnel and a slower cycle of hardware replacement. Server virtualization can add to the problem by abstracting physical resources while generally increasing the number of server instances consuming them. So while data growth is driving storage purchases, entropy is making things worse.
What should IT do about this? Maybe nothing; after all, you don’t want to spend $10 to save $9, and perhaps not even to save $11. Historically, the incremental cost (especially the labor involved) of saving a TB of storage has often been greater than the cost of buying new capacity. But factors such as power, floor space and current economics may have changed that attitude, and with new technologies available, spending that $10 could end up saving $50 or $100. The systemic disorganization that defines entropy can come from a number of sources, and the methods available to address those causes are likewise varied. But each strategy must be weighed against the potential savings it can produce to see what kind of return on effort is likely.
Storage reclamation
Enterprise array management software has tools that can be used to identify wasted space. Some of this waste is just the result of inefficiency, as capacity is allocated to servers, returned to the storage pool and reallocated. Over time, this pool can become somewhat fragmented, especially if multiple storage systems are in use. Storage virtualization, at both the block and file level, can help consolidate the existing pool of usable storage into larger volumes to use with new applications.
Good storage resource management (SRM) tools can also identify storage that’s allocated but not used. These orphaned files or volumes could be assigned to servers or VMs that are no longer in use but whose capacity was never returned to the storage pool. Capacity could also be mapped to an invalid HBA or port WWN, or reserved for a future project that never materialized. In a perfect world these allocations would be recorded, but people get busy. They also leave the company. Turnover, mergers and acquisitions are another source of storage entropy, as consolidation brings new storage systems into the environment. When some of the “old” team doesn’t accompany the new storage assets, there’s a good chance orphaned data exists on those systems.
Some independent SRM tools can analyze the relationships between hosts and disk groups, LUNs, clones, mirrors, snaps and so on to identify mapping or masking errors and other conditions that cause files to be lost. They can also highlight changes in the environment that lead to lost capacity. Because they’re vendor-independent monitoring solutions, they can operate across the entire environment, not just within a single array or arrays from one manufacturer. Like rust, storage entropy never sleeps, so tools that specialize in spotting these conditions should be run as part of the regular preventive maintenance schedule.
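The cross-checking these tools perform can be pictured with a simple sketch. The Python example below uses invented LUN and WWN values, not output from any real array CLI or API; it flags LUNs whose masking records point at initiator WWNs that no live host reports, a common signature of orphaned capacity.

```python
# Hypothetical sketch: flag LUNs mapped to initiator WWNs that no known host reports.
# The inventory data here is invented for illustration; in practice it would come
# from the array's management interface and from host or fabric inventory tools.

# LUN masking records as an array might report them: LUN id -> initiator WWN
lun_masking = {
    "lun-0041": "50:01:43:80:11:22:33:44",
    "lun-0042": "50:01:43:80:55:66:77:88",   # host decommissioned last year
    "lun-0043": "50:01:43:80:11:22:33:44",
}

# Initiator WWNs currently seen on live, managed hosts
active_host_wwns = {"50:01:43:80:11:22:33:44"}

orphan_candidates = [
    lun for lun, wwn in lun_masking.items() if wwn not in active_host_wwns
]

print("LUNs mapped to unknown or retired initiators:", orphan_candidates)
# -> ['lun-0042']  Candidates for review and, if confirmed, reclamation.
```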
Virtual servers
Compared with physical servers, a virtual environment can make storage entropy worse. With virtual servers there are no boxes, no support contracts, no physical reminders of their existence or their storage consumption. Even when VMs are set up, documented and properly managed, they can still cause storage entropy. Through the normal cycle of allocation, expansion and decommissioning, VM storage resources can fall out of sync with current requirements. The resources that support a virtual server environment must be balanced regularly to keep operations cost-efficient and optimized.
There are tools available from server virtualization platform vendors, as well as independent monitoring tools, that can help with this resource balancing. Some even manage a VM through its lifecycle, optimizing resources along the way. Like the other SRM tools, virtual infrastructure management tools that are platform-independent can offer unique functionality.
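As a rough illustration of the kind of check these tools automate, the sketch below compares provisioned virtual disk capacity with what each guest actually uses and flags under-utilized VMs. The VM names, sizes and 25% threshold are made up for the example; real figures would come from the hypervisor’s management interface or an SRM tool.

```python
# Illustrative only: flag VMs whose provisioned virtual disks are far larger than
# what the guest actually uses. All names and numbers are invented.

vm_storage_gb = [
    # (vm name, provisioned GB, used GB)
    ("web-01",   200,  35),
    ("build-07", 500, 480),
    ("old-test", 300,  12),   # likely a forgotten VM
]

UTILIZATION_FLOOR = 0.25      # flag anything under 25% utilized (arbitrary threshold)

for name, provisioned, used in vm_storage_gb:
    utilization = used / provisioned
    if utilization < UTILIZATION_FLOOR:
        reclaimable = provisioned - used
        print(f"{name}: {utilization:.0%} used, ~{reclaimable} GB candidate for right-sizing")
```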
Thin provisioning
In addition to reclamation tools that identify wasted capacity, some arrays and separate appliances also include thin provisioning. Thin provisioning was originally developed as a solution to the over-allocation that resulted from an OS’s or application’s inability to handle volume expansion easily. Rather than wrestle with the downtime and complexity of expanding a database or file system, administrators would initially set up volumes with enough storage to support expected future growth. This saved time and assured the availability of storage, but it resulted in a significant amount of “white space,” or unused capacity.
Thin provisioning alleviated this problem by allowing capacity to be over-subscribed to these applications without actually being allocated by the storage system. Much like a bank that has more deposits on the books than cash on hand, a thin-provisioned storage system can support more applications than its physical capacity could satisfy if every volume were filled at once. The trick is to keep real capacity ahead of actual usage by the applications and file systems. Now available from a number of vendors, thin provisioning provides a way to keep one type of storage entropy in check, but only on newly created volumes. Data migrated in from storage that’s not “thin” presents a potential problem.
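The bank analogy boils down to two numbers: how much capacity has been promised and how much has actually been written. The minimal model below (all volume names and figures are hypothetical, and the 80% alert point is arbitrary) shows the over-subscription ratio and physical utilization a storage manager would watch.

```python
# A minimal model of thin provisioning's "bank" analogy: volumes are subscribed
# for more capacity than the pool physically holds, and what matters is keeping
# physical usage below a safety threshold. All numbers are hypothetical.

physical_capacity_tb = 100

# Thin volumes: subscribed (promised) size vs. what applications have written so far
volumes = {
    "erp-db":       {"subscribed_tb": 60, "written_tb": 22},
    "file-share":   {"subscribed_tb": 80, "written_tb": 31},
    "vm-datastore": {"subscribed_tb": 70, "written_tb": 18},
}

subscribed = sum(v["subscribed_tb"] for v in volumes.values())
written    = sum(v["written_tb"] for v in volumes.values())

oversubscription = subscribed / physical_capacity_tb   # 2.1:1 in this example
utilization      = written / physical_capacity_tb      # 71% of the real disk

print(f"Over-subscription ratio: {oversubscription:.1f}:1")
print(f"Physical utilization: {utilization:.0%}")

if utilization > 0.80:   # arbitrary alert point; real pools warn well before 100%
    print("Time to add physical capacity before the pool runs dry")
```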
Historically, when files were deleted, the file system simply marked them as “available to be overwritten.” The storage array had no awareness of this and still counted the capacity as “used,” so there was no way to reclaim the wasted space. As data sets are created, modified and deleted, entropy sets in, making thin volumes “chubby” over time. And “thick” volumes migrated into a thin-aware storage system don’t become thin on their own.
New developments in thin provisioning have addressed this issue. When files are deleted, a utility can be run that writes zeros over the blocks those files occupied. Then, when the volume is copied into an array with “zero block detection,” the all-zero blocks are stripped out and a thin volume is created. For many environments this adds extra maintenance steps, but the effort can recapture wasted space and save money.
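Conceptually, zero block detection on ingest is straightforward: read the source image in fixed-size blocks and copy only the blocks that contain non-zero data. The sketch below illustrates the idea; the 4 KB block size and file name are assumptions, not any particular vendor’s implementation.

```python
# Sketch of zero block detection on ingest: read a source volume image in fixed-size
# blocks and keep only blocks containing non-zero data, so the target volume stays thin.

BLOCK_SIZE = 4 * 1024  # bytes per block on the (hypothetical) target array

def migrate_thin(source_path: str) -> dict:
    """Return a sparse map of block number -> data, skipping all-zero blocks."""
    zero_block = bytes(BLOCK_SIZE)
    thin_volume = {}
    with open(source_path, "rb") as src:
        block_no = 0
        while True:
            block = src.read(BLOCK_SIZE)
            if not block:
                break
            # Pad a short final block so the comparison is apples-to-apples
            if len(block) < BLOCK_SIZE:
                block = block.ljust(BLOCK_SIZE, b"\x00")
            if block != zero_block:          # keep only blocks with real data
                thin_volume[block_no] = block
            block_no += 1
    return thin_volume

# Usage: thin = migrate_thin("old_thick_volume.img")
# Blocks zeroed by the pre-migration utility simply never make it to the new array.
```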
As an improvement on this technique, thin-aware file systems can zero out the blocks from deleted files automatically and then communicate with the storage array to identify those blocks for reclamation. Using a common API, this automated process keeps thin volumes thin and reduces storage entropy without generating more work for storage managers. While a number of storage vendors support zero block detection, only a small subset has developed the APIs to support a thin-aware file system.
Different storage vendors use different block sizes, and smaller blocks generally mean more effective zero-block detection and greater space savings. The processing required to scan blocks for zeros can also affect performance, depending on where it’s carried out: some vendors do it in software, putting the load on the array CPU, while others use ASICs dedicated to the purpose.
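A toy example shows why granularity matters. In the sketch below, the same four small writes scattered across a 1 MB region leave plenty of reclaimable space at a 4 KB block size and none at 256 KB, because each large block is “dirtied” by a single small write. The layout is invented purely to illustrate the arithmetic.

```python
# Toy illustration: the same 1 MB region with four scattered 4 KB writes yields very
# different reclaimable space at 4 KB vs. 256 KB detection granularity.

REGION = 1024 * 1024                     # 1 MB region of a volume, in bytes
written_offsets_kb = [0, 260, 520, 780]  # four 4 KB writes, one landing in each 256 KB block

def zero_blocks(block_kb: int) -> int:
    """Count blocks of the given size that contain no written data."""
    block_bytes = block_kb * 1024
    total_blocks = REGION // block_bytes
    dirty = {off * 1024 // block_bytes for off in written_offsets_kb}
    return total_blocks - len(dirty)

for size_kb in (4, 256):
    reclaim_kb = zero_blocks(size_kb) * size_kb
    print(f"{size_kb:>3} KB blocks: {reclaim_kb} KB detected as zero and reclaimable")
# 4 KB blocks reclaim 1008 KB; 256 KB blocks reclaim nothing, because every
# large block contains at least one small write.
```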
Data deduplication
Another way entropy creeps into storage is by the creation of duplicate files and data sets. Backups can be the cause of this, as can other data protection processes, such as disaster recovery. Virtual machine images can contain a significant amount of duplicate data, since they’re usually created from the same group of templates. User applications, such as MS Office, also generate files that are very similar in content to files created by other users.
Data deduplication can certainly reduce the amount of duplicate data in the backup and DR processes, and keep those data sets lean as subsequent backups run. Deduplication technologies are now also finding their way into non-backup applications, such as archives and secondary storage tiers, helping to reclaim this wasted capacity.
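At its core, content-based deduplication splits incoming data into chunks, hashes each chunk and stores only the chunks it hasn’t seen before; later copies of the same data just reference the stored chunks. The sketch below shows the idea with fixed-size chunks and made-up data; real products use far more sophisticated, often variable-length, chunking.

```python
# Bare-bones content-based deduplication: split data into chunks, hash each chunk,
# and physically store only chunks whose hash hasn't been seen before.

import hashlib

CHUNK_SIZE = 8 * 1024
chunk_store = {}          # hash -> chunk data (stored once)

def write_dedup(data: bytes) -> list:
    """Return the list of chunk hashes (a 'recipe') that reconstructs the data."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in chunk_store:      # new data: store it
            chunk_store[digest] = chunk
        recipe.append(digest)              # duplicate data: just reference it
    return recipe

# Two "backups" that are roughly 90% identical mostly resolve to the same stored chunks
full_backup = b"A" * 80_000 + b"B" * 8_000
next_backup = b"A" * 80_000 + b"C" * 8_000
write_dedup(full_backup)
write_dedup(next_backup)

stored_bytes = sum(len(c) for c in chunk_store.values())
print(f"Logical data: {len(full_backup) + len(next_backup)} bytes, "
      f"physically stored: {stored_bytes} bytes in {len(chunk_store)} unique chunks")
```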
Virtualization systems
Identifying and locating these orphaned files or pockets of unusable storage is the first step, but that capacity must also be collected before it can be reused. Proprietary arrays typically have volume management utilities and other tools to handle this second part of the process. But if multiple vendors’ storage platforms are in use, or if additional functionality (such as thin provisioning or deduplication) is needed, a block- or file-based storage virtualization appliance may be the answer. These “storage-agnostic” solutions can reduce entropy while they facilitate the consolidation of reclaimed capacity across platforms.
Entropy is all too common in IT. In storage systems it shows up as reduced usable capacity and performance over time, as systems are added, reconfigured and decommissioned to meet the dynamics of day-to-day operation. Inefficient storage allocation, reserved capacity that’s lost or forgotten, and duplicated data objects can accumulate even in the most organized environments. Storage arrays have tools that can help curb this waste, but other methods must often be employed as well. Technologies such as thin provisioning, data deduplication and advanced storage infrastructure management solutions can help address the problem. In the end, storage entropy will still occur, but the proper tools and a little preventive maintenance should help keep things from going to pieces.