Logical versus physical data views

Particularly in mixed LAN/NAS/SAN environments, a logical view of data may provide advantages over a physical view.

By Andreas May

As data volumes continue to grow exponentially, advanced storage management solutions have become increasingly important. Not only is capacity management a necessity, but so too is flexible data access. While network-attached storage (NAS) and storage area networks (SANs) are clear options for expanding LAN-based storage architectures, IT managers and integrators should be aware that the traditional storage management approach of linking data to physical resources may not effectively support and protect these new storage infrastructures.

Many LAN-based storage management products can adequately protect a limited number of devices in a single architecture (LAN, NAS, or SAN), but few enterprises rely on a single architecture for all their storage requirements. Far more typical are environments that encompass multiple storage architectures comprising multiple devices.

In these environments, traditional storage management software faces a challenge: because it maintains a separate physical view of each architecture, and of each device within that architecture, centralized, unified management of the entire storage environment is difficult, if not impossible. While this may be an acceptable limitation for smaller enterprises, it is not likely to be so for larger ones; the added costs and resources required to manage multiple storage environments do not make economic sense in complex environments where data access optimization is critical.

Logical vs. physical data views: Logical views of data enable administrators to quickly and easily search, identify, select, and instantly restore data from a central console.

With NAS, data on the network is re-centralized in a limited number of devices, rather than distributed across numerous servers, desktops, and tape devices. While this effectively minimizes the overhead associated with managing distributed resources, existing storage management software may not be able to leverage NAS capabilities because it is based on an architecture that tracks data physically.

Additionally, the software assumes all storage is directly attached to servers. And products that only offer a physical view of data may not be able to support evolving SANs. To deal with a storage pool that is shared by numerous servers, products that provide a physical view of storage require add-on modules. Yet, even with these add-ons, the SAN storage may not be incorporated into an overall view of the entire storage management environment. The same limitation still holds: the SAN has to be managed distinctly from each NAS device, which is managed separately from LAN-based storage.

To resolve these management challenges, a new storage management approach that views data logically, rather than physically, is required. Products that provide a logical view index, track, and manage data based on its attributes, not its location. As a result, enterprise data can be managed as a single resource, across all storage architectures, including LAN, NAS, and SAN.
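To make the idea of indexing by attribute concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's implementation: the `CatalogEntry` and `LogicalCatalog` names, the attribute set, and the location strings are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CatalogEntry:
    """One managed data object, described by attributes rather than by device."""
    name: str
    application: str
    owner: str
    modified: datetime
    location: str  # physical placement is just one more attribute (hypothetical format)


class LogicalCatalog:
    """Hypothetical index that answers queries by attribute, across architectures."""

    def __init__(self):
        self._entries = []

    def add(self, entry):
        self._entries.append(entry)

    def find(self, **criteria):
        """Return entries whose attributes equal every given criterion."""
        return [
            e for e in self._entries
            if all(getattr(e, key) == value for key, value in criteria.items())
        ]


catalog = LogicalCatalog()
catalog.add(CatalogEntry("q1.xls", "finance", "ana", datetime(2000, 4, 3), "lan:server7"))
catalog.add(CatalogEntry("q2.xls", "finance", "ana", datetime(2000, 4, 28), "nas:filer2"))
catalog.add(CatalogEntry("spec.doc", "engineering", "raj", datetime(2000, 4, 30), "san:pool1"))

# One query spans LAN and NAS because the lookup never mentions a device.
finance = catalog.find(application="finance")
print([(e.name, e.location) for e in finance])
```

The query for finance data returns entries held on both the LAN server and the NAS filer, which is the single-resource view described above.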

From a central console, administrators can quickly and easily search, identify, select, and instantly restore data, significantly reducing overhead costs and bandwidth requirements. Equally important, these products can improve data recovery speeds. Faster indexing, fault-tolerant services, and virtualization of shared storage make data access easier and quicker.

While these benefits are significant for any enterprise adding NAS devices to its existing LAN-based storage, they are particularly important in SAN environments. It is virtually impossible to track each file in a large storage pool in a server farm on the basis of its physical location, or even to present this information to administrators in a form that lets them manage it effectively. In fact, this approach can subvert the underlying principle of a SAN: to allow dynamic allocation of storage based on management objectives and to optimize data access.

By viewing data logically, whether it's stored in a SAN, NAS, or LAN, new management concepts can be applied. Data can be moved from one media type to another, for example, on a predetermined schedule or in response to specific events (e.g., when the number of changes to a file exceeds a certain number in a given time interval).
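The event-driven case, where a move is triggered once a file changes more than a set number of times in a given interval, could be sketched as a sliding-window counter. This is a hedged illustration: the `ChangeRateTrigger` class and its parameters are assumptions made for the example, and timestamps are plain seconds.

```python
from collections import deque


class ChangeRateTrigger:
    """Fires when more than `max_changes` changes land within `window_seconds`."""

    def __init__(self, max_changes, window_seconds):
        self.max_changes = max_changes
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record_change(self, timestamp):
        """Record a change at `timestamp` (seconds); return True if the trigger fires."""
        self._timestamps.append(timestamp)
        # Drop changes that have aged out of the sliding window.
        while timestamp - self._timestamps[0] > self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) > self.max_changes


# Hypothetical policy: more than 3 changes within 60 seconds triggers a move.
trigger = ChangeRateTrigger(max_changes=3, window_seconds=60)
fired = [trigger.record_change(t) for t in (0, 10, 20, 30, 200)]
print(fired)  # [False, False, False, True, False]
```

Only the fourth change, which arrives while the first three are still inside the window, fires the trigger; the late change at 200 seconds starts from an empty window.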

Also, enterprises are able to apply granular security policies based on data types, with tailored permissions granted according to specific parameters. By comparison, a physical view of data can restrict access only at the level of the smallest physical unit: a tape library, individual tape, or common storage zone, so that anyone with access to that physical entity can access all the data on it.

With a logical view, data can be grouped by type (e.g., by application or by time of day last accessed). Then, any management policy, no matter how abstract, can be applied to these data groups. Data created by application X after 4pm, for example, may be automatically moved from the LAN to NAS devices at 5pm for backup before 6pm.
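The application X example can be expressed as a small attribute-based policy. The sketch below is illustrative only: `DataItem`, `select_for_move`, and `run_policy` are hypothetical names, the "move" is simulated by changing a `tier` field, and the policy compares times of day without checking the date.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass
class DataItem:
    name: str
    application: str
    created: datetime
    tier: str  # "lan", "nas", or "san"


def select_for_move(items, application, created_after):
    """Pick LAN-resident items from one application created after a time of day."""
    return [
        item for item in items
        if item.application == application
        and item.created.time() > created_after
        and item.tier == "lan"
    ]


def run_policy(items, now, run_at=time(17, 0)):
    """At or after `run_at`, move matching items to NAS; return the names moved."""
    if now.time() < run_at:
        return []
    moved = select_for_move(items, application="X", created_after=time(16, 0))
    for item in moved:
        item.tier = "nas"  # stand-in for a real data mover
    return [item.name for item in moved]


items = [
    DataItem("a.dat", "X", datetime(2000, 5, 1, 15, 30), "lan"),
    DataItem("b.dat", "X", datetime(2000, 5, 1, 16, 20), "lan"),
    DataItem("c.dat", "Y", datetime(2000, 5, 1, 16, 45), "lan"),
]

too_early = run_policy(items, now=datetime(2000, 5, 1, 16, 50))
on_time = run_policy(items, now=datetime(2000, 5, 1, 17, 0))
print(too_early, on_time)  # [] ['b.dat']
```

The selection never refers to a server or volume; it depends only on the application and creation time, so the same policy would apply wherever the data happened to reside.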

In a SAN environment, this logical approach of viewing data enables enterprises to create point-in-time views of data and then to logically associate different versions of data. This capability may be useful, for example, when assembling all versions of a legal document for a conference among multiple individuals, all of whom need to view all versions simultaneously.

For these reasons, logically viewing data can be an advantageous approach to storage management in complex environments, and may be particularly beneficial in SANs. But not to be overlooked are the benefits of a logical perspective at the data movement level. A file in a typical hard drive is divided into non-sequential blocks and sectors that are tracked logically by the operating system. If data blocks need to be moved, the operating system and management software should be able to present the data logically.

Although block-level extended copy data movement is a significant benefit of SANs, it may not be fully leveraged when data is viewed physically. However, with a logical view of data, only those blocks that have changed in an incremental backup, for example, are moved to tape to minimize backup time, network bandwidth, and resources.
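One way to picture a block-level incremental backup is to compare per-block digests against those recorded at the previous backup and move only the blocks that differ. This is a sketch under stated assumptions, not how any particular product or the SAN extended copy facility works: the digest comparison, the `block_digests` and `changed_blocks` helpers, and the tiny 4-byte block size are all chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size, for illustration only


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous_digests, data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or differ."""
    changed = {}
    for index, digest in enumerate(block_digests(data, block_size)):
        if index >= len(previous_digests) or previous_digests[index] != digest:
            changed[index] = data[index * block_size:(index + 1) * block_size]
    return changed


full = b"AAAABBBBCCCCDDDD"          # state at the last full backup
baseline = block_digests(full)
modified = b"AAAAXXXXCCCCDDDDEEEE"  # block 1 rewritten, block 4 appended

delta = changed_blocks(baseline, modified)
print(delta)  # {1: b'XXXX', 4: b'EEEE'}
```

Only two of the five blocks would be written to tape. The sketch does not handle a file that shrinks; a fuller version would also record blocks that were removed.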

Equally important, this data movement is the key enabling technology for server-less backup. But without the ability to assemble data affected by block-level movement into a logical presentation, it is difficult to track data movement and to fully benefit from this "killer application."

Logically viewing data provides a range of potential benefits to leverage existing and evolving storage architectures. With a logical view, where data is stored (on LANs, NAS, SANs, or any combination of these architectures) doesn't affect how it is managed. As a result, resources devoted to storage management tasks are reduced, while data availability and access are optimized.

Andreas May is director of product management at CommVault Systems (www.commvault.com) in Oceanport, NJ.


This article was originally published on May 01, 2000