Start your metadata engines!

In the lingua franca of storage area networking, server-less backup is the "killer app," but coordinated, concurrent data sharing is the "Holy Grail."

By John Webster

Data sharing was a popular discussion topic a few years ago, but it lost cachet when it became all too obvious that monumental barriers stood in the way of sharing data among heterogeneous hosts, switches, and storage subsystems. However, the current maturation and acceptance of SAN technology gives us a previously missing framework for open, heterogeneous data sharing. SANs are here and now, and it's time to put the data-sharing goal back into circulation.

In addition to the SAN framework, getting to heterogeneous data sharing will require the creation of SAN file systems that are accessible by all applications, regardless of what host or operating system they run on. SAN file systems define how data is stored, and they contain the rules by which any host can access, retrieve, and manipulate the data stored on any SAN-attached device. As such, SAN file systems are critical enablers of heterogeneous data sharing.

The most common and well-developed implementation, the metadata appliance, locates the metadata engine in a dedicated server attached to the SAN and the existing IP network.

In this first installment of a series of articles, we look at SAN file-system metadata and the engines that use metadata to control application data access.

Data about data

As a first step to sharing data among multiple hosts in a SAN environment, logical unit number (LUN) masking techniques have been used to create logical partitions within networked storage pools.

LUN masking allows applications sharing resources on the SAN to "see" only the disk volumes, file systems, and files assigned to them. This supports sharing a physical infrastructure and management among various hosts and applications, but does not allow for simultaneous multi-host access to files and the data within them.
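
The partitioning that LUN masking provides can be pictured as a simple visibility table. The following is a minimal sketch, not any vendor's implementation; the class name, host names, and LUN numbers are all hypothetical.

```python
from typing import Dict, Set


class LunMaskTable:
    """Toy LUN-masking table: each host sees only the LUNs assigned to it."""

    def __init__(self) -> None:
        # host name -> set of LUN numbers that host is allowed to see
        self._visible: Dict[str, Set[int]] = {}

    def assign(self, host: str, lun: int) -> None:
        # Make one LUN visible to one host.
        self._visible.setdefault(host, set()).add(lun)

    def can_see(self, host: str, lun: int) -> bool:
        # Anything not explicitly assigned is masked (hidden).
        return lun in self._visible.get(host, set())

    def hosts_for(self, lun: int) -> Set[str]:
        # All hosts that have been given visibility of this LUN.
        return {h for h, luns in self._visible.items() if lun in luns}


# Hypothetical hosts sharing one storage pool, each with its own partition.
masks = LunMaskTable()
masks.assign("nt-server", 0)
masks.assign("unix-server", 1)
```

The table only answers "who may see what." It carries no notion of who is using a volume at a given moment, which is why masking by itself cannot coordinate simultaneous access.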

To allow simultaneous multi-host access, we need a more sophisticated facility for locking files, records, or data blocks while they are being accessed by an application, and unlocking them afterwards when they become free. Metadata engines answer the call, underpinning the creation of a SAN-wide file system.
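
A lock-and-unlock facility of this kind can be sketched as a table of exclusive locks. This is an illustration under simplifying assumptions (one exclusive lock per object, no timeouts, no shared read locks); the names `LockManager`, `acquire`, and `release` are hypothetical and not taken from any product.

```python
from typing import Dict, Optional


class LockManager:
    """Toy exclusive-lock table keyed by object name (file, record, or block range)."""

    def __init__(self) -> None:
        # object name -> host currently holding the lock
        self._owner: Dict[str, str] = {}

    def acquire(self, obj: str, host: str) -> bool:
        # Grant the lock only if the object is free or already held by this host.
        holder = self._owner.get(obj)
        if holder is None or holder == host:
            self._owner[obj] = host
            return True
        return False

    def release(self, obj: str, host: str) -> bool:
        # Only the current holder may unlock the object.
        if self._owner.get(obj) == host:
            del self._owner[obj]
            return True
        return False

    def holder(self, obj: str) -> Optional[str]:
        return self._owner.get(obj)


locks = LockManager()
first = locks.acquire("/shared/video.mov", "host-a")   # granted
second = locks.acquire("/shared/video.mov", "host-b")  # refused while host-a holds it
locks.release("/shared/video.mov", "host-a")
third = locks.acquire("/shared/video.mov", "host-b")   # granted after the release
```

A real engine would also have to handle hosts that crash while holding a lock, which this sketch ignores.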

In the cluster file-system approach, a virtual "metadata appliance" can be embedded into an operating system's clustering technology.

Metadata engines are aware of two types of data: actual user or application data and information that describes the structure and state of the file system at any given point. The information describing the file system (data about data, if you will) is called "metadata." While not all metadata engines are equal, they generally create and maintain:

  • An inventory of all stored objects (files, databases, digitized images, etc.) that are to be made visible to a heterogeneous set of users and applications.
  • A set of interrelationships between the hosts, users, applications, and stored objects. The relationships among these four entities include both security information and concurrency control information (primarily locks).
  • A repository for policy information used to control placement of files under control of the metadata engine.
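
The three kinds of metadata in the list above might be held together in a structure like the one below. This is only a sketch: the field names, the size-based placement rule, and the device names are assumptions made for illustration, and real engines differ in what they track.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class StoredObject:
    name: str
    size_bytes: int
    device: str  # SAN device chosen by the placement policy


@dataclass
class MetadataCatalog:
    # 1. Inventory of stored objects made visible to hosts and applications.
    inventory: Dict[str, StoredObject] = field(default_factory=dict)
    # 2. Relationships: which hosts may access an object (security) ...
    allowed_hosts: Dict[str, Set[str]] = field(default_factory=dict)
    #    ... and which host currently holds its lock (concurrency control).
    lock_holder: Dict[str, str] = field(default_factory=dict)
    # 3. Policy repository: ordered (maximum size, device) placement rules.
    placement_rules: List[Tuple[float, str]] = field(default_factory=list)

    def place(self, size_bytes: int) -> Optional[str]:
        # The first rule whose size limit fits the object decides the device.
        for max_size, device in self.placement_rules:
            if size_bytes <= max_size:
                return device
        return None

    def register(self, name: str, size_bytes: int, hosts: Set[str]) -> StoredObject:
        device = self.place(size_bytes)
        if device is None:
            raise ValueError(f"no placement rule matches {name}")
        obj = StoredObject(name, size_bytes, device)
        self.inventory[name] = obj
        self.allowed_hosts[name] = set(hosts)
        return obj

    def may_access(self, name: str, host: str) -> bool:
        return host in self.allowed_hosts.get(name, set())


# Hypothetical policy: small objects on a fast array, everything else on bulk storage.
catalog = MetadataCatalog(
    placement_rules=[(1_000_000, "fast-array"), (float("inf"), "bulk-array")]
)
doc = catalog.register("report.doc", 50_000, {"nt-server", "unix-server"})
scan = catalog.register("scan.tif", 80_000_000, {"unix-server"})
```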

In current practice, metadata engines also make a distinction between the protocols used to transfer data (SCSI command set, usually running over a Fibre Channel physical layer) and metadata (usually running over a separate network, such as a TCP/IP LAN). Both data and control traffic could flow over the Fibre Channel SAN infrastructure, but most implementations split the data path from the metadata-based control path.

Four engine models

The four methods of implementing metadata engines are each at a different stage of development.

Metadata appliances

To accomplish SAN data sharing, the metadata engine implemented in a metadata appliance receives I/O requests to open a file from an application, permits access to the file while temporarily locking out others, and returns the file to an unlocked status when the I/O completes. All communication between the host application and the SAN regarding file access is typically passed over the IP network, while the actual data requested by the applications moves over the SAN.

Appliance-based metadata engines function as active participants in the SAN. However, they do not actually service any I/Os. Rather, they manage the path of an I/O from application to storage devices and back again through the SAN infrastructure.
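
The division of labor described here, with the appliance granting access over the control path while the host moves the data itself, can be sketched as follows. The extent map, the class names, and the in-memory "devices" are hypothetical stand-ins; no real SANergy interface is being shown.

```python
from typing import Dict, List, Optional, Tuple

Extent = Tuple[str, int, int]  # (device name, start block, block count)


class SanDevice:
    """Stand-in for a SAN-attached disk: block number -> block contents."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}


class MetadataAppliance:
    """Control path only: hands out locks and extent maps, never file data."""

    def __init__(self, extent_map: Dict[str, List[Extent]]) -> None:
        self._extents = extent_map
        self._locked_by: Dict[str, str] = {}

    def open(self, path: str, host: str) -> Optional[List[Extent]]:
        extents = self._extents[path]
        holder = self._locked_by.get(path)
        if holder is not None and holder != host:
            return None  # another host has the file open
        self._locked_by[path] = host
        return extents

    def close(self, path: str, host: str) -> None:
        # Return the file to unlocked status once the I/O completes.
        if self._locked_by.get(path) == host:
            del self._locked_by[path]


def read_file(appliance: MetadataAppliance, devices: Dict[str, SanDevice],
              path: str, host: str) -> Optional[bytes]:
    extents = appliance.open(path, host)  # control path (the IP network in the text)
    if extents is None:
        return None
    try:
        chunks = []
        for device, start, count in extents:  # data path (the SAN)
            for block in range(start, start + count):
                chunks.append(devices[device].blocks[block])
        return b"".join(chunks)
    finally:
        appliance.close(path, host)


devices = {"array-1": SanDevice()}
devices["array-1"].blocks[10] = b"meta"
devices["array-1"].blocks[11] = b"data"
appliance = MetadataAppliance({"/shared/a.txt": [("array-1", 10, 2)]})

contents = read_file(appliance, devices, "/shared/a.txt", "host-a")
appliance.open("/shared/a.txt", "host-b")  # host-b now holds the file
blocked = read_file(appliance, devices, "/shared/a.txt", "host-a")
```

Note that `MetadataAppliance` never reads `SanDevice.blocks`; only the host-side `read_file` does, which mirrors the point that the appliance manages the I/O path without servicing the I/O.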

Current incarnations include Tivoli's SANergy version of the metadata appliance model. SANergy was released originally by Mercury Computer Systems in 1998 for SAN-based data sharing in streaming environments.

Cluster file systems

Though most "clusters" historically found in Unix and Windows environments are really just availability boosters based on simple "failover" techniques, more sophisticated "concurrent sharing" implementations have emerged. Coupled with back-end SAN storage, cluster file systems (CFS) create a true SAN file system.

Left: A smart switch combines a SAN fabric switch with a metadata engine. Right: The fully distributed approach places multiple instances or components of the metadata engine around the SAN fabric, either on multiple servers or even within infrastructure devices such as host bus adapters and intelligent array controllers.

Examples include Compaq's TruCluster CFS and SGI's Irix CXFS. The key limitation is that because clusters are almost always homogeneous, the any-host-can-access-any-data generality associated with SAN technology is often unavailable. Veritas' emerging SANPoint product line will, however, extend the CFS approach on a heterogeneous basis.

Smart switches

Smart switches use the SAN fabric both to transfer data and to communicate metadata control messages. Locking and coordination mechanisms are implemented by the SAN fabric itself, typically under the control of a dedicated processor within, or attached directly to, the switch. In addition, the metadata engine may include a large local memory that can be used as an onboard cache, allowing the switch-attached processor to service some I/Os directly.
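
How an onboard cache lets the switch-attached processor answer some reads itself can be sketched with a small read-through cache. The least-recently-used eviction policy, the class name, and the `backend_read` callback are assumptions for illustration; the article does not say how any product manages its cache, and write coherence is not modeled here.

```python
from collections import OrderedDict
from typing import Callable, Tuple

BlockKey = Tuple[str, int]  # (device name, block number)


class SwitchCache:
    """Toy read-through block cache with least-recently-used eviction."""

    def __init__(self, capacity: int, backend_read: Callable[[str, int], bytes]) -> None:
        self._capacity = capacity
        self._backend_read = backend_read  # forwards the read to the storage device
        self._cache: "OrderedDict[BlockKey, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, device: str, block: int) -> bytes:
        key = (device, block)
        if key in self._cache:
            # Serviced directly by the switch-attached processor.
            self._cache.move_to_end(key)
            self.hits += 1
            return self._cache[key]
        # Not cached: pass the I/O through to the storage device.
        self.misses += 1
        data = self._backend_read(device, block)
        self._cache[key] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict the least recently used block
        return data


def fake_device_read(device: str, block: int) -> bytes:
    # Hypothetical stand-in for a real read from a SAN-attached array.
    return f"{device}:{block}".encode()


cache = SwitchCache(capacity=2, backend_read=fake_device_read)
for block in (1, 1, 2, 3, 1):
    cache.read("array-1", block)
```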

Gadzoox Networks (Axxess) and DataCore Software (SANsymphony) have teamed up to introduce the first intelligent switch.

Distributed option

The application of a truly distributed processing model to SAN-based metadata remains in its infancy. In this approach, metadata, including the awareness of locking mechanisms, is distributed across multiple SAN components. This can be as basic as spreading the metadata coordination across multiple cluster hosts in a CFS configuration, or as radical as distributing coordination responsibilities and intelligence across application hosts, controllers, switches, and storage subsystems.
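
One simple way to spread lock coordination across several components is to hash each object's name to pick the node that owns its lock. This is a sketch of that general idea only, with hypothetical node and host names; it is not a description of how any CFS or vendor product distributes metadata.

```python
import hashlib
from typing import Dict, List


def owner_node(obj: str, nodes: List[str]) -> str:
    """Pick the node responsible for an object's lock by hashing its name."""
    digest = hashlib.sha256(obj.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class DistributedLocks:
    """Each node keeps a lock table only for the objects that hash to it."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes = list(nodes)
        self.tables: Dict[str, Dict[str, str]] = {n: {} for n in self.nodes}

    def acquire(self, obj: str, host: str) -> bool:
        table = self.tables[owner_node(obj, self.nodes)]
        holder = table.get(obj)
        if holder is not None and holder != host:
            return False  # held by another host
        table[obj] = host
        return True

    def release(self, obj: str, host: str) -> bool:
        table = self.tables[owner_node(obj, self.nodes)]
        if table.get(obj) == host:
            del table[obj]
            return True
        return False


cluster = DistributedLocks(["node-1", "node-2", "node-3"])
granted = cluster.acquire("/shared/db.dat", "host-a")
denied = cluster.acquire("/shared/db.dat", "host-b")
owner = owner_node("/shared/db.dat", cluster.nodes)
```

Plain modulo hashing reassigns most objects whenever a node joins or leaves, and the sketch says nothing about recovering locks held on a failed node. Both gaps hint at why the distributed model is the hardest to implement.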

The distributed CFS approach will likely be available in the next generation of CFS products, but the everything-distributed model is more distant. Many decisions have yet to be made about what protocols will be used or how cache could be implemented. The promise of the distributed model lies in its potential to be the most scalable and most recoverable of the various implementations. However, it is also by far the most complex model to implement.

Tricord Systems is developing an implementation of the distributed model, but hasn't made any product announcements yet.

Once around the track

Using metadata engines to manage I/O makes the LUN masking and zoning techniques common in most of today's SAN implementations appear quite primitive. Yet, it has made good sense to have some simple, practical results to show along the way to SAN's Holy Grail. Some metadata engines and SAN file systems have now inched their way into production, but most of the emerging players have yet to even make it to the test track.

The winners in SAN data sharing will be those who make the most efficient use of their metadata engines. The models they use must exhibit low latency, high performance, and high resilience. Ladies and gentlemen, start your engines!

John Webster is a senior analyst with Illuminata (www.illuminata.com), a research and consulting firm in Nashua, NH.


This article was originally published on November 01, 2000