File systems deliver on the promise of SANs

SAN file systems can help you manage and share existing storage resources and better equip you to handle future growth.

By Paul Rutherford

Many IT managers today find themselves struggling to efficiently manage "islands of storage."

A solution to the management problem—one that enables IT managers to integrate these islands and thus realize the true promise of storage area networks (SANs)—may lie with SAN file systems. So, what is a SAN file system and how does it compare to other technologies, such as network-attached storage (NAS) or SAN virtualization, in bridging this gap?

What is a SAN file system?

The reality is that not much has changed since the earliest operating systems. Applications and users need to organize, access, and share data on mass storage. Since tracking disk sectors is clearly not an efficient use of human skills, early operating systems abstracted disk storage into directories and files through the use of file systems.

File systems also control access to files, so that data integrity is maintained. All operating systems provide a file system. In fact, most provide several file systems. There are local file systems; network file systems that allow users to share files across the network (NAS); and special-purpose file systems such as those designed for SANs. Operating systems allow these different types of file systems to co-exist as peers.

A SAN file system exploits storage networks to allow multiple servers from different vendors to share common storage with a single file system at wire speeds. It is similar to NAS in its ability to let heterogeneous systems share data, but it is different in that there is no computer between the servers and the disk.

The SAN file system consists of two primary components. First, each server on the SAN contains a portion of the SAN file system that is specifically designed for its operating system. This component is often referred to as a "client." The activity of the clients must be coordinated since they are sharing a single storage resource. This responsibility falls to the second component, the metadata server, which is the traffic controller and resource allocator for the system.

Clients go to the metadata server for operations that must be coordinated across the SAN (e.g., creating or writing a file). When it comes to actually accessing data, the clients are independent of the metadata server so that they can take full advantage of their own I/O performance. It is this independence, together with the direct connection over the SAN, that allows SAN file systems to provide wire-speed performance.
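The division of labor described above can be illustrated with a short sketch. It is a toy model, not the design of any particular product: the class names, the block-list layout, and the in-memory "disk" are all assumptions made for illustration. The point it shows is that clients ask the metadata server only for coordination (which blocks belong to which file) and then move the data themselves.

```python
class MetadataServer:
    """Hypothetical coordinator: tracks names and block allocation only.

    It never reads or writes file data.
    """

    def __init__(self, total_blocks):
        self.free_blocks = list(range(total_blocks))
        self.files = {}  # file name -> list of block numbers

    def create(self, name, num_blocks):
        # Coordinated operation: every client must see one consistent answer.
        if name in self.files:
            raise FileExistsError(name)
        if num_blocks > len(self.free_blocks):
            raise OSError("shared pool is out of space")
        blocks = self.free_blocks[:num_blocks]
        del self.free_blocks[:num_blocks]
        self.files[name] = blocks
        return blocks

    def lookup(self, name):
        return list(self.files[name])


class SharedDisk:
    """Stands in for SAN storage that every client reaches directly."""

    def __init__(self, total_blocks):
        self.blocks = [b""] * total_blocks

    def write_block(self, number, data):
        self.blocks[number] = data

    def read_block(self, number):
        return self.blocks[number]


class Client:
    """Hypothetical per-server component of the SAN file system."""

    def __init__(self, platform, metadata_server, disk):
        self.platform = platform
        self.metadata_server = metadata_server
        self.disk = disk

    def write_file(self, name, chunks):
        # Step 1: coordination goes through the metadata server.
        blocks = self.metadata_server.create(name, len(chunks))
        # Step 2: data goes straight to the shared disk.
        for block, chunk in zip(blocks, chunks):
            self.disk.write_block(block, chunk)

    def read_file(self, name):
        blocks = self.metadata_server.lookup(name)
        return b"".join(self.disk.read_block(b) for b in blocks)


# Two clients on different platforms share one pool and one namespace.
mds = MetadataServer(total_blocks=8)
disk = SharedDisk(total_blocks=8)
solaris = Client("solaris", mds, disk)
windows = Client("windows", mds, disk)

solaris.write_file("frame001", [b"abc", b"def"])
data = windows.read_file("frame001")  # read by a different platform
```

In this sketch the metadata server handles only small bookkeeping requests, which is why, in the arrangement the article describes, data throughput is bounded by each client's own I/O path rather than by a server sitting in the middle.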

Today's SAN file systems support a wide range of operating systems, which allows data in a single SAN pool to be shared among multiple operating systems. This removes the dependence of the data on the server platform. Broad platform support is vital because a component of the file system must reside on every server, and most enterprises run more than one server platform.

Why implement a SAN file system?

Because they allow heterogeneous SAN file sharing and data consolidation, SAN file systems can help IT managers realize the efficiency and cost benefits of SANs by:

  • Eliminating redundant copies of data;
  • Making it possible to share data among heterogeneous platforms;
  • Maintaining data availability, even if the originating computer is unavailable;
  • Improving system manageability;
  • Allowing independent scaling of network components;
  • Eliminating the need to choose between shared access and performance; and
  • Achieving higher utilization rates for storage resources.

With only one copy of the data to maintain and centralized disk storage, data management is greatly simplified. The distributed structure of SAN file systems allows users to independently add storage, bandwidth, and servers. (In fact, the type and manufacturer of the servers can change daily to meet business needs.) Utilization is optimized since there is a single pool of storage from which all access occurs.

In short, SAN file systems can help you manage more data on existing storage, as well as better equip you to handle future storage growth.

Virtualization (left) allows servers to share the same physical storage device, but does not allow servers to access/use each other's data or free space. A SAN file system (right) allows servers to concurrently share and use each other's data without provisioning.

SAN file systems in action

The earliest SAN adopters were primarily users looking to migrate from direct-attached storage (DAS). These users were part of workgroups that depended on multi-step workflow processes with large data files (e.g., rich media).

But the value of SAN file systems is not limited to special-purpose applications. The benefits of file sharing (e.g., better data availability, optimized storage utilization, and easier management) are compelling for any data-intensive IT environment. Almost all enterprises share files at multiple levels or functional areas of the organization. The potential applications are as numerous as the businesses themselves.

SAN file systems and virtualization

A fair question to ask is: How does a SAN file system differ from virtualization? After all, don't they both enable storage sharing? In reality, SAN virtualization tools partition and provision storage, but they don't allow storage to be shared.

In all operating systems, there is an optional layer of software between the file system and the physical disks. This "logical volume manager" layer is responsible for creating virtual disk volumes with specific characteristics (e.g., mirrors or disks of varying sizes).

SAN virtualization allows users to pool SAN storage and then break it up according to specific user needs. However, it does not create virtual disks that can be shared among heterogeneous platforms.

For example, a virtual disk volume created for a Windows system cannot be accessed by a Sun Solaris server since the data on the disk is in the Microsoft file-system language (see figure). SAN file systems, on the other hand, allow heterogeneous servers to access each other's data and to use the same storage devices without allocation restrictions. They provide efficient use of storage resources and allow servers to access that storage at varying performance rates.
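The practical difference between provisioning and sharing shows up in how free space is used. The sketch below is a simplified model with made-up capacities; the function names and the numbers are assumptions for illustration only. Under provisioning, a server can use only the space in its own volume; with a shared pool, any server can use whatever space remains anywhere.

```python
def fits_partitioned(volumes, server, size):
    """Provisioned volumes: a write must fit in the server's own volume."""
    capacity, used = volumes[server]
    return used + size <= capacity


def fits_shared(volumes, size):
    """Shared pool: a write must fit in the total remaining space."""
    total_capacity = sum(capacity for capacity, _ in volumes.values())
    total_used = sum(used for _, used in volumes.values())
    return total_used + size <= total_capacity


# Hypothetical figures in GB: server -> (capacity, used)
volumes = {"windows": (100, 95), "solaris": (100, 20)}

# The Windows server needs 30GB more.
partitioned_ok = fits_partitioned(volumes, "windows", 30)
shared_ok = fits_shared(volumes, 30)
```

With these figures the request fails under provisioning, because the Windows volume has only 5GB left, even though 85GB sits idle across the pool. The same request succeeds when the storage is treated as one shared pool.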

SAN/NAS convergence

Further confusing end users are virtualization products that claim to support both NAS and SAN (i.e., that support both file- and block-level data). In these systems, the block-level interface presents virtual disks to the SAN, provisioning the disks to specific servers and their respective file systems. A NAS appliance (configured with its own file system) serves as the file-level interface. The file system provisions storage in a manner similar to the other servers running on the SAN. There is nothing in this solution that allows the internal NAS system to share data with other servers on the SAN. Again, the storage is partitioned and provisioned. It is not shared.

True SAN/NAS convergence occurs only when there is a single copy of data that can be accessed by heterogeneous servers over the SAN and by clients over a TCP network through a NAS gateway to the SAN. Everyone needs shared access to the same stored data, which means that both the NAS gateway and the other servers on the SAN must be running a SAN file system.

With a SAN file system, the system no longer depends strictly on any one physical computer (i.e., server or appliance). Data becomes a resource that is available to the applications that need it—at whatever performance rate they can sustain. The entire island of storage is managed by a single, common system.

The future

SAN file systems have the potential to create opportunities to add data management features that can enhance the way that IT administrators use their storage resources. For example, SAN file systems can be linked to tools that allow users to manage different types of storage media or to policy engines that steer data to various locations.
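As a rough illustration of what such a policy engine might do, the sketch below picks a storage tier for a file based on how long it has gone without being accessed. The tier names, the thresholds, and the function itself are hypothetical and are not drawn from any specific product.

```python
from datetime import datetime, timedelta

# Hypothetical policy table: (minimum idle time, destination tier),
# ordered from the longest idle time to the shortest.
POLICY = [
    (timedelta(days=365), "tape"),
    (timedelta(days=30), "low-cost disk"),
    (timedelta(0), "primary disk"),
]


def choose_tier(last_access, now):
    """Return the first tier whose idle-time threshold the file meets."""
    idle = now - last_access
    for minimum_idle, tier in POLICY:
        if idle >= minimum_idle:
            return tier
    # A last-access time in the future falls through to the fastest tier.
    return POLICY[-1][1]


now = datetime(2003, 4, 1)
recent = choose_tier(datetime(2003, 3, 25), now)
aging = choose_tier(datetime(2003, 1, 1), now)
dormant = choose_tier(datetime(2001, 1, 1), now)
```

Because every server sees the same file system, a rule like this could in principle be applied once for the whole pool rather than separately on each server.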

Using SAN file systems with integrated data management tools, IT administrators will be able to better manage data life cycles, ensuring that data is adequately protected and accessible over its lifetime. A heterogeneous SAN file system allows users to change server platforms over time, independent of the data format of the file system. Together, these technologies provide a managed, protected repository for corporate data.

Paul Rutherford is vice president of software technology at ADIC (www.adic.com) in Redmond, WA.

This article was originally published on April 01, 2003