From storage virtualization to NAS-SAN convergence

Emerging technologies, coupled with virtualization, will blur the distinction between NAS and SAN, leading to a convergence in storage networking.

By Chris Bennett

With end users increasingly demanding the simplification of storage architectures, universal access to data, and a reduction in management complexity, many vendors have jumped on the storage virtualization bandwagon. However, each vendor has its own vision for, and definition of, virtualization. For analysts and the trade press, the definition of virtualization remains a moving target.

For end users, these varied viewpoints have only added to the confusion surrounding storage virtualization. If virtualization means different things to different vendors, what central principles can buyers apply when planning network architectures that could take advantage of virtualization? And if the great appeal of virtualization is its simplicity, why the confusion? Further, although virtualization is often discussed in a storage area network (SAN) context, it applies equally well to network-attached storage (NAS).

What is virtualization?
There should be nothing confusing about the concept of virtualization. One simple analogy is the way we access electrical power. We flip a switch, and the light comes on. We don't care about the mechanics of the switch, junction boxes, breaker panels, power poles, substations, or the power grid. We don't want to have to think about the electrical power infrastructure. We simply want power on demand. Call it power virtualization.

To bring the same principle down to our desktops, Windows offers Virtual Disk, in which space on a hard disk drive performs as if it were a separate physical disk. And the Macintosh offers Virtual Memory, which allows users to use hard-drive space as if it were RAM. When using these functions, we don't care where the storage space or the memory resides.

Storage virtualization, in its purest sense, has the same kind of simple meaning: access to stored data without concern for the storage infrastructure and all the IT issues it entails. Consider a URL. What do users know or care about the infrastructure of hardware, software, and networking behind it?

Companies want to be able to access their data without concern for the infrastructure. They don't really care what disks it's stored on or the physical location of those disks. They simply want fast, reliable access anytime. That is what storage virtualization should deliver. This implies the simplification of the entire process of storing, maintaining, and recovering data from a global storage infrastructure. Virtualization should allow disparate data, spread over storage systems throughout a network, to be managed and universally accessed as a single logical pool.

Not a new concept
From an engineering standpoint, virtualization as described above has existed for a long time. For example, NAS has always been inherently virtualized, because virtualization is necessary for accessing random files on a network. End users are not concerned with the physical location of storage systems, the number of disks they entail, or which specific disks contain the data they need. Users can assemble any number of disks to create a single volume and then divide that volume according to their needs. These storage-virtualization capabilities make users independent of the physical storage infrastructure.
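The idea of assembling disks into one volume and then dividing it can be illustrated with a minimal sketch. The class name, the disk sizes, and the share names below are all hypothetical and chosen only for illustration; a real NAS system tracks far more than capacity.

```python
class StoragePool:
    """Hypothetical pool: several disks presented as one volume."""

    def __init__(self, disk_sizes_gb):
        # Users see only the total; individual disks are hidden.
        self.capacity_gb = sum(disk_sizes_gb)
        self.shares = {}  # share name -> size in GB

    def unallocated_gb(self):
        return self.capacity_gb - sum(self.shares.values())

    def carve(self, name, size_gb):
        """Divide the volume by reserving space for a named share."""
        if name in self.shares:
            raise ValueError(f"share {name!r} already exists")
        if size_gb > self.unallocated_gb():
            raise ValueError("not enough unallocated capacity")
        self.shares[name] = size_gb


# Three disks of assumed sizes become one 180GB volume.
pool = StoragePool([72, 72, 36])
pool.carve("engineering", 100)
pool.carve("finance", 50)
print(pool.unallocated_gb())
```

Neither share is tied to a particular disk, which is the independence from the physical infrastructure described above.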

Virtualization will play a major role in the ongoing evolution of storage networks. And virtualization may eliminate the SAN vs. NAS and block vs. file debates.

SAN environments have provided their own version of virtualization. Originally, disks resided inside servers, linked by a SCSI bus. They later migrated to a SCSI-linked array outside the cabinet but had to remain near the application server, limiting the ability for enterprise storage to be centralized. The arrival of Fibre Channel allowed many more disks to be connected to the server than was possible with SCSI and made it possible to move disks away from servers to a central location. But the server was still connected to specified locations on specified disks.

True virtualization goes beyond that by connecting the application server to a virtual disk. In the case of a virtual local disk, the server connects to a share point at which it has space to write data. The mechanics of virtualization (the allocation of disk space and the placement of volumes on physical disks) occur at another level that is masked from the server. The user no longer cares where data is stored; access is what matters. This means that data can be distributed in a way that makes the most economical use of available disk space. Disk capacity can be easily scaled, and access is straightforward and fast. This is true storage virtualization, which is available today in the form of network-attached appliances.
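One way to picture the masking is as a translation table that sits between the server and the physical disks. The sketch below is an assumption-laden illustration, not a description of any vendor's product: the `Extent` and `VirtualDisk` names, the disk labels, and the block counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    disk: str    # hypothetical physical disk label
    start: int   # first physical block of this extent
    length: int  # number of blocks in this extent


class VirtualDisk:
    """Presents several physical extents as one contiguous disk."""

    def __init__(self, extents):
        self.extents = list(extents)

    @property
    def size(self):
        return sum(e.length for e in self.extents)

    def locate(self, logical_block):
        """Translate a logical block number to (disk, physical block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block out of range")
        offset = logical_block
        for e in self.extents:
            if offset < e.length:
                return e.disk, e.start + offset
            offset -= e.length
        raise AssertionError("unreachable: range was checked above")


# The server addresses blocks 0..149; the mapping stays hidden from it.
vdisk = VirtualDisk([Extent("disk_a", 1000, 100), Extent("disk_b", 0, 50)])
print(vdisk.locate(0))
print(vdisk.locate(120))
```

Because the server only ever uses logical block numbers, the extents behind the virtual disk could be moved or extended without the server noticing.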

What users want
For end users, the key question is: How can virtualization help me solve storage problems?

When companies are asked what they want storage virtualization to accomplish, their first answer is often that they want it to free them from the constraints of the physical infrastructure. To put it another way, end users usually find that the available disks are too large or too small for the job. One manager may have a box that's too big and wants storage virtualization to break it up into more useful sizes (this capability is of great interest to service providers and to enterprises that bill departments for the use of storage resources). Another manager may have boxes that are too small and wants storage virtualization to reduce administrative costs by enabling the management of a number of disks or filers as a single unit. Today's enterprise storage management software provides this capability, instantly supplying such information as the number of storage units, total stored data, and applications in use.

Easily managing logical views of storage is the basis for tremendous productivity gains, both within IT and in the end-user community. Additionally, virtualization offers significant cost savings. Efficient SAN environments typically employ one administrator for every 8TB to 10TB of storage capacity. For virtualized environments, it may be possible for one administrator to manage as much as 55TB.
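To put those ratios in perspective, here is a quick back-of-the-envelope calculation. The 110TB total is a hypothetical figure chosen for illustration; only the per-administrator ratios come from the paragraph above.

```python
import math


def admins_needed(capacity_tb, tb_per_admin):
    """Round up: a fraction of an administrator still means one more hire."""
    return math.ceil(capacity_tb / tb_per_admin)


capacity_tb = 110  # hypothetical installation size

print(admins_needed(capacity_tb, 8))   # SAN, low end of the range
print(admins_needed(capacity_tb, 10))  # SAN, high end of the range
print(admins_needed(capacity_tb, 55))  # virtualized, best case cited
```

For this assumed capacity, the calculation gives 11 to 14 administrators at the SAN ratios and 2 at the virtualized ratio.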

NAS-SAN convergence
The NAS vs. SAN, block vs. file debate may be drawing to a close. For some environments and applications, it simply doesn't matter whether storage is file- or block-based. For many legacy applications, it is significant. Interestingly, new technologies are making it possible for storage appliances to virtualize both block- and file-based data while maintaining the appliance concept, which many IT administrators and end users consider a tremendous value proposition.

To see how this will happen, consider the two elements of file-based storage systems. The first is file semantics, which addresses issues such as file access and management. The second is block allocation and management, which monitors and reports capacity levels and decides where each file is to be stored: on which disk, and at what location on that disk. The file system divides every disk into blocks that generally range from 512 bytes to 128KB. The file system keeps track of which blocks are or are not in use and also determines the most efficient locations for data. The file system is by nature a virtualization engine, and it puts us well on the way to NAS-SAN compatibility or convergence.
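The block-tracking role described above can be sketched in a few lines. This is a deliberately simplified illustration: the `BlockAllocator` name, the 4KB block size (one value within the range mentioned), and the first-fit placement policy are all assumptions, and real file systems use far more sophisticated placement.

```python
class BlockAllocator:
    """Toy block map: tracks which fixed-size blocks are in use."""

    def __init__(self, total_blocks, block_size=4096):
        self.block_size = block_size
        self.in_use = [False] * total_blocks

    def allocate(self, nbytes):
        """Reserve enough blocks for nbytes; return their block numbers."""
        needed = max(1, -(-nbytes // self.block_size))  # ceiling division
        free = [i for i, used in enumerate(self.in_use) if not used]
        if len(free) < needed:
            raise OSError("not enough free blocks")
        chosen = free[:needed]  # first-fit: an assumed, simplistic policy
        for i in chosen:
            self.in_use[i] = True
        return chosen

    def release(self, blocks):
        for i in blocks:
            self.in_use[i] = False

    def free_bytes(self):
        """Capacity reporting: bytes still available."""
        return self.in_use.count(False) * self.block_size


alloc = BlockAllocator(total_blocks=8)
file_blocks = alloc.allocate(10_000)  # needs three 4KB blocks
print(file_blocks, alloc.free_bytes())
```

The caller asks for space in bytes and gets back block numbers; where those blocks physically live is the allocator's business, which is why the file system already behaves like a virtualization engine.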

For example, iSCSI-based technology combined with NAS architectures will enable the virtualization of block-based data. Data will be written by the application server to what it sees as a SCSI disk, but the disk will be virtual; it could be of any size, in any location. Its real nature is masked from the application server.

As the industry brings these technologies to market, they will enable file access and block access to exist side-by-side on a single system. Storage appliances will be able to perform as both NAS and SAN devices. In these environments, the full benefits of storage virtualization—simplification, flexibility, universal access to data, and reduced management costs—can be achieved.

Chris Bennett is the director of product marketing at Network Appliance (www.networkappliance.com) in Sunnyvale, CA.

This article was originally published on April 01, 2002