The Future of Storage Is Networked
The next step beyond switched and arbitrated-loop configurations may be Public Loop Fibre Channel.
By Ellen Lary
When you ask information technology (IT) professionals for their storage technology "wish list," the future sounds remarkably like the past. IT professionals want to reduce the cost of deploying and managing storage. They want to easily reallocate storage among applications as needed. Finally, they want to improve the storage infrastructure without disrupting it: evolution, not revolution. So, what else is new?
Paradoxically, the shift to client-server computing, which made deployment of applications easier, has caused headaches for IT professionals who manage storage. Before client-server computing became widespread, applications and their data were physically centralized in mainframe computer centers. This centralized storage and data could be managed by a central staff that only had to deal with one storage vendor and a small set of management tools. The client-server model resulted in applications and data becoming physically decentralized--creating "islands of computing" scattered throughout the corporation, each with its own storage hardware and storage management software.
What is new is that IT professionals are now demanding a way to reduce the costs and increase the utility of their storage systems while retaining the benefits of the decentralized application model.
What they are describing is network storage, which entails the creation of enterprise-wide storage pools based on physically distributed, yet centrally managed, storage servers. These storage servers provide virtual disk volumes to the application servers that need them.
While the term is relatively new, network storage as a concept has been around for some time. Technically speaking, a simple file server provides network storage for individual desktop users. The problem with the file-server paradigm is its inadequacy in meeting the latency and bandwidth requirements of today's high-performance application servers. Eliminating this inadequacy requires substantial upgrades to the storage servers, and especially to the network used to connect application servers to the storage pool. For many sites, the solution may be Fibre Channel storage networks, sometimes referred to as storage area networks (SANs). Compared with other interconnects, Fibre Channel's advantages can include cable length, bandwidth, latency, and device connectivity.
Fibre Channel vs. SCSI, LANs
The dominant form of storage interconnect today is SCSI, which is used to connect almost 90% of all corporate storage. In terms of distance and bandwidth, SCSI interconnects tend to be short and wide. That is, they don't stretch very far (less than 200 feet), but can carry a relatively large amount of data (40 MBps and, soon, 80 MBps). SCSI interconnects are low in connectivity (16 total nodes on a SCSI bus), but they offer low latency (i.e., there is relatively little time between a request for data and its fulfillment).
The other form of storage interconnect in use today is the traditional LAN, which is significantly different from SCSI interconnects. LANs cover long distances, but their bandwidth can't match SCSI's. LANs can connect thousands of nodes together, which is not possible with SCSI. The major problem with LANs is that they exhibit high latency. That may be acceptable for individual users, but application servers require more rapid access to data.
LANs exhibit high latency when used to access storage because they are designed as general-purpose networks, using many of the same protocols and design techniques as wide-area networks. Router delays, CSMA backoff protocols, and software message-processing overheads in the source and destination nodes slow down access to storage via corporate LANs. These problems can only be solved by changing the model of the network to optimize it for direct access to storage. This is the essence of Fibre Channel. In addition to providing data rates of 100 MBps, Fibre Channel incorporates storage protocols that have been tuned to the properties of the physical interconnect to remove many of the transmission delays and virtually all of the software processing delays incurred by LANs as a storage access medium.
Switched and Arbitrated Loop
The two dominant implementations of Fibre Channel networks are switched and arbitrated loop. Switched Fibre Channel is based on the concept of a fabric. Users (nodes) connect into the fabric, which routes data between nodes. Today's fabrics consist of a small number of very high-speed, low-latency switches that allow simultaneous conversations across their ports, thereby scaling fabric bandwidth in proportion to the number of nodes connected to the fabric. However, fabric connectivity is a problem because today's fabrics can connect to only a few dozen nodes.
Fibre Channel Arbitrated Loop (FC-AL) connects up to 126 nodes in a one-way ring. FC-AL, also known as a "private loop," uses central wiring points called hubs to bypass failed nodes and links in the ring for greater failure tolerance. The weakness of FC-AL is its bandwidth. While FC-AL can physically connect more than a hundred nodes, its bandwidth is a constant 100 MBps, independent of the number of nodes connected. This limits an FC-AL loop to connecting a small number (3 to 6) of powerful nodes, such as application servers, to a few dozen individual disks or a few powerful storage servers.
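The trade-off between the two topologies can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 100 MBps link rate and 126-node limit come from the figures quoted above, and the fabric calculation assumes an idealized non-blocking switch in which every pair of ports can talk at full speed at the same time.

```python
LINK_MBPS = 100      # Fibre Channel data rate quoted in the article
MAX_LOOP_NODES = 126  # FC-AL node limit quoted in the article


def loop_share_mbps(active_nodes: int) -> float:
    """Average bandwidth per node when every active node shares one loop."""
    if not 1 <= active_nodes <= MAX_LOOP_NODES:
        raise ValueError("an FC-AL loop connects between 1 and 126 nodes")
    # The loop's total bandwidth is fixed, so each extra node dilutes the share.
    return LINK_MBPS / active_nodes


def fabric_aggregate_mbps(nodes: int) -> int:
    """Idealized aggregate bandwidth of a non-blocking fabric (assumption)."""
    # Each simultaneous conversation occupies two ports at the full link rate.
    return LINK_MBPS * (nodes // 2)


if __name__ == "__main__":
    for n in (4, 16, 100):
        print(n, "nodes on a loop:", loop_share_mbps(n), "MBps each")
    print("16-node fabric:", fabric_aggregate_mbps(16), "MBps aggregate")
```

Under these assumptions, four busy nodes on a loop average 25 MBps apiece, while a hundred would average just 1 MBps, which is why a loop is practical only for a handful of powerful servers.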
Neither switched Fibre Channel nor FC-AL alone provides the ideal network. Combining these two architectures, however, is a different story.
Public Loop Fibre Channel
"Public Loop" Fibre Channel, which combines switched Fibre Channel and FC-AL, is the next evolutionary step toward the ideal storage network. Rather than connecting individual nodes to a switched fabric, a public loop architecture connects entire loops. Rather than dozens of nodes, you now have dozens of loops, each with a dozen or so nodes. The result is the superior connectivity of FC-AL combined with the high bandwidth of switched Fibre Channel, offering gigabytes-per-second bandwidth, shared across hundreds of nodes, with low latency.
Another advantage of a public loop is its scalability. If a department acquires a new application server that does not need huge bandwidth to storage, the storage administrator can put the application server on a loop with many other similar servers. As the application server requires more storage bandwidth, it shares its loop with fewer servers. It can also be given its own dedicated connection into the fabric for higher-performance storage needs. A data warehouse, for example, may be given several connections into the fabric to boost performance.
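A rough sizing exercise shows how the numbers might work out. The sketch below is not a description of any vendor's product: the function names, server names, and bandwidth demands are hypothetical, each loop is assumed to get one full-speed 100 MBps port on a non-blocking fabric, and the placement routine uses a simple first-fit-decreasing heuristic chosen purely for illustration.

```python
def public_loop_capacity(loops, nodes_per_loop, link_mbps=100):
    """Return (total nodes, aggregate MBps, average MBps per node)."""
    total_nodes = loops * nodes_per_loop
    aggregate_mbps = loops * link_mbps  # one fabric port per loop (assumption)
    return total_nodes, aggregate_mbps, aggregate_mbps / total_nodes


def assign_to_loops(demands_mbps, loop_capacity_mbps=100):
    """Group servers onto loops so no loop's total demand exceeds capacity.

    demands_mbps maps a server name to its estimated storage bandwidth.
    Servers are placed largest first on the first loop with room.
    """
    loops, loads = [], []
    for name, demand in sorted(demands_mbps.items(), key=lambda kv: -kv[1]):
        if demand > loop_capacity_mbps:
            raise ValueError(f"{name} needs more than one fabric connection")
        for i, load in enumerate(loads):
            if load + demand <= loop_capacity_mbps:
                loops[i].append(name)
                loads[i] += demand
                break
        else:
            # No existing loop has room, so start a new one.
            loops.append([name])
            loads.append(demand)
    return loops


if __name__ == "__main__":
    # Two dozen loops of a dozen nodes each.
    print(public_loop_capacity(24, 12))
    # Hypothetical departmental servers and their demands in MBps.
    demands = {"warehouse": 100, "erp": 60, "web": 30, "mail": 20, "hr": 10}
    print(assign_to_loops(demands))
```

With two dozen loops of a dozen nodes each, the sketch yields 288 nodes sharing 2,400 MBps. In the placement example, the bandwidth-hungry server ends up with a loop to itself while the lighter servers share, mirroring the progression described above.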
Public Loop Fibre Channel comes closest to providing the ideal interconnect for network storage. Obviously, it improves performance, but it also enables fundamental shifts in the storage structure itself. Public Loop Fibre Channel allows IT managers to distribute a storage pool of many tens of terabytes over a network of servers, yet manage it as though it were all in one room in one big box. It lets you change the way you back up data, from saving files to saving entire volumes. It allows you to create virtual local disks, and to take advantage of economies of scale by buying storage in bulk, yet apportioning it in gigabytes.
Public Loop Fibre Channel allows you to manage, share, and get more value from data. Initial implementations of Public Loop Fibre Channel networks are expected by the end of the year as add-ons to existing Fibre Channel switches.
Economies of Scale
A great deal has been said about the performance advantages of network storage. Another advantage is economies of scale.
For example, with network storage, enterprises do not acquire more disks than they need, scattered in small underutilized pockets throughout the company. Instead, storage purchases are made centrally by a staff that is dedicated to and trained in managing the storage pool, which results in greater efficiency. Investing in high-capacity centralized tape libraries for the pool, which are much more cost-effective than multiple distributed small libraries, further reduces capital as well as ongoing labor-related costs.
Network storage can fulfill the wishes of both system administrators and users. It reduces the cost of managing storage, and decouples the storage solution from the computing platform, preserving storage investments as computing platforms change.
Fibre Channel performance is limited only by the capability and topology of its switches. Today's Fibre Channel switches are only first-generation technology. Subsequent generations of switch technology will realize the full potential of Fibre Channel data transmission. With the advent of larger and faster switches, expect to see an old idea--the logical recentralization of data--breathe new life into the way we use and store data.
One of three Fibre Channel topologies, a fabric uses switches in place of hubs to grant multiple servers access to the same storage subsystems.
Fibre Channel nodes can be directly connected to one another in a point-to-point topology or cabled together in a loop. Because each node acts as a repeater for every other node on the loop, one disconnected node can take down the entire loop.
Cabling two fully independent, redundant loops is one method of providing full redundancy in a Fibre Channel environment. This dual-loop cabling scheme provides two independent paths for data with fully redundant hardware.
Other Enabling Technologies
A number of enabling technologies have been important factors in the evolution of network storage, including I2O, disk virtualization, and snapshot storage.
I2O: Basically a port architecture that eliminates the need to load a new driver for every new device, I2O enables universal access to storage, across diverse platforms. As the network interfaces evolve, or the network storage servers are upgraded, the I2O architecture allows the I/O drivers for the application server operating systems to remain the same; this decouples storage network evolution from the operating system release cycle.
Disk virtualization: Disk virtualization is the key to economies of scale. It allows the network storage system to parcel out storage as needed, not disk by disk, but gigabyte by gigabyte. All the application server needs to know is that if it asks for ten more gigabytes, the intelligent storage server will provide it. It may provide it out of its own disks, or it may transparently relay the request to another intelligent storage server with more unallocated storage.
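The allocate-or-relay behavior can be sketched in a few lines. This is a toy model under stated assumptions, not any product's interface: the StorageServer class, its allocate method, and the single-peer relay chain are all hypothetical, and capacity is tracked only as a count of free gigabytes.

```python
class StorageServer:
    """Toy storage server that hands out capacity gigabyte by gigabyte."""

    def __init__(self, name, free_gb, peer=None):
        self.name = name
        self.free_gb = free_gb
        self.peer = peer  # next server to try; assumed to form a chain, not a cycle

    def allocate(self, gb):
        """Reserve gb gigabytes and return the name of the server that did it."""
        if gb <= self.free_gb:
            self.free_gb -= gb
            return self.name
        if self.peer is not None:
            # Not enough room locally: transparently relay the request.
            return self.peer.allocate(gb)
        raise RuntimeError("storage pool exhausted")


if __name__ == "__main__":
    b = StorageServer("B", free_gb=50)
    a = StorageServer("A", free_gb=15, peer=b)
    print(a.allocate(10))  # served from A's own disks
    print(a.allocate(10))  # A has only 5 GB left, so the request is relayed to B
```

The application server in this sketch talks only to server A; whether the ten gigabytes come from A or from B is invisible to it.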
Snapshots: Snapshot storage is the ability to take snapshots of a disk, at any point in time, and then back up those snapshots to tape, or keep them online for fast access to old copies of data. A snapshot operation creates an instantaneous read-only copy of the application's storage, which is effectively frozen in time at the instant the snapshot is taken. The application's storage continues on past the snapshot operation as though nothing has happened, providing uninterrupted storage access to the application, while the read-only copy provides a stable version of the data to be backed up.
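One common way to make a snapshot appear instantaneous is copy-on-write, in which a block's old contents are preserved only when the application first overwrites it. The article does not say which technique any given storage server uses, so the sketch below is an assumption offered for illustration; the Volume class and its methods are hypothetical.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live data seen by the application
        self.snapshots = []         # one {block index: old contents} map per snapshot

    def snapshot(self):
        """Take a snapshot without copying any data; return its id."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        """Overwrite a live block, saving the old contents for snapshots first."""
        for preserved in self.snapshots:
            # Save the original only on the first overwrite after the snapshot.
            preserved.setdefault(index, self.blocks[index])
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        """Read a block as it was when the snapshot was taken."""
        preserved = self.snapshots[snap_id]
        if index in preserved:
            return preserved[index]
        return self.blocks[index]  # block unchanged since the snapshot


if __name__ == "__main__":
    vol = Volume(["a", "b", "c"])
    snap = vol.snapshot()
    vol.write(1, "B")  # the application carries on writing
    print(vol.blocks)
    print([vol.read_snapshot(snap, i) for i in range(3)])
```

In this sketch the application keeps writing to the live volume while a backup job reads the frozen view through read_snapshot, and only the overwritten blocks consume extra space.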
Ellen Lary is vice president, storage systems, at Digital Equipment Corp., in Maynard, MA.