The following Q&A with Marc Farley, author of Building Storage Networks and a storage industry commentator, was excerpted from an online chat session hosted by www.searchstorage.com, a Web-based search engine site for storage-related articles and information. Questions were posted by end users.
You've said that all storage products fit into three main areas. Could you elaborate on that?
One of the purposes of this chat is to introduce a new method for understanding and analyzing storage network products and technologies by breaking down functionality to three well-defined areas: wiring, storing, and filing. Isolating these three functions makes it possible to cut through the propaganda that inevitably occurs when vendors or industry groups vie for market share. All storage networking products can be broken down into these three functional components.
What do you mean by "wiring"?
"Wiring" encompasses the transport technology, whether it's bus technology or networking, and includes the software components, the network or bus hardware, and the cabling. Storage networking has some unique wiring requirements that should be understood clearly without being overly hyped.
What are the important differences in network wiring used for storage?
Latency and flow control.
Is there more to "storing" than disk drives and RAID subsystems?
Storing also involves protocols such as serial SCSI, which means that HBAs [host bus adapters] and device drivers are also involved. Volume management software is also storing technology.
What products are in the "filing" category?
File systems and databases, mostly.
Why are file systems and databases considered filing products?
They both represent data and manage the placement of data in logical block addresses.
Does "filing" have to do with storing the physical files or data on a storage device?
Filing provides the structure for storage. This can be external representations of storage in folders and directories, for instance, or it can be the data structures on storage devices and volumes that are used by the system to manage access to data.
Can you elaborate more on the three areas?
"Wiring" is all the software and hardware involved in transport. "Storing" is all the hardware and software pertaining to block storage operations. "Filing" is software (file systems and databases) that represents data externally and structures it internally.
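As a rough illustration of the three-way breakdown, the sketch below sorts a product's components into wiring, storing, and filing. The component names come from the answers above; the `FUNCTION_OF` table and the `break_down` helper are hypothetical names invented for this example, not part of any product or standard.

```python
from collections import defaultdict

# Component -> function, using only the examples given in the chat.
FUNCTION_OF = {
    "cabling": "wiring",
    "network or bus hardware": "wiring",
    "transport software": "wiring",
    "disk drive": "storing",
    "RAID subsystem": "storing",
    "HBA": "storing",
    "device driver": "storing",
    "volume manager": "storing",
    "file system": "filing",
    "database": "filing",
}


def break_down(components):
    """Group a product's components by the function they perform."""
    groups = defaultdict(list)
    for name in components:
        groups[FUNCTION_OF[name]].append(name)
    return dict(groups)


# A hypothetical product made of four components.
example = break_down(["cabling", "HBA", "volume manager", "file system"])
```

Here `example` places the cabling under wiring, the HBA and volume manager under storing, and the file system under filing.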
How do SAN and NAS apply to this model?
SAN [storage area network] is the application of storing on networks, and NAS [network-attached storage] is the application of filing on networks.
What about the terms SAN and NAS?
They don't do a very good job of describing the functions. Both terms were developed to help market products, but that doesn't mean they help you understand how products work.
I just read Building Storage Networks. How much do you think that SAN technology has changed since you wrote the book?
Overall, I think the technology is getting better, but it still has a long way to go. For example, I do not think that interoperability is even close to where it needs to be to make the technology truly open.
What about standards? Every SAN solution is different.
Fibre Channel standards are still too weak for a thriving industry, and this is hurting Fibre Channel. Can iSCSI do better? We'll see.
I've heard a lot recently about IP-based storage. What's your view on this?
It's not a question of "if," but "when." IP-based storage networking is inevitable. There is too much IP equipment, and there are too many people with IP skills, in the market. By comparison, the skill set for Fibre Channel is quite small. But IP switching has to be re-engineered for lower latency to support transaction processing.
When will new technologies such as Gigabit Ethernet and InfiniBand be widely adopted, and how will they affect storage technology?
Gigabit Ethernet is already seeing good adoption rates, and InfiniBand will see fast adoption when servers start using it as the system I/O bus in a couple of years. For storage, expect Gigabit Ethernet to start making its mark by year-end. InfiniBand is still three years away as a native storage technology.
What's your opinion on SoIP versus Fibre Channel?
SoIP provides Ethernet IP networking, as opposed to a Fibre Channel fabric. It has management benefits, and companies can find skilled people for Ethernet/IP. With Fibre Channel, it's hard to find skilled people, but Fibre Channel has better flow control and latency than Ethernet/IP today. SoIP does not erase the need for Fibre Channel; it just replaces the Fibre Channel fabric with Ethernet/IP switches.
Will Gigabit Ethernet and InfiniBand be complementary or substitute technologies?
They'll be complementary.
Where does serverless backup fit in, and what is it?
Serverless backup could be done on a storing basis for cold block-based backup. Otherwise, most backup is file-oriented to enable individual files to be restored. Serverless backup must query the filing "stuff" like a file system to get the block storage locations where data is stored. Then a data mover in the serverless backup system reads data from those blocks and writes it to tape. There is an enormous amount of complexity involving caches and the ability to trap updates, etc.
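The two steps described here (ask the filing layer for block locations, then have a data mover copy those blocks to tape) can be sketched in a few lines. This is a toy model under stated assumptions: the volume is a dictionary of logical block address to bytes, the file system is a dictionary of file name to ordered block addresses, and the tape is a list. All names are hypothetical, and the caches and update trapping mentioned above are deliberately left out.

```python
# Toy volume: logical block address -> block contents (assumed layout).
volume = {7: b"stor", 3: b"age ", 9: b"nets"}

# Toy filing layer: file name -> ordered list of block addresses.
file_system = {"notes.txt": [7, 3, 9]}


def block_map(fs, name):
    """Step 1: query the filing layer for a file's block locations."""
    return list(fs[name])


def data_mover(vol, lbas, tape):
    """Step 2: read each block from the volume and write it to tape."""
    for lba in lbas:
        tape.append((lba, vol[lba]))


def serverless_backup(fs, vol, name, tape):
    """Back up one file without routing its data through the file layer."""
    lbas = block_map(fs, name)
    data_mover(vol, lbas, tape)
    return lbas


def restore(tape):
    """Reassemble the file contents in the order they were written."""
    return b"".join(data for _, data in tape)


tape = []
copied = serverless_backup(file_system, volume, "notes.txt", tape)
```

The sketch assumes nothing changes the file while it is being copied, which is exactly the condition that makes real serverless backup hard.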
Some storage management software/hardware providers are promoting the data virtualization features of their SAN solutions as the next big phase in SANs. Can you comment on virtualization and its benefits?
Data virtualization is mostly the translation of logical block addresses and representing these translations to file systems or databases. Virtualization is flexible and provides new ways to use storage products. But it's not clear that it is going to be a long-term trend because it can add latency and failure points in a SAN.
What do you think about the various storage virtualization products, and where do they fit in your categories?
Storage virtualization can be done anywhere in the I/O path: volume managers, device drivers, HBAs, storage domain controllers, and subsystems. As a tool for managing bulk storage, virtualization products are terrific. The thing to watch out for is the application you run through them and how much latency they introduce. If you have transaction processing applications, a virtualization appliance could double the latency between server and storage.
How does virtualization work with storing?
Virtualization is the process of translating logical block addresses and presenting them as addresses and LUNs to upstream filing systems.
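A minimal sketch of that translation, assuming the simplest possible scheme: a virtual volume built by concatenating extents from physical devices. The `Extent` and `VirtualVolume` names, the device names, and the extent sizes are all hypothetical; real products may stripe, mirror, or remap blocks in other ways.

```python
from typing import NamedTuple


class Extent(NamedTuple):
    device: str   # physical device holding the blocks
    start: int    # first physical block address of the extent
    length: int   # number of blocks in the extent


class VirtualVolume:
    """Presents several extents as one contiguous range of block addresses."""

    def __init__(self, extents):
        self.extents = list(extents)
        self.size = sum(e.length for e in self.extents)

    def translate(self, virtual_lba):
        """Map a virtual block address to (device, physical block address)."""
        if not 0 <= virtual_lba < self.size:
            raise IndexError(f"block {virtual_lba} is outside the volume")
        offset = virtual_lba
        for extent in self.extents:
            if offset < extent.length:
                return extent.device, extent.start + offset
            offset -= extent.length
        raise AssertionError("unreachable: bounds were checked above")


vol = VirtualVolume([Extent("disk_a", 100, 50), Extent("disk_b", 0, 30)])
```

With these extents, virtual block 0 lands on `disk_a` at block 100, and virtual block 50 is the first block of `disk_b`. Every I/O passes through `translate`, which is where the added latency and the extra failure point come from.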
Can SANs succeed without virtualization?
Virtualization is already a part of RAID and mirroring, but I think you're asking about a different kind of virtualization product. Yes. SANs can succeed without virtualization. But SANs won't succeed if they do not become easier to understand and install. They are still too complex.
Data management tools are very important to achieving a functional SAN. What should I look for in SAN management software?
You should look for reliability, high availability, and integration between filing and storing elements.
Who would you say is the leader today in SAN management software?
I would say Veritas.
What do you think about the maturity of SAN management solutions?
There's not much maturity. The question is whether they solve problems or create new ones.
Currently, who do you feel is winning the "race": SAN or NAS?
Neither. It's not really a race. SAN provides storing and NAS provides filing. They are different, and they serve different purposes.
Is NAS a good replacement for Novell File and Print services?
Possibly, but not the printing part. Most NAS products do not provide print spooling and spool management. Also, it depends on the NAS product and what your needs are. NetWare has a pretty good file system, especially for heterogeneous storage. An inexpensive NAS product probably won't provide the cross-platform locking that you might want. Also, backup and recovery is probably going to be easier with a NetWare solution, but you need to verify it with the NAS vendors you are talking to; some of them have pretty good recovery and data-protection solutions.
What's the main difference between NAS and SAN?
NAS is primarily the application of filing over a network. It also involves operations on files and data objects as well as byte ranges within files. SAN is primarily the application of storing and involves operations on logical block addresses.
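The difference in what each side operates on can be shown with a toy model. The block device below answers requests for a logical block address, which is the storing view; the file server answers requests for a byte range within a named file, which is the filing view, and it resolves those requests into block reads internally. The class names, the 8-byte block size, and the file layout are assumptions made up for the example.

```python
BLOCK_SIZE = 8  # unrealistically small, to keep the example readable


class BlockDevice:
    """Storing view: callers ask for a logical block address."""

    def __init__(self, data: bytes):
        self.blocks = [
            data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)
        ]

    def read_block(self, lba: int) -> bytes:
        return self.blocks[lba]


class FileServer:
    """Filing view: callers ask for a byte range within a named file."""

    def __init__(self, device, layout):
        self.device = device
        self.layout = layout  # path -> (ordered block addresses, file size)

    def read(self, path: str, offset: int, length: int) -> bytes:
        lbas, size = self.layout[path]
        data = b"".join(self.device.read_block(lba) for lba in lbas)[:size]
        return data[offset:offset + length]


device = BlockDevice(b"hello, storage world!!!!")
server = FileServer(device, {"/greeting.txt": ([0, 1, 2], 20)})

block = device.read_block(1)                 # block-level request
text = server.read("/greeting.txt", 7, 7)    # file-level byte-range request
```

The block request returns whatever sits in block 1, with no notion of files. The file request returns bytes 7 through 13 of the file, and only the file server knows which blocks hold them.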
Backup seems to be the biggest bottleneck for NAS and SAN. Do you see any improvement in backup?
Backup, as it is practiced today, is almost hopelessly broken. Other strategies will need to be adopted, using snapshots and backup together.
Can you discuss the role of storage service providers (SSPs)?
SSPs are very important to Internet data center operators. However, it's not clear what value they can really provide to operators of traditional data centers. The problem of low-latency bandwidth has to be overcome. Then there is the question of where the filing intelligence resides so that file-based data management can take place.