Direct Access File System (DAFS)

The following Q&A session was excerpted from an online chat session hosted by searchstorage.com, a Web-based search engine site for storage-related articles and information. Questions were posted by end users. The respondents were Mitch Shults, director of business development of Intel's Enterprise Server Group, and David Dale, marketing manager and DAFS evangelist of Network Appliance.

What is DAFS?

Shults: DAFS is intended to substantially improve performance of file-sharing applications by allowing "direct access" across a data-center network between general-purpose servers and the filers that support them. It's an open, interoperable specification.

How does DAFS differ from conventional network file access protocols?

Dale: Traditional shared-file systems rely on general-purpose network protocols and interface cards to provide applications with access to files. The network protocols break data up into small packets, with all the resulting overhead of packet assembly/disassembly done in software. The resulting data is then buffered in the operating system kernel to control flow, providing general-purpose access. Applications accessing this data must perform context switches to the operating system to access the network buffers. DAFS, on the other hand, uses VI architecture to provide a high-performance solution to file sharing. Optimal data packet sizes are negotiated and assembled/disassembled in hardware. DAFS also bypasses kernel buffering and context switching by writing data directly into application memory space. The result is a fast, highly scalable improvement on traditional shared-file systems.
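The difference between the two data paths can be illustrated with a toy model. This is only a sketch: the function names are hypothetical, plain Python buffers stand in for NIC, kernel, and application memory, and nothing here is DAFS or VI code. It simply counts how many times the payload is copied before the application can use it.

```python
def conventional_read(wire_payload: bytes, app_buffer: bytearray) -> int:
    """Toy model of a kernel-buffered receive; returns the number of copies."""
    copies = 0
    # Copy 1: the payload is staged in a kernel-owned buffer.
    kernel_buffer = bytearray(wire_payload)
    copies += 1
    # Copy 2: after a system call, the data is copied into the application.
    app_buffer[:len(kernel_buffer)] = kernel_buffer
    copies += 1
    return copies


def direct_read(wire_payload: bytes, app_buffer: bytearray) -> int:
    """Toy model of direct placement; returns the number of copies."""
    # The payload lands in application memory with no intermediate buffer.
    view = memoryview(app_buffer)
    view[:len(wire_payload)] = wire_payload
    return 1
```

Both functions leave the same bytes in the application buffer. The modeled difference is the intermediate kernel copy and, in a real system, the context switch that goes with it.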

Are there any projections for DAFS performance versus NFS or CIFS?

Dale: We expect a significant performance increase over NFS and CIFS. In fact, DAFS should provide better performance than direct-attached block-oriented devices.

Is DAFS complementary to iSCSI, or is it tied to VI and incompatible with iSCSI?

Shults: DAFS is complementary to the various SCSI encapsulation mechanisms that exist or are being developed. SCSI encapsulation over any transport (including VI) is a block-mode mechanism. Resource sharing is not provided. DAFS, like existing file-sharing mechanisms, is inherently about resource sharing. All DAFS is doing is allowing file-sharing mechanisms to take full advantage of the greatly accelerated performance of data-center networks such as InfiniBand, Fibre Channel, Gigabit Ethernet, and proprietary interconnects, all of which support a VI-based interface. DAFS uses VI, but that isn't a source of incompatibility.

Do you think DAFS will eventually obsolete protocols such as CIFS and NFS?

Dale: No, we see them as complementary. DAFS is designed specifically for a secure bounded network (e.g., in a data center). NFS and CIFS are designed for larger-scale networks and WANs.

If you are using VI architecture, is it safe to assume you are also using a storage area network (SAN)?

Shults: Yes. VI architecture assumes a reliable interconnect. That's what provides its performance advantage, since VI-based networks don't have to use heavyweight protocols such as TCP/IP. VI-based networks typically provide hardware-based reliability protocols to ensure everything that's sent actually arrives in order. VI also allows senders to place data directly in the recipient's memory, without requiring any action on the recipient's part to accept the update. Hence, "direct access."

So in addition to the NFS and CIFS protocols used on Network Appliance filers, would you also be implementing DAFS on the filers? And your hope is that the rest of the Unix and NT world would also implement it?

Dale: Yes, this is another protocol that will be supported on filers. To garner broad industry support, we started the DAFS Collaborative.

Who is in the DAFS Collaborative besides Intel and Network Appliance?

Dale: More than 50 companies have signed up as contributors to the DAFS effort, ranging from NIC (network interface card) vendors to storage, server, and software vendors. There is more info on the Website at www.DAFScollaborative.org.

When will DAFS run over Ethernet?

Shults: VI-supporting Gigabit Ethernet adapters will be available this year. The key thing needed is off-host TCP/IP capability. With a conformant VI interface, the DAFS client and server software stacks don't have to know or care if the interconnect is Gigabit Ethernet, InfiniBand, or anything else.

What constitutes "direct" access between a processor and a disk on a filer? And how is it different from what's available today?

Dale: Today's filers attach to application servers over standard TCP/IP networks. DAFS enables attachment over a VI connection, which allows memory-to-memory transfers between the filer and the application.

What do you see as the applications most impacted by DAFS?

Dale: We anticipate big benefits for DBMSs (database management systems) and applications that run directly on database engines, in addition to any file-oriented application such as collaborative applications and Internet apps (mail, streaming media, etc.). Our focus is initially on data-center and "Internet data-center" applications.

Who will benefit from using DAFS?

Shults: Any system that provides scalable application service can benefit from using DAFS. One example would be a set of diskless Web servers connected to one or more file servers that store Web information. Another example is a cluster of diskless servers running a highly available shared database that uses a file server to store database information. DAFS is primarily designed for clustered, shared-file network environments, where a limited number of server-class clients connect to a set of file servers via a dedicated high-speed network. The sweet spot for DAFS is where these "client" systems are centrally located and managed and able to take advantage of VI's fast data-transfer semantics.

What network infrastructure does DAFS use?

Dale: DAFS uses VI as its underlying transport mechanism. VI can be implemented on a variety of network infrastructures. Currently, there are several implementations of VI that use the VI over Fibre Channel draft standard and others that use proprietary interconnection networks. In the future, the InfiniBand standard will support VI. VI over TCP/IP is also expected, which will enable the use of Gigabit and 10-Gigabit Ethernet networks.

How will applications take advantage of DAFS?

Dale: DAFS will be implemented as a dynamically linkable library. The DAFS library will make use of a VI provider library supplied by the vendor of the VI hardware. Applications can either make direct calls to the DAFS library, using an OS-independent interface similar to normal OS I/O requests, or use normal OS-dependent I/O calls through a transparent adaptation library.
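The layering described in this answer can be sketched roughly as follows. Every name in the sketch (VIProvider, InMemoryProvider, ToyDafsLibrary, TransparentFile) is hypothetical and chosen for illustration; the real interfaces are defined by the DAFS specification and by each VI vendor's provider library, not by this code. An in-memory dictionary stands in for the filer.

```python
import io
from typing import Dict, Protocol


class VIProvider(Protocol):
    """Hypothetical stand-in for a vendor-supplied VI provider library."""

    def transfer(self, path: str, offset: int, length: int) -> bytes: ...


class InMemoryProvider:
    """Fake provider: serves bytes from a dictionary instead of a network."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files

    def transfer(self, path: str, offset: int, length: int) -> bytes:
        return self._files[path][offset:offset + length]


class ToyDafsLibrary:
    """Direct-call interface: the application names a path, offset, and length."""

    def __init__(self, provider: VIProvider) -> None:
        self._provider = provider

    def read(self, path: str, offset: int, length: int) -> bytes:
        return self._provider.transfer(path, offset, length)


class TransparentFile(io.RawIOBase):
    """Adaptation layer: exposes the library through a normal file-like object."""

    def __init__(self, library: ToyDafsLibrary, path: str) -> None:
        super().__init__()
        self._library = library
        self._path = path
        self._pos = 0

    def readable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        # Fetch at most len(buffer) bytes from the current position.
        data = self._library.read(self._path, self._pos, len(buffer))
        buffer[:len(data)] = data
        self._pos += len(data)
        return len(data)
```

An application that calls ToyDafsLibrary.read directly corresponds to the first option in the answer; code that only knows how to call read() on a file object can be handed a TransparentFile instead, which corresponds to the adaptation-library option. Swapping InMemoryProvider for another provider would not change either caller.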

Do application programmers have to use any special APIs, or do you believe that normal APIs such as Windows function calls will have some interface underneath?

Shults: Application developers will have the option of using three distinct modes for DAFS. The simplest mode is complete transparency. The application does file operations using standard OS API calls, which go through the standard driver stack, with a DAFS driver within the kernel at the lowest level. This mode is appropriate for many applications but does not provide the best possible performance increase.

Isn't the basic idea centered on VIA (VI Architecture), and if so, can't the same performance improvements be realized for block-oriented I/O transfers? In which case, the idea that DAFS will be faster than block-oriented solutions seems very dubious.

Shults: The goal with DAFS is to ensure file-sharing approaches are fully competitive with block-mode I/O from a performance standpoint. Which will be "faster" is a function of the application. Today, all file-sharing approaches work over conventional networking transports. They're at a profound performance disadvantage vs. block-mode I/O for this simple reason. Removing the heavyweight protocol requirement and multiple buffer-copy requirements from the transport layer provides significant acceleration for file-sharing operations.

Better performance than direct-attached block-level? Using a probabilistic Internet Protocol (IP) network? What did I miss?

Shults: Conventional TCP/IP Ethernet is a probabilistic network. Lots of vendors are working on fixing that, especially with the iSCSI efforts. VI doesn't care what the underlying transport is, as long as it's reliable.

How does DAFS improve system performance?

Shults: DAFS improves system performance by allowing applications to bypass operating system control, buffering, and heavy-duty network protocol operations that tend to bottleneck I/O throughput. Direct-access I/O results in low-latency, low-overhead data sharing, which increases system scalability and performance.

How does DAFS handle file locking and permissions?

Dale: Again, I can point you to the spec for this: www.DAFScollaborative.org.

What, if any, are the expected sources for overhead?

Shults: Using the optimal approach, described previously, the overhead is minimal. The client formats a DAFS request block in memory, including a data structure for payload information (or a receive buffer, if applicable), and invokes the user-level I/O library. That library, without a kernel transition, uses VI Architecture to communicate directly with the network transport hardware and from there to the DAFS filer's memory and I/O queue. Upon receipt at the filer, an I/O request is queued, and the filer gets to the request based on a polling cycle, so there's no ISR overhead. The only overhead in this case is setup time, VI communication time, network transport time, and filer parse/service time. The sum of these elements will be far less than the overhead of processing a TCP/IP stack on the host, and with the right I/O hardware, there are no stalls related to memory-mapped I/O and interrupt handling.
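The request flow in this answer (format a request block, hand it to the transport, let the filer pick it up on a polling cycle) can be modeled in a few lines. This is a simplified sketch under stated assumptions: RequestBlock, ToyFiler, and their fields are invented for illustration and do not reflect the wire format in the DAFS specification, and an in-process queue stands in for the VI hardware path.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Dict


@dataclass
class RequestBlock:
    """Hypothetical request block: operation, target, and a receive buffer."""
    op: str
    path: str
    offset: int
    length: int
    receive_buffer: bytearray


class ToyFiler:
    """Toy filer that services queued requests only when it polls."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files
        self._queue: Deque[RequestBlock] = deque()

    def post(self, request: RequestBlock) -> None:
        # Stands in for the transport placing the request in the filer's
        # I/O queue; nothing is serviced and no handler runs at this point.
        self._queue.append(request)

    def poll_once(self) -> int:
        """One polling cycle: drain the queue and return the number served."""
        served = 0
        while self._queue:
            request = self._queue.popleft()
            if request.op == "read":
                end = request.offset + request.length
                data = self._files[request.path][request.offset:end]
                # Place the result directly in the client's receive buffer.
                request.receive_buffer[:len(data)] = data
            served += 1
        return served
```

Posting a request changes nothing until the filer's next poll_once call, which mirrors the point in the answer that requests are picked up by polling rather than by an interrupt service routine.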

Where can I find white papers on DAFS?

Dale: You can check www.DAFScollaborative.org.

How does DAFS differ from conventional file systems with direct-attached storage (DAS)?

Dale: Traditional file systems that use DAS are implemented inside operating system kernels and use kernel buffers to cache file-system data. Application file requests must first make an operating system call and are satisfied by copying the data from the buffer cache to application buffers. In addition, sharing data with other machines requires complex proprietary software and complex data management. DAFS, on the other hand, provides direct application data transfer and implicit data sharing using an open protocol and simplifies data management.

Where do you see InfiniBand fitting into DAFS?

Shults: InfiniBand fully specifies a link, a switch, a host channel adapter, and adapter form factors. InfiniBand does not address higher-level transports and applications. DAFS is an example of a higher-level transport and an application, together. Therefore, it's beyond the scope of the InfiniBand Trade Association.

How could DAFS improve system scalability, availability, and manageability?

Shults: DAFS is designed to improve the efficiency and resiliency of shared file access in a clustered, system area network environment. Separating computing resources from storage resources allows each to be scaled and managed independently. Computing resources can be scaled by adding diskless commodity servers. Storage can be scaled either by adding more storage to existing file servers or by adding more file servers. Application servers see a common shared pool of storage. Availability improves because the architecture supports simple application server fail-over. File servers can be set up to fail-over also.

How does DAFS coordinate file sharing between systems?

Dale: DAFS is designed to allow high-speed, fault-tolerant, consistent views of files to a heterogeneous environment of servers that may be running different operating systems. The specification provides a consistent, cached locking mechanism that tolerates client or file-server failures and fail-overs. DAFS also provides user authentication and cluster node access control for security.

Can DAFS be used over a WAN or for general-purpose file sharing?

Shults: To meet high-performance criteria, DAFS is designed for high-throughput, low-latency networks connecting a limited number of client machines that are centrally managed and have a certain level of trust between systems. DAFS may be inappropriate for some wide-area, general-purpose file-sharing environments.

Why does the DAFS Collaborative elect not to address security, limiting the architecture to bounded networks?

Dale: We are addressing security. If you go to the Website you can download the spec and get more specifics.

Is there any value in considering a DAFS/IP option? Why is DAFS so closely coupled with VI?

Dale: VI gives us two things: memory-to-memory RDMA capability and transport independence. Initially, DAFS will run on Fibre Channel and Gigabit Ethernet. Subsequently, we'll see 10-Gigabit Ethernet and InfiniBand.

At a high level, what security features will DAFS support? The native security features of NFS, NT, others?

Shults: DAFS does not bypass existing OS security mechanisms. If anything, it improves them. The DAFS approach is to use existing OS mechanisms for validating access and update rights, typically via ACLs. DAFS provides significant improvements for guaranteeing shared-file update integrity when many clients are accessing the same file, but that's independent of ACL checking.

Do disk vendors have to do anything special to support DAFS?

Dale: No.

When can we expect the first implementations of DAFS, and for what platforms?

Dale: Vendors in the DAFS Collaborative are targeting the second half of this year for initial products. Platforms currently include Unix (Solaris and Linux) and Windows 2000.

Is DAFS more oriented toward large-block sequential transfers than small, random I/Os?

Shults: File-sharing generally tends to be more oriented toward large-block transfers than small-block transfers. DAFS will support both modes, of course, but there's inevitably more overhead per byte for small-block I/O than for large-block I/O in a file-sharing scheme.
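The per-byte overhead point is simple arithmetic: if each request carries a fixed cost regardless of size, that cost is spread over fewer bytes in a small transfer. The sketch below makes the calculation explicit. The 20-microsecond fixed cost in the usage note is an assumed figure chosen for illustration, not a measured DAFS number, and the function name is hypothetical.

```python
def fixed_overhead_per_byte(transfer_bytes: int, fixed_cost_us: float) -> float:
    """Fixed per-request cost (microseconds) amortized over the transfer size."""
    if transfer_bytes <= 0:
        raise ValueError("transfer_bytes must be positive")
    return fixed_cost_us / transfer_bytes


def overhead_ratio(small_bytes: int, large_bytes: int, fixed_cost_us: float) -> float:
    """How many times more fixed overhead per byte the small transfer carries."""
    small = fixed_overhead_per_byte(small_bytes, fixed_cost_us)
    large = fixed_overhead_per_byte(large_bytes, fixed_cost_us)
    return small / large
```

With an assumed fixed cost of 20 microseconds per request, comparing a 4 KB transfer with a 1 MB transfer via overhead_ratio(4096, 1048576, 20.0) shows the small transfer carrying 256 times the fixed overhead per byte, which is just the ratio of the two sizes.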

We've been really reluctant to invest in a second network infrastructure (Fibre Channel) to build a SAN. How will DAFS work on Gigabit Ethernet? Will it require DAFS-enabled Cisco boxes, new hardware, etc.?

Dale: The VI-over-Ethernet NICs will work with standard Ethernet infrastructure (cables, switches, etc.).

With VI over Ethernet, would you be able to share the same adapter/network with conventional IP-based Ethernet traffic, or would you end up with a standard Ethernet network and a DAFS network alongside it?

Shults: This is an issue for a particular IHV implementation. There's no technical reason why an IHV could not simultaneously support "conventional mode" and "VI-mode" communications through the same adapter. In a real-world environment, I'd probably recommend having two adapters, one for conventional IP and the other for VI, since the internal paths are inevitably different, and switching between them will cause overhead. An IHV may come up with cool tricks that make that advice wrong, however.

Why choose an architecture that requires re-linking at the application level, rather than an application-transparent architecture?

Shults: The choice isn't forced. If you don't want to re-link, use the kernel-level model, with modular drivers, as described previously. Absolute best performance needs user-level I/O, which requires a minimal amount of modification to source and re-linking.

Are there any benefits over Fibre Channel when using transparency mode?

Shults: Not for block-mode I/O. If you're using TCP/IP over Fibre Channel, which many vendors support, and you're using that link for file-sharing access, then you're suffering the performance hit of IP overhead. DAFS, using a VI-compliant Fibre Channel adapter, gets rid of that overhead.

When will DAFS be available, and who are the VI vendors?

Dale: Expect to see client-side and server-side DAFS products in the second half of this year. NIC vendors who have already publicly announced NIC plans include Troika Networks and Giganet (which is being acquired by Emulex).

How do you expect DAFS to perform vs. conventional network-attached storage (NAS) solutions connected by the next generation of intelligent NICs that implements the IP stack in hardware?

Shults: We'll have to see. My bet is that DAFS will be faster, over the same NIC, than conventional sharing mechanisms because it's specifically optimized for a data-center transport; things like timeout intervals are optimized for extreme low latency.

Why is DAFS so important to the data center of the future?

Shults: Customers build Internet data centers to get useful work done. That means processing transactions, serving up Web pages, and generally doing value-added things. Economically, the goal is to get the most work done with the highest performance at the lowest cost. Today's data-center networking and file-sharing approaches are not as efficient as they could be. That means that users are buying more servers than they need to get the required work done. And they're also not getting the best possible performance from those servers. DAFS, in combination with next-generation servers built around the InfiniBand architecture, should enable a much higher level of efficiency than today's data centers can achieve.

If an application uses the "user-level" DAFS API, it will not be able to use the local buffer cache. Don't you think that this will be a performance problem for reads?

Shults: Correct. It won't be able to use the same API for the OS-provided local buffer cache. But the whole idea is that the data is essentially cached on the filer and then transmitted on request to the user-level application I/O buffer for DAFS resources. With a data-center-class network transport, the latency for these I/Os could often be less than that for local I/O.

What OSI layer does DAFS live on?

Shults: Some aspects of 2, all of 3, and some of 4 in some models.

How does DAFS improve performance if used in a Gigabit Ethernet TCP/IP environment?

Shults: If the Gigabit Ethernet adapter doesn't support VI, then DAFS won't work at all. If it does support VI, then acceleration occurs by bypassing OS mechanisms and going directly to user space (optionally). Even without user-level I/O, there are still significant benefits for VI just from a data-transfer efficiency standpoint.

If, as you say, DAFS is expected to run on a SAN, what are the possible applications that you see benefiting from using DAFS vs. NFS/CIFS?

Shults: While there are a lot of niche examples, let's focus on the biggies: Web page serving and e-mail serving. Static Web pages are just files. Internet data centers put those files on file servers and configure hundreds of front-end Web servers to access those files in a shared manner with file servers. DAFS will get those files to the servers much faster and more scalably than existing approaches. In many e-mail servers, every e-mail and its associated attachments is a file or collection of files. It's efficient to partition the file I/O functions from the message-formatting and message-transfer functions. DAFS-based solutions will be able to do that faster and more scalably.

When will DAFS products be available, and what systems architecture will be supported?

Shults: The draft specification is available now on the DAFS Collaborative Website. Our goal is to have products this year or early next year. DAFS is independent of system architecture. Nothing prevents Sun, for example, from implementing DAFS (client or server) on SPARC/Solaris. You'll have to ask them about their specific plans.

Dale: We expect to see Unix (Solaris and Linux) and Windows 2000 clients first, followed by others.

Will all storage vendors become DAFS-compatible when the DAFS standard is incorporated into an operating system?

Dale: That really depends on where the vendor is focused. Our goal is to establish DAFS as a standard. I would expect any vendor with a NAS strategy to be seriously considering supporting DAFS. However, support from operating system vendors doesn't automatically mean that they will. But storage vendors are very strongly represented in the DAFS Collaborative, so we expect to see broad adoption.

Do you see DAFS replacing NFS/CIFS in the future?

Dale: No. I believe they are complementary. NFS was originally devised to provide (Unix-based) file services to thousands of client workstations over a LAN. NFS 4.0 extends that to the WAN. Similarly, CIFS is designed to provide (Windows-based) file service to thousands of client workstations over a standard network. DAFS, in contrast, provides low-latency, high-performance file service to application servers (the clients in this case, which would typically number in the hundreds rather than thousands) in a data-center environment. Whereas NFS is optimized to map file-system semantics onto a TCP/IP network, DAFS is optimized to map file-system semantics onto a VI fabric. Both are needed.

This article was originally published on February 01, 2001