Wide area file services consolidate file storage in a centralized location and intelligently provide fast access to high-use files over a WAN.
By Brad O'Neill
In many respects, the branch office is still the "wild frontier" of enterprise IT: an uncharted, unknown territory best left to the adventurous. This frontier mentality is especially evident in the world of storage. There are more than one million corporate branch offices in the US, most of them using some type of file serving technology. Local file-level backup-and-restore processes are being conducted with varying degrees of rigor and success at most of these locations on a daily basis. Furthermore, tens of thousands of hours are spent daily solving technical and provisioning issues on all of those file servers. And every one of those file servers may be doubling in capacity every 12 months.
As a storage professional, if this scenario sends a reflexive shiver down your spine, you are hardly alone. Imagine being the IT administrator in charge of 300 branch offices, each having its own distributed infrastructure. Based on conversations Taneja Group has been having with enterprise end users, there is a dawning awareness that the long-standing file-based data storage challenges of the geographically distributed enterprise can and should be addressed. And it finally appears that a combination of vendor innovation and end-user demand is lining up to deliver new offerings that promise big changes.
Wide area file services
Developing a technology for reliable, "same as LAN" file access across multiple geographies has been discussed for many years. Attempts have thus far failed, however, due to technical problems such as overly complex file system architectures, unworkable proxy schemes, poor network optimization, and an overall mismatch with specific end-user requirements. Various technological advances have resolved most of these issues, and the new entrants in this space have worked with early customers to converge on a common method that makes sense: Consolidate the file storage in a centralized location and intelligently provide fast access to high-use files over the WAN through gateways at remote locations. We categorize players in this space as providers of solutions for "wide area file services."
Edge file gateways and central servers
Each wide area file services provider brings its own unique technologies to the table. What all of these players share is a distributed architecture in which multiple 1U or 2U appliances are deployed in lieu of traditional file servers at each remote office location. We call these new appliances edge file gateways (EFGs). The EFGs can be fanned out across multiple geographies and maintain real-time communication over the WAN with a "central server" and its associated storage resources within the data center (see diagram). The central server then maintains responsibility for all permissions, access controls, data integrity, file management, and data protection at the remote locations.
Wide area file service architectures encompass central servers and edge file gateways connected over a WAN.
End users at the branch offices should never know that their IT infrastructure just underwent a significant transformation. Accessing files through the network mount on an EFG should feel no different from using a traditional, locally resident file server. And therein rests the magic of the solutions that are coming to market: Vendors are providing transparent, enterprise-wide file serving over a WAN, using only cached data with none of the management and backup infrastructure previously required.
Functionally, EFGs may be thought of as intelligent file-caching nodes, analogous to Web caches that improve performance for static content delivery at the "edge" of Internet networks. However, at a technical level, the forward staging of dynamic file data presents challenges wholly different from those of Web caching. The intellectual property created by the various players in this market segment addresses challenges such as block- and segment-level transfer management, CIFS and NFS protocol optimization, read-and-write caching, and data-contention resolution.
While the architecture of an EFG at the edge of the enterprise network is complex, the real value of wide area file services is realized back in the corporate data center, where the gateways "fan in" to the central server. Typically, the central server interfaces with a standard file server or network-attached storage (NAS) head, providing consolidated storage capacity to the entire distributed environment. With the "master copy" of all data stored centrally, the remote sites no longer need to have file servers or tape backup solutions deployed in each location. The result is that data and storage are consolidated, but remote users still have the benefit of local file services.
The central servers control all aspects of the file life cycle at the remote sites, with policy-based controls for deployment, configuration, management, and monitoring, all from one location. Furthermore, because the data used at the remote locations of the enterprise now remains resident within a centralized corporate data center and in its native data format, this formerly isolated data is now manageable within the established storage management framework that applies to the entire enterprise, including replication and data protection. These advantages can constitute a significant increase in IT efficiency for a large organization that previously relied on a series of manual remote management operations.
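The caching arrangement described in this section can be illustrated with a minimal sketch. It is not any vendor's implementation: the class names, the per-file version counter, and the `wan_fetches` counter are all assumptions made for illustration, and real products work at the block or segment level over CIFS and NFS rather than on whole files in memory. The sketch shows only the core idea, which is that the central server holds the master copy while each gateway serves repeat reads locally and goes back over the WAN when its cached copy is stale.

```python
class CentralServer:
    """Hypothetical stand-in for the data-center server holding the master copy."""

    def __init__(self):
        self._files = {}  # path -> (version, data)

    def write(self, path, data):
        # Every write bumps the file's version so gateways can detect stale copies.
        version = self._files.get(path, (0, b""))[0] + 1
        self._files[path] = (version, data)
        return version

    def version(self, path):
        # Cheap metadata check; assumed to be far smaller than a full transfer.
        return self._files[path][0]

    def read(self, path):
        return self._files[path]


class EdgeFileGateway:
    """Hypothetical edge file gateway (EFG): a read-through, write-through cache."""

    def __init__(self, central):
        self.central = central
        self._cache = {}      # path -> (version, data)
        self.wan_fetches = 0  # counts full-file transfers over the WAN

    def read(self, path):
        cached = self._cache.get(path)
        if cached is not None and cached[0] == self.central.version(path):
            return cached[1]  # fresh local copy, served at LAN speed
        # Missing or stale: pull the master copy from the data center.
        self.wan_fetches += 1
        version, data = self.central.read(path)
        self._cache[path] = (version, data)
        return data

    def write(self, path, data):
        # Write-through: the master copy is updated before the local cache.
        version = self.central.write(path, data)
        self._cache[path] = (version, data)


central = CentralServer()
boston, austin = EdgeFileGateway(central), EdgeFileGateway(central)
boston.write("/shared/plan.doc", b"draft 1")
austin.read("/shared/plan.doc")   # first read crosses the WAN
austin.read("/shared/plan.doc")   # repeat read is served from the local cache
print(austin.wan_fetches)
```

Because every write lands on the central server first, backup and replication in this sketch need to touch only one place, which is the consolidation benefit described above. The version check on each read is the simplest possible coherency policy; a real gateway would presumably trade some freshness for fewer WAN round trips.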
The business need
A growing number of companies, particularly in financial services, retail, and technology, are becoming interested in wide area file services. The rationale is clear: By deploying simple EFGs and removing potentially hundreds of file servers and their associated data-protection hardware, software, provisioning, management, and break/fix issues, enterprises can tackle many pressing and costly issues in one fell swoop. Among the key business drivers are the following:
Storage consolidation ROI—Companies know that their file servers and related storage management expenses at remote locations represent a significantly higher percentage of their overall IT budget than they do for their centralized data-center IT budgets. Poor remote office utilization rates and duplicate purchases for each remote location are key culprits in the hidden costs of decentralized storage. By pulling those resources back to the data center, a significant ROI gain is readily achievable.
Based on our calculation of a hypothetical replacement scenario for a medium-sized enterprise with 75 offices replacing their branch office file servers and associated data-protection solutions with EFGs, we believe an ROI gain on the order of $15,000 to $20,000 per branch office (or greater than $1.2 million per year) is possible using uncontroversial measurements. Accordingly, this is a message that resonates with CIOs and their overall storage consolidation directives.
Extended data protection—Companies are increasingly aware of the significant percentage of data that resides unprotected outside of their immediate centralized control. Taneja Group projects that in many companies, as much as 50% to 75% of their data may exist unprotected at "edge" locations. Some companies have already made corporate-wide policy decisions to address this issue.
According to Lou Gasco, director of standards and governance at MetLife, "One of our priorities is the enforcement of standardized data-protection methodologies across the entire enterprise for all of our business operations locations."
As this manner of standards mentality permeates through the end-user community, we believe many end users will begin to seriously consider wide area file services, as the entire range of decentralized data from all remote sites can be included for the first time within an existing enterprise-wide data-protection methodology. For companies serious about data protection across remote locations, this is nothing short of a quantum leap forward.
Regulatory compliance—Especially in the financial services and healthcare industries, we see significant advances in the sophistication of enterprises with regard to compliance planning. To this end, many companies are proactively establishing compliance practices that encompass decentralized data at their remote locations. The most effective way to meet regulatory requirements for decentralized data is to leverage a centralized data repository for archival purposes. Wide area file service solutions give companies that repository without requiring them to coordinate a large number of independent remote backup operations across the WAN.
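The storage consolidation estimate among the drivers above can be checked with simple arithmetic. The sketch below multiplies the article's 75 offices by the $15,000 and $20,000 per-office bounds; the function name is ours, and the inputs are the hypothetical scenario's figures rather than measured results.

```python
def annual_roi_range(offices, low_per_office, high_per_office):
    """Return the (low, high) enterprise-wide annual ROI gain in dollars."""
    return offices * low_per_office, offices * high_per_office


low, high = annual_roi_range(75, 15_000, 20_000)
print(f"${low:,} to ${high:,} per year")  # prints "$1,125,000 to $1,500,000 per year"
```

The $1.2 million figure cited above sits inside this range and corresponds to a gain of $16,000 per office.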
What to look for
After examining a number of players in this emerging market, we have arrived at a short list of key features and capabilities. Without these, very few enterprise customers will stop to consider wide area file services as a viable alternative to their traditional file server deployments at branch locations.
Centralized management—Web-based GUIs, integration with other SNMP management frameworks, and unified deployment, configuration, and monitoring of all gateways are must-have features. Without central control over all EFGs, the ROI gains of a wide area file service solution can be compromised.
Integrity controls—A wide area file service is only as good as the integrity of the data it provides. Accordingly, all solutions must provide quality of data in three areas: data coherency, data concurrency, and data assurance. Coherency controls should enable tradeoffs between local performance and global freshness of data. Concurrency controls must enforce file locks across multiple sites so that users in different offices cannot overwrite one another's changes. Assurance controls must validate files on "open" and flush buffers on "close" to prevent data loss.
Windows network integration—CIFS is a complex protocol and every vendor in this space must deal with it appropriately. Expect to see flexible naming resolution (NetBIOS, WINS, and DNS), Active Directory integration, and MS DFS support for logical and physical name mapping.
Security controls—In order to be foolproof, security must be centralized with no remote site administration required. Solutions should support transparent authorization and user authentication from the centralized authority.
Enterprise reliability—The true test of enterprise-caliber technology in this space is how it performs when the plug is pulled. Solid solutions need to have clustering/fail-over support and hardware redundancy, and still be able to operate with local data when disconnected from the centralized authority by WAN disruptions.
Scalability—Users should be able to achieve solid economies of scale through a high fan-out ratio. A single EFG should be able to support a branch office that typically has up to 100 people, and a single central server should support 50 or more EFGs. Furthermore, at all times, a single point of administration for the entire enterprise-wide deployment of EFGs is a must-have capability.
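The fan-out guidance in the scalability item above lends itself to a rough sizing calculation. The sketch below is a planning aid under stated assumptions, not a vendor sizing tool: it takes the article's figures of up to 100 people per EFG and 50 EFGs per central server as defaults, assumes every branch gets at least one gateway, and ignores clustering and fail-over spares. The function name and the example headcounts are hypothetical.

```python
import math


def size_deployment(branch_headcounts, users_per_efg=100, efgs_per_central=50):
    """Estimate how many EFGs and central servers a set of branch offices needs.

    branch_headcounts: number of people at each branch office.
    Defaults follow the article's guidance; treat them as tunable assumptions.
    """
    # Every branch gets at least one gateway, more if it exceeds the per-EFG limit.
    efgs = sum(max(1, math.ceil(n / users_per_efg)) for n in branch_headcounts)
    # Central servers are sized by how many gateways fan in to each one.
    central_servers = math.ceil(efgs / efgs_per_central)
    return efgs, central_servers


# The 300-branch administrator imagined earlier, assuming 80 people per office.
efgs, central_servers = size_deployment([80] * 300)
print(efgs, central_servers)  # prints "300 6"
```

Since the article calls for 50 or more EFGs per central server, the default gives a conservative upper bound on the number of central servers.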
Meet the players
The vendors we have identified as participants in the wide area file services market include Actona Technologies, DiskSites, Riverbed Technology, and Tacit Networks. All of these companies are in varying stages of early product release. Expect to see major customer and product announcements from these players throughout 2004 that reflect the usage models and themes discussed above.
As this space matures over the next 18 months, expect larger players such as Cisco, EMC, IBM, and Network Appliance to begin turning in this direction, too.
For now, perhaps the biggest competition the early entrants face is "the world as it is today." Distributed enterprise file serving has been a problem for which companies have developed numerous work-arounds. Even if today's traditional file server architectures are inconvenient and very expensive, they are familiar and their quirks are well-known. Therefore, these vendors have their work cut out for them to make their technology known. However, if these players can shock enough customers, helping them save significant amounts of money along with vastly improved storage management and data-protection metrics, then we can expect a total reconsideration of the way many enterprises think about delivery of file data to the wild frontier of the "edge enterprise."
Brad O'Neill is a senior analyst with Taneja Group, a research and consulting firm (www.tanejagroup.com) in Hopkinton, MA.