A Billion Files May Not Be That Hard After All

By Jeffrey Layton

I went to the Red Hat Summit the first week of May and attended a talk by Ric Wheeler from Red Hat. Ric is the file system manager for Red Hat and has been part of a group of people working on enterprise-class features in Linux file systems. In this case, "enterprise features" does not mean things like snapshots, quotas, and the like. Rather, it means having good file creation and deletion performance as well as good file system check (fsck) performance.

Ric has been testing Linux file systems holding up to 1 billion files to measure their performance, and then working with file system developers such as Dave Chinner and Lukas Czerner to improve them.

This number may seem absurd to you, but 1 billion files is definitely possible considering we have 3TB drives and lots of network media on our desktops. I also know that many HPC sites talk about 1 billion files as being common, with a longer-term goal of 1 trillion files.

At this year's Red Hat Summit, Ric presented some results for XFS and ext4 based on work that started as a result of last year's billion-file talk. The results are startling:

  • XFS file creation and deletion:
    • Added delayed logging mode to XFS
    • 12x-17x faster for file creation
    • 21x-24x faster for file deletion
  • ext4 file system creation:
    • Added lazy inode initialization feature
    • 38x faster

Faster file creation and removal, or faster file system creation, may not sound like a big deal, but these operations matter a great deal to administrators, and speeding them up improves overall performance.
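To get a rough feel for what "file creation and deletion performance" means on your own system, here is a minimal sketch that times creating and then deleting a batch of empty files. This is not the methodology from Ric's talk; the function name `time_create_delete` and the count of 10,000 files are my own illustrative choices, and a real billion-file test would need far more care (directory layout, caching, and so on).

```python
import os
import tempfile
import time


def time_create_delete(directory, count):
    """Create `count` empty files in `directory`, then delete them.

    Returns a tuple (create_seconds, delete_seconds).
    Hypothetical helper for illustration only.
    """
    paths = [os.path.join(directory, f"f{i:09d}") for i in range(count)]

    start = time.perf_counter()
    for path in paths:
        # Opening and immediately closing creates a zero-length file.
        with open(path, "wb"):
            pass
    create_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for path in paths:
        os.unlink(path)
    delete_seconds = time.perf_counter() - start

    return create_seconds, delete_seconds


if __name__ == "__main__":
    count = 10_000  # arbitrary; tiny compared with a billion files
    with tempfile.TemporaryDirectory() as scratch:
        created, deleted = time_create_delete(scratch, count)
    print(f"create: {count / created:.0f} files/s")
    print(f"delete: {count / deleted:.0f} files/s")
```

Running the same script on scratch directories that live on different file systems (or with different mount options) gives a crude comparison, but the numbers depend heavily on hardware and caching, so treat them as indicative only.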

Ric's talk is definitely worth a read.

This article was originally published on May 18, 2011