RealTime IT News

Building a Better Filesystem For Linux

A new operating system filesystem is expected to inject new life into Linux's ability to search files, documents, e-mails and contacts more efficiently.

Oakland, Calif.-based Namesys (short for Naming System Venture) took the wraps off the latest version of its ReiserFS for licensing this week. Dubbed Reiser4, the so-called "Atomic Filesystem for Linux" posted benchmarks Monday showing it is two to five times faster than the previous versions.

A filesystem is the method by which information is stored on disk drives. Different operating systems normally use different filesystems, making it difficult to share the contents of a disk drive between two operating systems.

The company claims it is the fastest filesystem for I/O-bound tasks that are not "fsync" intensive. Version 4 improves on the design of the most stable version (Reiser3), which is the default filesystem for SUSE Linux, Lindows, FTOSX and Gentoo.

Lead architect Hans Reiser told internetnews.com he and his team took a bunch of technical gambles trying to solve the storage layer problem and got lucky with the results.

"What we've done is make it effective to store small files in the filesystem namespace. That means you can store things like phone numbers and other small bits of information without losing the traditional filesystem," he said. "Because file systems have not been space efficient, people don't tend to store files that are smaller in size. My belief is that you would see more files available if systems were more efficient."

According to Reiser, 80 percent of the files on a traditional filesystem are smaller than 8K (kilobytes). Of the remaining 20 percent, 80 percent fall between 8K and 80K, with only the largest files making up the small remainder. Reiser said his technology is more space efficient than other filesystems because it squishes small files together rather than wasting space due to block alignment.
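The cost of block alignment that Reiser describes can be illustrated with some rough arithmetic. The sketch below is not Reiser4 code: the 4,096-byte block size and the workload of 1,000 records of 100 bytes each are assumptions chosen for illustration, and the "packed" figure ignores any metadata a real filesystem would also store.

```python
import math

BLOCK_SIZE = 4096  # assumed block size in bytes; not a figure from the article


def block_aligned_bytes(file_sizes, block_size=BLOCK_SIZE):
    """Disk space used when every file is rounded up to whole blocks."""
    return sum(math.ceil(size / block_size) * block_size for size in file_sizes)


def packed_bytes(file_sizes, block_size=BLOCK_SIZE):
    """Disk space used when file contents are packed back to back into shared blocks."""
    total = sum(file_sizes)
    return math.ceil(total / block_size) * block_size


# Hypothetical workload: 1,000 tiny records (phone numbers, say) of 100 bytes each.
sizes = [100] * 1000
print("block aligned:", block_aligned_bytes(sizes))  # 4096000 bytes
print("packed:       ", packed_bytes(sizes))         # 102400 bytes
```

Under these assumptions the block-aligned layout uses 40 times the space of the packed one, which is the kind of overhead that discourages storing very small items as individual files.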

"It also means that Reiser4 scales better than any other filesystem. Do you want a million files in a directory, and want to create them fast? No problem," he said.

Reiser said the next version of ReiserFS will work on the problem of semantics -- or matching seemingly unrelated bits of information and creating a whole picture.

"Before we could even tackle the semantics, we needed a storage layer that could store small and large files equally," he said. "If they could look the same in the namespace then you could have semantics that work faster." Reiser has also published a white paper outlining the next generation of his filesystem.

The reason this is all significant, according to Reiser, is that there is a move to put database and search engine technology into the filesystem name space.

After approaching the problem and then backing off of it, Microsoft is preparing its WinFS system for its next generation of Windows (code-named Longhorn). WinFS is expected to combine the indexing capabilities of a SQL Server relational database and the file labeling (metadata) potential of XML.

Likewise, Apple Computer has tackled the problem in its own way. The Macintosh-maker's "Spotlight" technology in its upcoming Mac OS X "Tiger" release is an integrated metadata search engine that claims to search 100,000 bits of information in several different formats.

Linux currently supports multiple filesystems, making it possible, for example, to read/write a partition dedicated to Windows. However, most computers are a jumble of chaotically organized information.

Reiser4 solves the problem by storing files in trees. Trees scale better to large directories and files than the structures used by conventional filesystems. They also scale better to small sizes, because they avoid wasting space due to block alignment. Reiser said version 4 implements every filesystem operation as an atomic operation and, because it uses wandering logs, does so without a performance penalty.
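Why ordered, tree-like structures scale to very large directories can be seen in a small comparison. This is only a sketch: a sorted Python list with binary search stands in for a tree, the file names are hypothetical, and none of it reflects Reiser4's actual on-disk format.

```python
import math
from bisect import bisect_left


def linear_lookup(names, target):
    """Scan a flat directory listing; returns (found, comparisons made)."""
    for comparisons, name in enumerate(names, start=1):
        if name == target:
            return True, comparisons
    return False, len(names)


def sorted_lookup(sorted_names, target):
    """Binary search over sorted keys, as a stand-in for descending a tree."""
    i = bisect_left(sorted_names, target)
    return i < len(sorted_names) and sorted_names[i] == target


# Hypothetical directory holding one million entries, already in sorted order.
names = [f"file{i:07d}" for i in range(1_000_000)]
target = names[-1]

found, comparisons = linear_lookup(names, target)
print("linear scan comparisons:", comparisons)                           # 1000000
print("found by binary search: ", sorted_lookup(names, target))          # True
print("binary search steps, at most about:", math.ceil(math.log2(len(names))))  # 20
```

The flat scan touches every entry in the worst case, while the ordered search needs on the order of 20 steps for a million entries, which is why lookups stay fast as a directory grows.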

Version 4 is also architected for military grade security complete with auditing capabilities and assertions to guard the entrance to every function.

The filesystem is also based on plug-ins, which means that it will attract many outside contributors. The ReiserFS project has been well supported in the open source community. Michael Robertson's Windows-Linux crossover software maker Linspire (formerly Lindows) sponsored Reiser4's debugging responsibilities. Novell's SUSE Linux has taken the task of journaling the latest version after Robertson's MP3.com originally tried to tackle the problem.

Of the dominant Linux distributions, Reiser said SUSE, FTOSX and Gentoo are the biggest supporters and he expects Debian to adopt Reiser4 and integrate it into the install process relatively soon.

As for Red Hat, Reiser said he's heard the enterprise Linux distribution considers any project that receives funding from SUSE as a competitive threat and therefore suspect. Red Hat has so far declined to participate in ReiserFS, instead adopting "ext3" as its journaling filesystem.

Reiser said he expects version 4 to ramp up faster than version 3 because it has better-written code and an extensive collection of test scripts that helped his team fix bugs. It is still months away from becoming a fully stable release, however.

"It's stable for laptops," Reiser said. "But for mission critical servers, that is a wait and see."