Quantcast File System

Quantcast File System (QFS)
Stable release: 1.0 / September 27, 2012
Development status: Active
Written in: C++
Type: Distributed file system
License: Apache License 2.0
Website: quantcast.github.com/qfs/

Quantcast File System (QFS) is an open-source distributed file system software package for large-scale MapReduce or other batch-processing workloads. It was designed as an alternative to Apache Hadoop’s HDFS, intended to deliver better performance and cost-efficiency for large-scale processing clusters.

Design

QFS is software that runs on a cluster of hundreds or thousands of commodity Linux servers and allows other software layers to interact with them as if they were one giant hard drive. It has three components: the metaserver, which holds the file system's directory structure and tracks where each chunk of data is stored; chunk servers, which store and serve the data from their local drives; and a client library that applications link against to read and write QFS files.

In a cluster of hundreds or thousands of machines, some are likely to be down or unreachable at any given moment, so fault tolerance is the central design challenge. QFS meets it with Reed–Solomon error correction. By default, QFS stripes each file across nine physically different machines: six stripes hold the data and three hold parity information. Any six of the nine stripes are enough to reconstruct the original data, so up to three machines can be lost without making the file unreadable.[1] The result is fault tolerance at a cost of a 50% expansion of the data, compared with the 200% expansion of the three-way replication HDFS uses by default.
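
To make the arithmetic concrete, the sketch below works through the 6+3 layout described above. It is illustrative only, not QFS code; the 64 KB stripe unit and the round-robin placement of stripe units are assumptions made for the example.

    // Illustrative arithmetic for 6 data + 3 parity Reed-Solomon striping.
    // Not QFS source; stripe unit size and placement are assumed values.
    #include <cstdint>
    #include <iostream>

    int main() {
        const int kDataStripes   = 6;   // stripes holding file data
        const int kParityStripes = 3;   // stripes holding parity
        const int kTotalStripes  = kDataStripes + kParityStripes;

        // Nine stripes are written for every six stripes of data: 50% expansion.
        std::cout << "storage overhead: "
                  << 100.0 * kParityStripes / kDataStripes << "%\n";

        // Any six of the nine stripes suffice, so up to three may be lost.
        std::cout << "tolerated failures: " << kTotalStripes - kDataStripes << "\n";

        // Mapping a byte offset to a data stripe, assuming a 64 KB stripe unit
        // laid out round-robin across the six data stripes.
        const std::uint64_t kStripeUnit = 64 * 1024;
        std::uint64_t offset = 10000000;
        std::cout << "offset " << offset << " lands in data stripe "
                  << (offset / kStripeUnit) % kDataStripes << "\n";
        return 0;
    }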

QFS is written in C++, operates within a fixed memory footprint, and uses direct I/O.
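
A minimal sketch of what direct I/O looks like on Linux follows; it is not QFS source. Opening a file with O_DIRECT bypasses the kernel page cache, which in turn requires the buffer address and transfer size to be block-aligned (a 4096-byte alignment is assumed here, and the file path is a placeholder).

    // Sketch of a direct-I/O read on Linux; illustrative only, not QFS code.
    #ifndef _GNU_SOURCE
    #define _GNU_SOURCE          // exposes O_DIRECT in <fcntl.h>
    #endif
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstdlib>
    #include <iostream>

    int main() {
        const std::size_t kAlignment = 4096;     // assumed logical block size
        const std::size_t kIoSize    = 1 << 20;  // 1 MiB, a multiple of kAlignment

        // Placeholder path; O_DIRECT tells the kernel not to cache the data.
        int fd = open("/tmp/example-chunk", O_RDONLY | O_DIRECT);
        if (fd < 0) { std::perror("open"); return 1; }

        // Direct I/O requires an aligned buffer.
        void* buf = nullptr;
        if (posix_memalign(&buf, kAlignment, kIoSize) != 0) { close(fd); return 1; }

        ssize_t n = read(fd, buf, kIoSize);      // bypasses the page cache
        if (n < 0) std::perror("read");
        else std::cout << "read " << n << " bytes\n";

        std::free(buf);
        close(fd);
        return 0;
    }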

History

QFS evolved from the Kosmos File System (KFS), an open-source project started by Kosmix in 2005. Quantcast adopted KFS in 2007, built its own improvements on top of it over the next several years, and released QFS 1.0 as an open-source project in September 2012.[2]

References

  1. QFS improves performance of Hadoop file system - Strata
  2. Quantcast releases bigger, faster, stronger Hadoop file system - Tech News and Analysis

External links