SSTable

SSTable (short for Sorted String Table) is an on-disk file format that represents a map from strings to strings.[1] An SSTable is immutable: once it has been written, items cannot be added to or removed from the map.

The file stores key-value pairs sorted by key and written sequentially, followed by an index at the end of the file that records each key and the offset of its value. Only the index needs to fit in memory to allow efficient lookups, even when the values are much larger than available memory. To load an SSTable for random access, a process seeks to the index, reads it into memory (by a straight copy or with mmap), and then seeks to individual values only as needed.[2]
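The layout described above can be illustrated with a minimal Python sketch. It is not Google's actual format, whose byte-level details are not given in the sources cited here. The fixed-size footer that locates the index, the struct layouts, and the function names write_sstable, load_index and get are all assumptions made for illustration.

```python
import io
import struct

# Hypothetical layouts chosen for this sketch, not a real SSTable specification.
FOOTER = struct.Struct("<QQ")  # index offset, number of index entries
ENTRY = struct.Struct("<IQQ")  # key length, value offset, value length


def write_sstable(f, mapping):
    """Write (key, value) pairs sorted by key, then the index, then a footer."""
    index = []
    for key in sorted(mapping):
        value = mapping[key]
        f.write(struct.pack("<II", len(key), len(value)))
        f.write(key)
        index.append((key, f.tell(), len(value)))  # offset where the value starts
        f.write(value)
    index_offset = f.tell()
    for key, offset, length in index:
        f.write(ENTRY.pack(len(key), offset, length))
        f.write(key)
    # Assumed footer so that a reader can find the index from the end of the file.
    f.write(FOOTER.pack(index_offset, len(index)))


def load_index(f):
    """Seek to the index and read only the index into memory."""
    f.seek(-FOOTER.size, io.SEEK_END)
    index_offset, count = FOOTER.unpack(f.read(FOOTER.size))
    f.seek(index_offset)
    index = {}
    for _ in range(count):
        key_len, offset, length = ENTRY.unpack(f.read(ENTRY.size))
        index[f.read(key_len)] = (offset, length)
    return index


def get(f, index, key):
    """Look the key up in the in-memory index, then read just that value."""
    entry = index.get(key)
    if entry is None:
        return None
    offset, length = entry
    f.seek(offset)
    return f.read(length)


buf = io.BytesIO()  # stands in for a file opened in binary mode
write_sstable(buf, {b"banana": b"yellow", b"apple": b"red"})
index = load_index(buf)
print(get(buf, index, b"apple"))
```

In this sketch a lookup touches the data section only once, to read the requested value; everything else is answered from the in-memory index. Because the file is never modified after write_sstable returns, the index stays valid for the lifetime of the file.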

References

  1. ^ "What is an SSTable in Google's internal infrastuct". http://hi.baidu.com/rsm219/home: rsm219的空间. 2010-12-03. http://hi.baidu.com/rsm219/blog/item/35642e81b3f03fc69123d992.html. Retrieved 2011-03-17. "SSTable stands for "Sorted String Table"." 
  2. ^ Adam D'Angelo. "What is an SSTable in Google's internal infrastructure?". http://www.quora.com/: Quora. http://www.quora.com/What-is-an-SSTable-in-Googles-internal-infrastructure. Retrieved 2011-03-17. "My understanding is that this is an on-disk file format representing a map from string to string. The (key, value) pairs are sorted by key, and written sequentially. At the end of the file an index is written which stores each key and the offset of its value. This way, only the index (the size of all the keys) needs to fit in memory to allow efficient lookups of any string in the table, even when the values might be much bigger than available memory. To load an SSTable for random access, a process will seek to the index, read it in (which can be a straight copy into memory or mmap), and only seek to values as necessary. SSTables are immutable, meaning that once they are written out, items cannot be added or removed from the map."