CAP theorem

In theoretical computer science, the CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:^[1]^[2]^[3]

Consistency (all nodes see the same data at the same time)
Availability (a guarantee that every request receives a response about whether it succeeded or failed)
Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

In 2012 Brewer clarified some of his positions, including why the oft-used "two out of three" concept can be misleading or misapplied, and the different definition of consistency used in CAP relative to the one used in ACID.^[4]

History

According to University of California, Berkeley computer scientist Eric Brewer, the theorem first appeared in autumn 1998.^[4] It was published as the CAP principle in 1999^[5] and presented as a conjecture by Brewer at the 2000 Symposium on Principles of Distributed Computing (PODC).^[6] In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem.^[1] This last claim has been criticized, however.^[7]

The proof of the CAP theorem by Gilbert and Lynch is somewhat narrower than what Brewer had in mind. The theorem sets up a scenario in which a replicated service is presented with two conflicting requests arriving at distinct locations at a time when the link between them has failed. The obligation to provide availability despite partitioning failures leads the service to respond at both locations; at least one of these responses will necessarily be inconsistent with what a service implementing true one-copy replication semantics would have done. The researchers then go on to show that other forms of consistency are achievable, including a property they call eventual consistency. Thus the theorem does not rule out achieving consistency in a distributed system, and it says nothing about cloud computing per se or about scalability.
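The scenario used in the proof can be illustrated with a small simulation. The following Python sketch is a toy model, not the construction from the Gilbert and Lynch paper: the names Replica, run_scenario and link_up are invented for illustration, and the "service" is a single replicated value held by two replicas.

```python
class Replica:
    """One copy of a replicated single-value register (toy model)."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.peers = []

    def write(self, value, link_up):
        # Availability: the replica always accepts the write and replies.
        self.value = value
        if link_up:
            # Replication reaches the peers only while the link works.
            for peer in self.peers:
                peer.value = value
        return "ok"

    def read(self):
        return self.value


def run_scenario(link_up):
    """Send two conflicting writes to different replicas, then read both."""
    a, b = Replica("a"), Replica("b")
    a.peers, b.peers = [b], [a]
    a.write("x=1", link_up)  # request arriving at the first location
    b.write("x=2", link_up)  # conflicting request at the second location
    return a.read(), b.read()


if __name__ == "__main__":
    print(run_scenario(link_up=True))   # both replicas agree on the last write
    print(run_scenario(link_up=False))  # the replicas have diverged
```

With the link up, both reads return the same value, as a one-copy service would. With the link down, each replica still answers, but the two answers differ. The only way to avoid this in the model is to have a replica refuse or delay its reply, which gives up availability.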

In contrast, Eric Brewer posited that CAP is often taken to preclude consistency for services running in the highly elastic first tier of a modern cloud computing system. These services are typically required to be stateless or to maintain only soft-state (cached data), and must be responsive even if inner-tier services are temporarily inaccessible. CAP, in this sense, uses "A" to mean immediately responsive, and "P" to mean "even if a failure prevents the first-tier service from connecting to some needed resource". In effect, consistency is sacrificed in order to gain faster responses in a more scalable manner.
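A first-tier service of this kind might look roughly like the following Python sketch. All names here (FirstTierService, InnerTierUnavailable, FlakyBackend) are hypothetical and chosen only for illustration; the sketch assumes the inner tier signals unreachability by raising an exception.

```python
class InnerTierUnavailable(Exception):
    """Raised by the (hypothetical) inner-tier client when it cannot be reached."""


class FirstTierService:
    def __init__(self, fetch):
        self.fetch = fetch  # callable: key -> value, may raise InnerTierUnavailable
        self.cache = {}     # soft state: safe to lose, possibly out of date

    def get(self, key):
        try:
            value = self.fetch(key)
        except InnerTierUnavailable:
            # Respond immediately from soft state instead of waiting.
            if key in self.cache:
                return self.cache[key], "stale"
            return None, "unavailable"
        self.cache[key] = value
        return value, "fresh"


class FlakyBackend:
    """Stand-in inner tier whose reachability can be toggled."""

    def __init__(self):
        self.data = {"price": 10}
        self.reachable = True

    def __call__(self, key):
        if not self.reachable:
            raise InnerTierUnavailable(key)
        return self.data[key]


backend = FlakyBackend()
service = FirstTierService(backend)
first = service.get("price")    # inner tier reachable: value is fetched and cached
backend.data["price"] = 12      # the authoritative value changes...
backend.reachable = False       # ...and the inner tier becomes unreachable
second = service.get("price")   # answered from cache, so possibly out of date
```

In this sketch the second call still returns promptly, but it reports the old price. That is the trade described above: the response is fast, and consistency with the inner tier is not guaranteed.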

CAP is often considered a justification for using weaker consistency models. Popular among these is BASE, an acronym for Basically Available Soft-state services with Eventual-consistency. In summary, the BASE methodology is characterized by high availability for first-tier services, leaving some kind of background cleanup mechanism to resolve any problems created by optimistic actions that later turn out to have violated consistency.
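One simple form such a background cleanup step could take is a "last writer wins" merge, sketched below in Python. This policy is an assumption made for the example and is not prescribed by BASE; the function name reconcile and the (timestamp, value) layout are likewise hypothetical.

```python
def reconcile(replica_a, replica_b):
    """Merge two key -> (timestamp, value) maps, keeping the newest write per key.

    On equal timestamps the entry from replica_a is kept.
    """
    merged = dict(replica_a)
    for key, (timestamp, value) in replica_b.items():
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged


# Two replicas that accepted writes independently and have diverged.
replica_a = {"cart": (1, ["book"]), "name": (5, "Ann")}
replica_b = {"cart": (3, ["book", "pen"])}

merged = reconcile(replica_a, replica_b)
```

After the merge, both replicas can adopt the merged map and agree again. Last writer wins silently discards the older of two conflicting writes, so real systems may need a merge rule that fits the data.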

Brewer’s 2012 article

“The ‘2 of 3’ view is misleading on several fronts. First, because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned. Second, the choice between C and A can occur many times within the same system at very fine granularity; not only can subsystems make different choices, but the choice can change according to the operation or even the specific data or user involved. Finally, all three properties are more continuous than binary. Availability is obviously continuous from 0 to 100 percent, but there are also many levels of consistency, and even partitions have nuances, including disagreement within the system about whether a partition exists.”^[4]

“In ACID, the C means that a transaction preserves all the database rules, such as unique keys. In contrast, the C in CAP refers only to single-copy consistency, a strict subset of ACID consistency. ACID consistency also cannot be maintained across partitions; partition recovery will need to restore ACID consistency. More generally, maintaining invariants during partitions might be impossible, thus the need for careful thought about which operations to disallow and how to restore invariants during recovery.”^[4]

References

[1] Seth Gilbert and Nancy Lynch, “Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services”, ACM SIGACT News, Volume 33, Issue 2 (2002), pp. 51-59.
[2] "Brewer's CAP Theorem", julianbrowne.com, Retrieved 02-Mar-2010.
[3] "Brewers CAP theorem on distributed systems", royans.net.
[4] Eric Brewer, “CAP twelve years later: How the "rules" have changed”, Computer (IEEE), Volume 45, Issue 2 (2012), pp. 23-29.
[5] Armando Fox and Eric Brewer, “Harvest, Yield and Scalable Tolerant Systems”, Proc. 7th Workshop on Hot Topics in Operating Systems (HotOS 99), IEEE CS, 1999, pp. 174-178.
[6] Eric Brewer, "Towards Robust Distributed Systems".
[7] Mark Burgess, "Deconstructing the `CAP theorem' for CM and DevOps".

External links

"A plain english introduction to CAP Theorem" Explains CAP and eventual consistency using examples.
"Consistency Models in Non-Relational Databases" Explains CAP Theorem and eventual consistency in distributed environments.
"Problems with CAP, and Yahoo's little known NoSQL system" Discusses PACELC, an alternative to CAP.
"Returning Transactions to Distributed Data Stores" Discusses the CAP Theorem as it applies to NoSQL systems.
"You can't sacrifice partition tolerance, by Codahale" Discusses practical implications of CAP Theorem in real world systems.
