Shared nothing architecture
A shared nothing architecture (SN) is a distributed computing architecture in which each node is independent and self-sufficient, and there is no single point of contention across the system. More specifically, none of the nodes share memory or disk storage. SN is typically contrasted with systems that keep a large amount of centrally stored state, whether in a database, an application server, or any other single point of contention.
Compared with a controller-based architecture, in which a central entity controls the network, the advantages of SN include eliminating single points of failure, allowing self-healing, and enabling non-disruptive upgrades.[1]
History
While SN is best known in the context of web development, the concept predates the web: Michael Stonebraker at the University of California, Berkeley used the term in a 1986 database paper.[2] In it he mentions existing commercial implementations of the architecture (although none are named explicitly). Teradata, which delivered its first system in 1983, was probably one of those commercial implementations.[3] Tandem Computers officially released NonStop SQL, a shared nothing database, in 1984.[4]
Applications
Shared nothing is popular for web development because of its scalability. As Google has demonstrated, a pure SN system can scale to very large sizes simply by adding nodes in the form of inexpensive computers, since there is no single bottleneck to slow the system down.[5] An SN system typically partitions its data among many nodes, each with its own database (assigning different computers to deal with different users or queries), or may require every node to maintain its own copy of the application's data, using some kind of coordination protocol. The partitioning approach is often referred to as database sharding, which is also the term Google uses for it.
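The partitioning idea can be illustrated with a minimal Python sketch. It is not taken from any particular system: the names `ShardNode`, `ShardedStore` and `node_for` are hypothetical, and hashing a key modulo the node count is only one simple placement rule among several.

```python
import hashlib


class ShardNode:
    """One shared-nothing node (hypothetical): it owns a private store
    that no other node reads or writes."""

    def __init__(self, name):
        self.name = name
        self.store = {}


class ShardedStore:
    """Routes each key to exactly one node by hashing the key."""

    def __init__(self, node_names):
        self.nodes = [ShardNode(name) for name in node_names]

    def node_for(self, key):
        # A stable hash (unlike the built-in hash()) gives the same
        # placement in every process and on every run.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def put(self, key, value):
        self.node_for(key).store[key] = value

    def get(self, key):
        return self.node_for(key).store.get(key)


cluster = ShardedStore(["node-a", "node-b", "node-c"])
for user in ["alice", "bob", "carol", "dave"]:
    cluster.put(user, {"name": user})
```

Each key lives on exactly one node, so a lookup touches only that node. With this modulo rule, changing the number of nodes remaps most keys; consistent hashing is a common way to reduce that movement.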
There is some doubt about whether a web application with many independent web nodes but a single, shared database (clustered or otherwise) should be counted as SN. One approach to achieving an SN architecture for stateful applications (which typically maintain state in a centralized database) is the use of a data grid, also known as distributed caching. However, the centralized database behind the grid remains a single point of failure.
Shared nothing architectures have become prevalent in data warehousing. There is much debate as to whether the shared nothing approach is superior to shared disk,[6] with sound arguments presented by both camps. Shared nothing architectures generally take longer to respond to queries that involve joins over large data sets from different partitions (machines), because rows must be moved between nodes before they can be joined. However, the potential for scaling by adding nodes is large.[7]
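The cost of a cross-partition join can be made concrete with a small Python sketch. It assumes two hypothetical tables, `customers` partitioned by `customer_id` and `orders` partitioned by `order_id`, and uses a simple repartition (shuffle) step before joining locally. Real warehouses use more elaborate strategies; all names here are illustrative.

```python
from collections import defaultdict


def partition(rows, key, n):
    """Place each row on node (row[key] % n)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[row[key] % n].append(row)
    return parts


def shuffle(parts, key):
    """Re-partition rows by `key` and count the rows that change node."""
    n = len(parts)
    out = [[] for _ in range(n)]
    moved = 0
    for src, rows in enumerate(parts):
        for row in rows:
            dst = row[key] % n
            if dst != src:
                moved += 1  # this row has to cross the network
            out[dst].append(row)
    return out, moved


def local_join(left, right, key):
    """Hash join of two row lists that live on the same node."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


NODES = 2
customers = [{"customer_id": i, "name": f"customer-{i}"} for i in range(4)]
orders = [{"order_id": i, "customer_id": c}
          for i, c in enumerate([0, 0, 1, 1, 2, 3])]

customer_parts = partition(customers, "customer_id", NODES)
order_parts = partition(orders, "order_id", NODES)

# Orders are not co-located with their customers, so they are moved first.
orders_by_customer, moved = shuffle(order_parts, "customer_id")

joined = []
for node in range(NODES):
    joined.extend(
        local_join(orders_by_customer[node], customer_parts[node], "customer_id")
    )
```

Had `orders` been partitioned by `customer_id` from the start, the shuffle would move no rows and every node could join its own data independently, which is why the choice of partitioning key matters for join-heavy workloads.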
What is shared?
While there is no single point of contention within the software/hardware components of SN systems, information from disparate nodes may still need to be reintegrated at some point. Such points occur wherever an information system that is outside the SN architecture queries information from disparate nodes within the SN architecture for a single purpose. Examples of such external nodes might be:
- persons who look at two SN nodes and decide that they hold or process data about the same thing (simply recognising that two nodes belong to the same SN system is sufficient)
- any software/hardware system that is written to query different nodes within the SN architecture
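The second kind of external node can be sketched in Python as a scatter-gather query: a client outside the SN system asks every node for a partial result and combines the answers itself. The class `StorageNode` and the function `scatter_gather_count` are hypothetical names used only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


class StorageNode:
    """A shared-nothing node (hypothetical) that only sees its own rows."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows

    def count_where(self, predicate):
        # Computed entirely from local data; no other node is consulted.
        return sum(1 for row in self._rows if predicate(row))


def scatter_gather_count(nodes, predicate):
    """External coordinator: query every node in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda n: n.count_where(predicate), nodes))
    # The merge step is the point where the nodes' information is reintegrated.
    return sum(partials)


nodes = [
    StorageNode("node-a", [{"user": "alice", "country": "DE"},
                           {"user": "bob", "country": "FR"}]),
    StorageNode("node-b", [{"user": "carol", "country": "DE"}]),
    StorageNode("node-c", []),
]
german_users = scatter_gather_count(nodes, lambda row: row["country"] == "DE")
```

The nodes never communicate with each other in this sketch. Only the coordinator, which sits outside the SN architecture, holds the combined view.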
See also
- Oracle RAC (Shared Everything)
- Ad hoc networking
- Ambient network
- Byzantine fault tolerance
- Comparison of P2P applications
- Computer cluster
- Decentralized computing
- Distributed hash table (DHT)
- EXASOL
- Greenplum
- Grid computing
- Calpont InfiniDB
- MongoDB
- MySQL Cluster
- OpenStack
- Overlay network
- Private peer-to-peer
- Sharding
- Swarm intelligence
References
- "The Advantages of a Shared Nothing Architecture for Truly Non-Disruptive Upgrades". solidfire.com. 2014-09-17. Retrieved 2015-04-21.
- The Case for Shared Nothing Architecture by Michael Stonebraker. Originally published in Database Engineering, Volume 9, Number 1 (1986). (PDF)
- "Teradata History". Teradata.com. Retrieved 2013-06-16.
- NonStop SQL, A Distributed, High-Performance, High-Availability Implementation of SQL, Tandem Technical Report TR-87.4, http://www.hpl.hp.com/techreports/tandem/TR-87.4.pdf
- Blankenhorn, Dana (February 27, 2006). "Shared nothing coming to open source". ZDNet. Retrieved June 21, 2012.
- Independent article comparing Shared Nothing and Shared Disk
- Article on Shared Nothing from the point of view of a Shared Nothing vendor (PDF)