Replication (computer science)
From Wikipedia, the free encyclopedia
Replication is the use of redundant resources, such as software or hardware components, to improve reliability, fault tolerance, or performance. It typically takes one of two forms: replication in space, in which the same data is stored on multiple storage devices or the same computing task is executed on multiple devices, and replication in time, in which a computing task is executed repeatedly on a single device.
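The difference between the two forms can be illustrated with a minimal Python sketch. The `Device` class, its `square` task, and both helper functions are hypothetical names introduced only for this illustration; devices are simulated in a single process and tried one after another rather than truly in parallel.

```python
class Device:
    """Hypothetical device that fails a fixed number of times, then works."""

    def __init__(self, failures_before_success):
        self.remaining_failures = failures_before_success

    def square(self, x):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise RuntimeError("transient fault")
        return x * x


def replicate_in_time(device, x, attempts):
    """Replication in time: repeat the same task on a single device."""
    for _ in range(attempts):
        try:
            return device.square(x)
        except RuntimeError:
            continue
    raise RuntimeError("all attempts failed")


def replicate_in_space(devices, x):
    """Replication in space: give the same task to several devices and
    use the first result that comes back successfully."""
    for device in devices:
        try:
            return device.square(x)
        except RuntimeError:
            continue
    raise RuntimeError("all devices failed")


in_time = replicate_in_time(Device(failures_before_success=2), 7, attempts=3)
in_space = replicate_in_space([Device(1), Device(0), Device(0)], 7)
```

In both cases the caller obtains a result despite individual failures; the cost is paid either in extra time (repeated attempts) or in extra hardware (additional devices).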
Replication in distributed systems
There are two approaches to replication in distributed systems: active and passive. In active replication, also known as state machine replication, the same request is processed at every replica. In passive replication, each request is usually processed on a single replica, whose resulting state is then transferred to the other replicas. If only one designated machine processes all requests, the scheme is called primary-backup. If, on the other hand, any machine can process a request, the scheme is called multi-primary, and some form of distributed concurrency control must be used.
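The contrast between active and passive replication can be sketched in a few lines of Python. This is a simplified, single-process illustration under stated assumptions: the `Counter` state machine and the functions `active_replicate` and `passive_replicate` are hypothetical names, requests are assumed to arrive in the same order everywhere, and failures are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    """A deterministic state machine: the same requests applied in the
    same order always produce the same state."""
    value: int = 0

    def apply(self, request: int) -> int:
        self.value += request
        return self.value

    def load_state(self, value: int) -> None:
        self.value = value


def active_replicate(replicas, request):
    """Active replication: every replica processes the request itself."""
    results = [replica.apply(request) for replica in replicas]
    # Deterministic replicas fed identical requests must agree.
    if len(set(results)) != 1:
        raise RuntimeError("replicas diverged")
    return results[0]


def passive_replicate(primary, backups, request):
    """Passive (primary-backup) replication: only the primary processes
    the request, then its resulting state is copied to the backups."""
    result = primary.apply(request)
    for backup in backups:
        backup.load_state(primary.value)
    return result


active = [Counter() for _ in range(3)]
primary, backups = Counter(), [Counter(), Counter()]
for req in (5, -2, 10):
    active_replicate(active, req)
    passive_replicate(primary, backups, req)
```

After the loop, all replicas in both groups hold the same value. The active variant spends computation on every replica, while the passive variant spends bandwidth on state transfer instead.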
Database replication
Database replication is available in many database management systems, usually with a master/slave relationship between the original and the copies. The master logs the updates, which then ripple through to the slaves. Each slave sends a message confirming that it has received an update successfully, which allows the master to send subsequent updates (and to re-send an update until it has been applied successfully). Multi-master replication, where updates can be submitted to any database node and then ripple through to the other servers, is often desired, but it introduces substantially higher cost and complexity, which may make it impractical in some situations. See also Coda and RAID.
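The log-and-acknowledge scheme can be sketched as follows. This is a toy in-memory model, not the protocol of any particular database: the `Master` and `Slave` classes, the `loss_rate` parameter, and the sequence-number convention are all assumptions made for the illustration, and message loss is simulated with a seeded random generator.

```python
import random


class Slave:
    def __init__(self, loss_rate, rng):
        self.data = {}
        self.applied = 0              # number of log entries applied so far
        self._loss_rate = loss_rate
        self._rng = rng

    def receive(self, seq, key, value):
        """Return True to acknowledge entry `seq`, False if it was lost."""
        if self._rng.random() < self._loss_rate:
            return False              # simulated loss: nothing is applied
        if seq == self.applied:       # apply strictly in log order
            self.data[key] = value
            self.applied += 1
        return seq < self.applied     # acknowledge applied entries only


class Master:
    def __init__(self, slaves):
        self.data = {}
        self.log = []                 # ordered list of (key, value) updates
        self.slaves = slaves
        self.acked = [0] * len(slaves)  # next unacknowledged entry per slave

    def update(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def propagate(self, max_attempts=10_000):
        """Send each slave its missing log entries, re-sending until acked."""
        for i, slave in enumerate(self.slaves):
            attempts = 0
            while self.acked[i] < len(self.log):
                if attempts >= max_attempts:
                    raise RuntimeError("slave unreachable")
                attempts += 1
                seq = self.acked[i]
                key, value = self.log[seq]
                if slave.receive(seq, key, value):
                    self.acked[i] += 1  # move on to the subsequent update


rng = random.Random(0)
slaves = [Slave(loss_rate=0.3, rng=rng) for _ in range(2)]
master = Master(slaves)
master.update("a", 1)
master.update("b", 2)
master.update("a", 3)
master.propagate()
```

Because the master advances only after an acknowledgement, every slave applies the updates in log order and ends up with the same contents as the master, even though some transmissions are lost along the way.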
File system replication
Active (real-time) file system replication is usually implemented by distributing updates of a virtual block device to several physical hard disks. This way, any file system supported by the operating system can be replicated without modification, because the file system code works at a level above the block device layer. The most popular method of file system replication is RAID, which is typically limited to locally connected disks.
Alternatively, updates to a block device can be replicated (that is, distributed) over a computer network. This has the advantage that the replication slaves can be placed in physically distant locations, which limits the damage done by local failures or disasters and improves availability when they occur. An example of this kind of replication is the DRBD module for Linux.
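The block-level approach described in this section can be sketched as a virtual device that forwards every write to several backing disks. The sketch below is an in-memory illustration only: `MemoryDisk`, `MirroredDevice`, and `BLOCK_SIZE` are hypothetical names, and it does not reflect how RAID or DRBD are actually implemented (in particular, it makes no attempt to resynchronise a disk that comes back after a failure).

```python
BLOCK_SIZE = 512


class MemoryDisk:
    """Stand-in for a physical disk (local, or reached over a network)."""

    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.failed = False

    def write_block(self, index, data):
        if self.failed:
            raise IOError("disk failed")
        self.blocks[index] = data

    def read_block(self, index):
        if self.failed:
            raise IOError("disk failed")
        return self.blocks[index]


class MirroredDevice:
    """Virtual block device that distributes each update to all disks."""

    def __init__(self, disks):
        self.disks = disks

    def write_block(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        written = 0
        for disk in self.disks:
            try:
                disk.write_block(index, data)
                written += 1
            except IOError:
                continue              # skip failed replicas
        if written == 0:
            raise IOError("no replica available")

    def read_block(self, index):
        for disk in self.disks:
            try:
                return disk.read_block(index)
            except IOError:
                continue              # fall back to the next replica
        raise IOError("no replica available")


disks = [MemoryDisk(num_blocks=8), MemoryDisk(num_blocks=8)]
device = MirroredDevice(disks)
payload = b"x" * BLOCK_SIZE
device.write_block(3, payload)
disks[0].failed = True                # simulate the loss of one disk
recovered = device.read_block(3)
```

Code layered above `MirroredDevice` sees only `read_block` and `write_block`, which is why a file system built on such a device needs no modification to be replicated.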
Distributed shared memory replication
Replication also appears in distributed shared memory systems, where many nodes of the system may share the same page of memory. This usually means that each node holds a separate copy (replica) of the page.
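A minimal sketch of page replicas in a distributed shared memory system is given below. The article does not say how the copies are kept consistent; the sketch assumes a simple write-invalidate scheme with a central directory, which is only one of several possible designs. The `Directory` and `Node` classes are hypothetical names used for illustration.

```python
class Directory:
    """Tracks the current contents of each page and which nodes hold a copy."""

    def __init__(self):
        self.contents = {}   # page number -> latest page contents
        self.holders = {}    # page number -> set of nodes holding a replica

    def fetch(self, page_no, node):
        self.holders.setdefault(page_no, set()).add(node)
        return self.contents.get(page_no, b"")

    def write(self, page_no, data, writer):
        # Assumed write-invalidate policy: drop every other node's replica.
        for node in self.holders.get(page_no, set()) - {writer}:
            node.pages.pop(page_no, None)
        self.contents[page_no] = data
        self.holders[page_no] = {writer}


class Node:
    def __init__(self, directory):
        self.directory = directory
        self.pages = {}      # local replicas: page number -> contents

    def read(self, page_no):
        if page_no not in self.pages:   # no local copy: fetch a replica
            self.pages[page_no] = self.directory.fetch(page_no, self)
        return self.pages[page_no]

    def write(self, page_no, data):
        self.directory.write(page_no, data, self)
        self.pages[page_no] = data


directory = Directory()
a, b, c = Node(directory), Node(directory), Node(directory)
a.write(0, b"v1")
first_reads = (b.read(0), c.read(0))          # b and c now hold replicas
replicas_after_reads = len(directory.holders[0])
b.write(0, b"v2")                              # invalidates a's and c's copies
stale_copy_removed = 0 not in a.pages
latest = a.read(0)                             # a fetches a fresh replica
```

After the two reads, three nodes hold a copy of page 0; the subsequent write leaves a single valid copy until the other nodes read the page again.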
Replication transparency
If a resource is replicated among several locations, it should appear to the user as a single resource.
See also
- Cluster
- Fault tolerant system
- Object group
- Process group
- Transaction
- Transparency (computing)
- Data replication
External links
- Article "Practical Considerations in Making CORBA Services Fault-Tolerant" by Priya Narasimhan
- Article "Experiences, Strategies and Challenges in Building Fault-Tolerant CORBA Systems" by Pascal Felber and Priya Narasimhan