Data synchronization
From Wikipedia, the free encyclopedia
Data synchronization is the process of establishing consistency among data on remote sources. It is fundamental to a wide variety of applications, including file synchronization,[1] Personal Digital Assistant synchronization,[2] and Public Key Server synchronization.[3]
Contents |
[edit] Theoretical models
Several theoretical models of data synchronization exist in the research literature, and the problem is also related to problem of Slepian-Wolf coding in information theory. The models are classified based on how they consider the data to be synchronized.
[edit] Unordered data
The problem of synchronizing unordered data (also known as the set reconciliation problem) is modeled as an attempt to compute the symmetric difference between two remote sets SA and SB of b-bit numbers.[4] Some solutions to this problem are typified by:
- Wholesale transfer
- In this case all data is transferred to one host for a local comparison.
- Timestamp synchronization
- In this case all changes to the data are marked with timestamps. Synchronization proceeds by transferring all data with a timestamp later than the previous synchronization.[5]
- Mathematical synchronization
- In this case data are treated as mathematical objects and synchronization corresponds to a mathematical process.[6][7][8]
[edit] Ordered data
In this case, two remote strings σA and σB need to be reconcilied. Typically, it is assumed that these strings differ by up to a fixed number of edits (i.e. character insertions, deletions, or modifications). Some solution approaches to this problem include:
- rsync
- shingling - splitting the strings into shingles in order to reduce this problem into an unordered synchronization problem.[9]
[edit] Notes
- ^ A. Tridgell. Efficient algorithms for sorting and synchronization. PhD thesis, The Australian National University, 2000.
- ^ S. Agarwal, D. Starobinski, and A. Trachtenberg, On the Scalability of Data Synchronization Protocols for PDAs and Mobile Devices, IEEE Network Magazine: Scalability in Communication Networks, July 2002.
- ^ sks.dnsalias.net
- ^ Y. Minsky, A. Trachtenberg, and R. Zippel. Set reconciliation with nearly optimal communication complexity. IEEE Trans. on Info. Theory, September 2003.
- ^ Palm developer knowledgebase manuals
- ^ Y. Minsky, A. Trachtenberg, and R. Zippel. Set reconciliation with nearly optimal communication complexity. IEEE Trans. on Info. Theory, September 2003.
- ^ A. Trachtenberg, D. Starobinski, and S. Agarwal, Fast PDA Synchronization Using Characteristic Polynomial Interpolation, IEEE INFOCOM 2002
- ^ Y. Minsky and A. Trachtenberg, Scalable set reconciliation, Allerton Conference on Communication, Control, and Computing, Oct. 2002
- ^ S. Agarwal, V. Chauhan, and A. Trachtenberg. Bandwidth efficient string reconciliation using puzzles. IEEE Transactions on Parallel and Distributed Systems, 17(11):1217{1226, November 2006.