Talk:Three-phase commit protocol
From Wikipedia, the free encyclopedia
The protocol currently presented on the page conforms to the Skeen article, which differs slightly from the description given at [1]. Specifically, in the original article the cohort's state transition from prepared to committed happens only when it receives a commit message from the coordinator. Was there a change to the protocol in the meantime?
I reformatted the protocol description at the bottom of the page to look similar to two-phase_commit. Hope nobody minds. gba 18:56, 4 March 2006 (UTC)
Atomicity reliability
Since this is the first time I have posted to a Wikipedia discussion, I won't edit the article myself. I suggest this to the original author. Please correct me if I'm wrong!
You could make the two-phase commit protocol non-blocking in the same way as with the three-phase: by introducing timeouts. The problem with both the blocking and the non-blocking variant is the same: you can never be sure of the atomicity.
Consider the [2]. If Cohort(i) sends an ACK message that gets lost because the link to the Cohort breaks, the Cohort will time out and commit the local transaction, while the Coordinator, not having received the ACK, will time out and abort. Even if the link is restored, you can't later abort (roll back) the part already committed on the Cohort.
The basic problem with most kinds of commit protocols is called the Two_Generals'_Problem. If you add more and more layers of acknowledgements (acknowledgements of acknowledgements), the system gets more reliable but never perfect. On the downside, execution gets slower with every layer.
Regards, Igorecan 13:14, 11 April 2006 (UTC)
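The scenario in the comment above can be sketched as a toy model. This is only an illustration of the claim as stated, not an implementation of either protocol: the two timeout rules (a prepared cohort commits on timeout, a coordinator missing an ACK aborts on timeout) are assumptions taken from the comment, and the names `State`, `cohort_on_timeout`, `coordinator_on_timeout` and `run_scenario` are hypothetical.

```python
from enum import Enum
from typing import Tuple


class State(Enum):
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


def cohort_on_timeout(state: State) -> State:
    # Assumed rule from the comment: a prepared cohort that hears nothing commits.
    return State.COMMITTED if state is State.PREPARED else state


def coordinator_on_timeout(ack_received: bool) -> State:
    # Assumed rule from the comment: a coordinator that never got the ACK aborts.
    return State.COMMITTED if ack_received else State.ABORTED


def run_scenario(ack_delivered: bool) -> Tuple[State, State]:
    """Return (cohort outcome, coordinator outcome) for a single cohort."""
    cohort = State.PREPARED
    # The cohort sends its ACK and then hears nothing more before its timeout.
    cohort = cohort_on_timeout(cohort)
    coordinator = coordinator_on_timeout(ack_delivered)
    return cohort, coordinator
```

Under these assumed rules, `run_scenario(ack_delivered=False)` leaves the cohort committed and the coordinator aborted, which is the divergence the comment describes.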
Regarding your comments, Igorecan, I have to disagree. In the situation you mention, in 2PC with timeouts, this is how I believe it would go (according to my reading of Lampson93):
- Cohort(i) sends an ACK message that gets lost. By the time any cohort can send an ACK, the coordinator has already decided whether the transaction is committing or aborting. So in your scenario this Cohort is ACKing a COMMIT message.
- The Coordinator, knowing that this is a committed transaction, will time out waiting for Cohort(i)'s response and will send the COMMIT message to Cohort(i) again.
- Cohort(i), upon receiving a COMMIT message for a transaction that has already been committed, will know its ACK message was lost and will resend the ACK.
- This process might repeat many times until both the COMMIT message and the ACK message have been transmitted successfully.
Thanks, Nels Beckman 14:28, 7 September 2006 (UTC)
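The resend-until-ACK exchange in the list above can be sketched as follows. This is a minimal simulation under stated assumptions, not code from Lampson93: the `Cohort` class, `coordinator_commit` and the `ack_lost` list (a stand-in for a lossy link plus a coordinator timeout) are hypothetical, and only lost ACKs are modelled, not lost COMMIT messages.

```python
from typing import List


class Cohort:
    def __init__(self) -> None:
        self.committed = False
        self.commit_count = 0  # how many times the local commit actually ran

    def on_commit(self) -> str:
        # Committing is idempotent: a duplicate COMMIT only triggers a new ACK.
        if not self.committed:
            self.committed = True
            self.commit_count += 1
        return "ACK"


def coordinator_commit(cohort: Cohort, ack_lost: List[bool]) -> int:
    """Resend COMMIT until an ACK arrives; return the number of sends.

    ack_lost[i] says whether the ACK for the i-th send is dropped.
    """
    sends = 0
    for lost in ack_lost:
        sends += 1
        reply = cohort.on_commit()
        if reply == "ACK" and not lost:
            return sends  # ACK received, the coordinator is done
        # Otherwise the coordinator times out and resends COMMIT.
    raise TimeoutError("no ACK received within the simulated attempts")
```

With `ack_lost = [True, True, False]` the coordinator sends COMMIT three times, while the cohort's local commit runs only once. The point of the sketch is that the decision is already fixed before any ACK is sent, so a lost ACK costs retries but does not change the outcome.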