Transaction processing

From Wikipedia, the free encyclopedia

In computer science, transaction processing is information processing that is divided into individual, indivisible operations, called transactions. Each transaction must succeed or fail as a complete unit; it cannot remain in an intermediate state.

Contents

[edit] Description

Transaction processing is designed to maintain a database in a known, consistent state, by ensuring that any operations carried out on the database that are interdependent are either all completed successfully or all cancelled successfully.

For example, consider a typical banking transaction that involves moving £500 from a customer's savings account to a customer's checking account. This transaction is a single operation in the eyes of the bank, but it involves at least two separate operations in computer terms: debiting the savings account by £500, and crediting the checking account by £500. If the debit operation succeeds but the credit does not (or vice versa), the books of the bank will not balance at the end of the day. There must therefore be a way to ensure that either both operations succeed or both fail, so that there is never any inconsistency in the bank's database as a whole. Transaction processing is designed to provide this.

Transaction processing allows multiple individual operations on a database to be linked together automatically as a single, indivisible transaction. The transaction-processing system ensures that either all operations in a transaction are completed without error, or none of them are. If some of the operations are completed but errors occur when the others are attempted, the transaction-processing system “rolls back” all of the operations of the transaction (including the successful ones), thereby erasing all traces of the transaction and restoring the database to the consistent, known state that it was in before processing of the transaction began. If all operations of a transaction are completed successfully, the transaction is “committed” by the system, and all changes to the database are made permanent; the transaction cannot be rolled back once this is done.

Transaction processing guards against hardware and software errors that might leave a transaction partially completed, with a database left in an unknown, inconsistent state. If the computer system crashes in the middle of a transaction, the transaction processing system guarantees that all operations in any uncommitted (i.e., not completely processed) transactions are cancelled.

Transactions are processed in a strict chronological order. If transaction n+1 touches the same portion of the database as transaction n, transaction n+1 does not begin until transaction n is committed. Before any transaction is committed, all other transactions affecting the same part of the database must also be committed; there can be no “holes” in the sequence of preceding transactions.

[edit] Methodology

The basic principles of all transaction-processing systems are the same. However, the terminology may vary from one transaction-processing system to another, and the terms used below are not necessarily universal.

[edit] Rollback

Transaction-processing systems ensure database integrity by recording intermediate states of the database as it is modified, then using these records to restore the database to a known state if a transaction cannot be committed. For example, copies of information on the database prior to its modification by a transaction are set aside by the system before the transaction can make any modifications (this is sometimes called a before image). If any part of the transaction fails before it is committed, these copies are used to restore the database to the state it was in before the transaction began (rollback).

[edit] Rollforward

It is also possible to keep a separate journal of all modifications to a database (sometimes called after images); this is not required for rollback of failed transactions, but it is useful for updating the database in the event of a database failure, so some transaction-processing systems provide it. If the database fails entirely, it must be restored from the most recent back-up. The back-up will not reflect transactions committed since the back-up was made. However, once the database is restored, the journal of after images can be applied to the database (rollforward) to bring the database up to date. Any transactions in progress at the time of the failure can then be rolled back. The result is a database in a consistent, known state that includes the results of all transactions committed up to the moment of failure.

[edit] Deadlocks

In some cases, two transactions may, in the course of their processing, attempt to access the same portion of a database at the same time, in a way that prevents them from proceeding. For example, transaction A may access portion X of the database, and transaction B may access portion Y of the database. If, at that point, transaction A then tries to access portion Y of the database while transaction B tries to access portion X, a deadlock occurs, and neither transaction can move forward. Transaction-processing systems are designed to detect these deadlocks when they occur. Typically both transactions will be cancelled and rolled back, and then they will be started again in a different order, automatically, so that the deadlock doesn't occur again. Or sometimes, just one of the deadlocked transactions will be cancelled, rolled back, and automatically re-started after a short delay.

Deadlocks can also occur between three or more transactions. The more transactions involved, the more difficult they are to detect, to the point that transaction processing systems find there is a practical limit to the deadlocks they can detect.

[edit] ACID criteria

Main article: ACID

There are many minor variations on the exact methods used to protect database consistency in a transaction-processing system, but the basic principles remain the same. All transaction-processing systems support these functions, which are referred to as the ACID properties: atomicity, consistency, isolation, and durability.

[edit] Implementations

Standard transaction-processing software, notably IBM's Information Management System, was first developed in the 1960s, and was often closely coupled to particular database management systems. Client-server computing implemented similar principles in the 1980s with mixed success. However, in more recent years, the distributed client-server model has become considerably more difficult to maintain. As the number of transactions grew in response to various online services (especially the Web), a single distributed database was not a practical solution. In addition, most online systems consist of a whole suite of programs operating together, as opposed to a strict client-server model where the single server could handle the transaction processing. Today a number of transaction processing systems are available that work at the inter-program level and which scale to large systems, including mainframes.

An important open industry standard is the X/Open Distributed Transaction Processing (DTP) (see JTA). However, proprietary transaction-processing environments such as IBM's CICS are still very popular, although CICS has evolved to include open industry standards as well.

[edit] See also

[edit] Books

  • Jim Gray, Andreas Reuter, Transaction Processing - Concepts and Techniques, 1993, Morgan Kaufmann, ISBN 1-55860-190-2
  • Philip A. Bernstein, Eric Newcomer, Principles of Transaction Processing, 1997, Morgan Kaufmann, ISBN 1-55860-415-4
  • Ahmed K. Elmagarmid (Editor), Transaction Models for Advanced Database Applications, Morgan-Kaufmann, 1992, ISBN 1-55860-214-3