Friday, November 25, 2005

Ruminating on Transactions

Most of us have worked with transactions at one time or another. Transactions can be broadly classified as local or distributed.
A local transaction is the simplest form: the application performs CRUD operations against a single data source or database. This kind of relational database access involves only the application, a resource manager, and a resource adapter. The resource manager can be a relational database management system (RDBMS), such as Oracle or SQL Server, and it handles all of the actual database management.
The resource adapter is the component that is the communications channel, or request translator, between the "outside world," in this case the application, and the resource manager. In Java applications, this is a JDBC driver.
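
To make the local case concrete, here is a minimal sketch of a local transaction over plain JDBC. The LocalTx class and its Work interface are hypothetical helpers invented for this illustration; only the java.sql.Connection calls (getAutoCommit, setAutoCommit, commit, rollback) are standard JDBC. It assumes the caller already holds an open connection to a single database.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical helper (not part of JDBC): runs a unit of work as one local transaction.
final class LocalTx {

    /** The CRUD statements to execute against the single database. */
    @FunctionalInterface
    interface Work {
        void run(Connection conn) throws SQLException;
    }

    /** Returns true if the work was committed, false if it was rolled back. */
    static boolean runInTransaction(Connection conn, Work work) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);       // group the following statements into one transaction
        try {
            work.run(conn);
            conn.commit();               // make all changes permanent together
            return true;
        } catch (SQLException e) {
            conn.rollback();             // undo every change made by this unit of work
            return false;
        } finally {
            conn.setAutoCommit(previousAutoCommit);  // restore the caller's setting
        }
    }
}
```

With a real driver on the classpath, the connection would typically come from DriverManager.getConnection(url, user, password), where the URL format depends on the driver. Note that no transaction manager appears anywhere: the resource manager alone decides the outcome.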

In a distributed transaction, the transaction accesses and updates data on two or more networked resources, and therefore must be coordinated among those resources.
These resources could consist of several different RDBMSs housed on a single server, for example, Oracle, SQL Server, and Sybase; or they could include several instances of a single type of database residing on a number of different servers. In any case, a distributed transaction involves coordination among the various resource managers. This coordination is the function of the transaction manager. The transaction manager is responsible for making the final decision either to commit or to roll back any distributed transaction. A commit decision should lead to a successful transaction; a rollback leaves the data in the database unaltered.

The first step of the distributed transaction process is for the application to send a request for the transaction to the transaction manager. Although the final commit/rollback decision treats the transaction as a single logical unit, there can be many transaction branches involved. A transaction branch is associated with a request to each resource manager involved in the distributed transaction. Requests to three different RDBMSs, therefore, require three transaction branches. Each transaction branch must be committed or rolled back by the local resource manager. The transaction manager controls the boundaries of the transaction and is responsible for the final decision as to whether the total transaction should commit or roll back. This decision is made in two phases, a procedure known as the Two-Phase Commit Protocol.
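
One way to picture transaction branches in Java is through the javax.transaction.xa.Xid interface, which identifies a branch by a global transaction ID plus a branch qualifier. The sketch below is only an illustration: BranchXid and Branches are hypothetical names, the format ID of 1 is an arbitrary choice, and using the resource name as the branch qualifier is an assumption made for readability, not how a real transaction manager generates identifiers.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.Xid;

// Hypothetical Xid implementation: one global transaction ID shared by all branches,
// plus a qualifier that is unique to each branch.
final class BranchXid implements Xid {
    private final byte[] globalId;
    private final byte[] branchQualifier;

    BranchXid(String globalId, String branchQualifier) {
        this.globalId = globalId.getBytes(StandardCharsets.UTF_8);
        this.branchQualifier = branchQualifier.getBytes(StandardCharsets.UTF_8);
    }

    @Override public int getFormatId() { return 1; }  // arbitrary value chosen for this sketch
    @Override public byte[] getGlobalTransactionId() { return globalId.clone(); }
    @Override public byte[] getBranchQualifier() { return branchQualifier.clone(); }
}

// Hypothetical factory: creates one branch per resource manager.
final class Branches {
    static List<Xid> forResources(String globalId, List<String> resourceNames) {
        List<Xid> branches = new ArrayList<>();
        for (String name : resourceNames) {
            branches.add(new BranchXid(globalId, name));
        }
        return branches;
    }
}
```

Calling Branches.forResources with a single global ID and the three names "oracle", "sqlserver", and "sybase" yields three Xid objects that share the global transaction ID but differ in their branch qualifiers, mirroring the three branches described above.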

In the first phase, the transaction manager polls all of the resource managers (RDBMSs) involved in the distributed transaction to see if each one is ready to commit. If a resource manager cannot commit, it responds negatively and rolls back its particular part of the transaction so that data is not altered.

In the second phase, the transaction manager determines whether any of the resource managers have responded negatively and, if so, rolls back the whole transaction. If there are no negative responses, the transaction manager commits the whole transaction and returns the results to the application.
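
The two phases can be sketched as a small in-memory simulation. Everything here is hypothetical and simplified: the ResourceManager interface, InMemoryResource, and TwoPhaseCoordinator are names invented for this illustration, the resources are plain objects rather than real databases, and failure handling such as timeouts, logging, and crash recovery is left out entirely.

```java
import java.util.List;

// Hypothetical view of a resource manager as seen by the transaction manager.
interface ResourceManager {
    boolean prepare();   // phase 1: vote yes (true) or no (false)
    void commit();       // phase 2: make the branch permanent
    void rollback();     // phase 2: discard the branch
}

// Hypothetical stand-in for an RDBMS; canCommit decides how it votes.
final class InMemoryResource implements ResourceManager {
    private final boolean canCommit;
    private String state = "ACTIVE";

    InMemoryResource(boolean canCommit) { this.canCommit = canCommit; }

    @Override public boolean prepare() {
        // A resource that cannot commit rolls back its own branch immediately.
        state = canCommit ? "PREPARED" : "ROLLED_BACK";
        return canCommit;
    }
    @Override public void commit() { state = "COMMITTED"; }
    @Override public void rollback() { state = "ROLLED_BACK"; }

    String state() { return state; }
}

// Hypothetical transaction manager implementing the two phases described above.
final class TwoPhaseCoordinator {
    /** Returns true if the whole transaction committed, false if it rolled back. */
    static boolean run(List<? extends ResourceManager> branches) {
        // Phase 1: poll every resource manager for its vote.
        boolean allReady = true;
        for (ResourceManager rm : branches) {
            if (!rm.prepare()) {
                allReady = false;
            }
        }
        // Phase 2: commit only if nobody voted no; otherwise roll back everything.
        for (ResourceManager rm : branches) {
            if (allReady) {
                rm.commit();
            } else {
                rm.rollback();
            }
        }
        return allReady;
    }
}
```

Running the coordinator over two resources that can both commit leaves each in the COMMITTED state. If any one of them votes no, every branch ends up ROLLED_BACK, including branches that had already prepared successfully.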