Wednesday, December 14, 2005

Locking in Databases

Databases support different types of locking strategies. We have optimistic locking strategies where concurrency would be more, but data integrity would be less. We have pessimistic locking strategies where concurrency is less, but there is more data integrity. Basically when it comes to locking and transaction isolation levels, its a trade-off between data integrity and concurrency.

Databases supports various levels of lock granularity. The lowest level is a single row. Sometimes, the database doesn't have enough resources to lock each individual
row. In such cases, the database can acquire locks on a single data or index page, group of pages, or an entire table. The granularity of locks depends on the memory available to the database. Servers with more memory can support more concurrent users because they can acquire and release more locks.

We can give locking hints to the database while sending queries that would help us override locking decisions made by the database. For instance, in SQL Server. we can specify the ROWLOCK hint with the UPDATE statement to convince SQL Server to lock each row affected by that data modification.

While dealing with transactions, the type of locking used depends on the Transaction Isolation Level specified for that particular transaction.
SQL Server supports implicit and explicit transactions. By default, each INSERT, UPDATE, and DELETE statement runs within an implicit transaction.Explicit transactions, on the other hand, must be specified by the programmer. Such transactions are included in
BEGIN TRANSACTION ... COMMIT TRANSACTION block

There are two common types of locks present in all databases- Shared locks and Exclusive locks.
Shared locks (S) allow transactions to read data with SELECT statements. Other connections are allowed to read the data at the same time; however, no transactions are allowed to modify data until the shared locks are released.
Exclusive locks (X) completely lock the resource from any type of access including reads. They are issued when data is being modified through INSERT, UPDATE and DELETE statements

Locking/Transaction Isolation levels solve the following concurrency problems:
  • Lost Updates: Lost updates occur when two or more transactions select the same row and then update the row based on the value originally selected. Each transaction is unaware of other transactions. The last update overwrites updates made by the other transactions, which results in lost data.
  • Dirty Read: Uncommitted dependency occurs when a second transaction selects a row that is being updated by another transaction. The second transaction is reading data that has not been committed yet and may be changed by the transaction updating the row.
  • Nonrepeatable Read: Inconsistent analysis occurs when a second transaction accesses the same row several times and reads different data each time.
  • Phantom reads: These occur when an insert or delete action is performed against a row that belongs to a range of rows being read by a transaction. The transaction's first read of the range of rows shows a row that no longer exists in the second or succeeding read, as a result of a deletion by a different transaction. Similarly, as the result of an insert by a different transaction, the transaction's second or succeeding read shows a row that did not exist in the original read.

These concurrency problems can be solved using the correct transaction isolation level in our programs. To balance between high concurrency and data integrity, we can choose from the following isolation levels:

  • READ UNCOMMITTED: This is the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read. Does not protect against dirty read, phantom read or non-repeatable reads.
  • READ COMMITTED: The shared lock is held for the duration of the transaction, meaning that no other transactions can change the data at the same time. Protects against dirty reads, but the other problems remain.
  • REPEATABLE READ: This setting disallows dirty and non-repeatable reads. But phantom reads are still possible.
  • SERIALIZABLE: This is the highest level, where transactions are completely isolated from one another. Protects against dirty read, phantom read and non-repeatable read.

There are other types of locks available in the database such as 'update lock', 'intent lock', 'schema lock' etc. The update lock prevents deadlocks in the database. For more information about these check out the following links:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/acdata/ac_8_con_7a_6fhj.asp

http://www.awprofessional.com/articles/article.asp?p=26890&rl=1