Database Locks, Two-Phase Locking, and Serializability | Teradata® VantageCloud Lake

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Last Edited
2024-12-11

Database Locks Defined

A lock is a device, typically implemented as a form of semaphore, that relational database management systems use to manage concurrent access to database objects by interleaved transactions running in multiple parallel sessions.

The information contained in a database lock includes the following:
  • Identity of the database object being locked
  • Identity of the transaction holding the lock
  • Lock level
  • Lock severity

The level and severity of a database lock assure the transaction that the objects for which the transaction has requested locks are isolated to the desired level from changes made by concurrently running transactions on those objects.

Database locks can be placed at several granularities: on an entire table, on individual row partitions of a row-partitioned table, on individual rowhash values within individual row partitions, or on individual rowhash values.
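The contents of a lock listed above can be pictured as a simple record. The following Python sketch is purely illustrative; the field names and the granularity values are assumptions for demonstration, not Vantage internals:

```python
from dataclasses import dataclass

# Illustrative only: field names and value sets are assumptions,
# not a description of Vantage's internal lock representation.
@dataclass(frozen=True)
class DatabaseLock:
    object_id: str       # identity of the database object being locked
    transaction_id: str  # identity of the transaction holding the lock
    level: str           # granularity: e.g., "table", "partition", "rowhash"
    severity: str        # "ACCESS", "READ", "WRITE", or "EXCLUSIVE"

lock = DatabaseLock("accnt_rec", "tx-42", "rowhash", "WRITE")
print(lock.level, lock.severity)
```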

Two-Phase Locking Defined

A locking protocol is defined as two-phase if it does not request any additional lock for a transaction after any lock that transaction holds has been released. This locking protocol is the foundation of serializability.

The phases are as follows:
  • The growing phase, during which locks on database objects are acquired.
  • The shrinking phase, during which the previously acquired locks are released.

This is called the Two-Phase Rule.

In practice, the shrinking phase occurs when the transaction commits or finishes rolling back, at which point all of its database locks are released.
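The Two-Phase Rule can be sketched as a small state machine: a transaction may acquire locks only until it releases its first lock (or commits), after which any further acquisition is an error. This Python toy is illustrative only and does not model Vantage's lock manager:

```python
class TwoPhaseTransaction:
    """Toy transaction enforcing the Two-Phase Rule: once any lock is
    released, no further lock may be acquired. Illustrative only."""

    def __init__(self):
        self.held = set()
        self.shrinking = False  # flips to True at the first release

    def acquire(self, obj):
        if self.shrinking:
            raise RuntimeError("Two-Phase Rule violated: acquire after release")
        self.held.add(obj)       # growing phase

    def release(self, obj):
        self.shrinking = True    # shrinking phase begins
        self.held.discard(obj)

    def commit(self):
        # In practice, all locks are released when the transaction commits.
        self.shrinking = True
        self.held.clear()

tx = TwoPhaseTransaction()
tx.acquire("table_A")
tx.acquire("table_B")
tx.commit()
try:
    tx.acquire("table_C")
except RuntimeError as e:
    print(e)
```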

The principal concurrency problems that two-phase locking (2PL) prevents are the classic anomalies of transaction processing, typically named as follows:
  • The lost update problem.
  • The uncommitted dependency problem.
  • The inconsistent analysis problem.
  • The dirty read problem.
  • The deadlock problem (see Deadlock and Proxy Locks).
The following additional problems can also occur:
  • Increased system overhead to administer locking.
  • Decreased concurrency.

    The impact of the lower concurrency that locks introduce is reduced greatly in a system like Vantage that supports multiple levels of locking granularity.

See any standard transaction processing textbook for more information about these concurrency problems.

Serializability Defined

The property of concurrent database accesses by transactions such that any arbitrary serial running of those transactions preserves the integrity of the database is called serializability. The following definition of serializability is equivalent: although a given set of transactions runs concurrently, it appears to each transaction T in the set that every other member transaction runs either before T or after T, but not both (paraphrased from Gray and Reuter, 1993).

Vantage ensures serializability for nontemporal transactions if the current isolation level for the session is SERIALIZABLE (see SET SESSION TRANSACTION ISOLATION LEVEL and ACID Properties of Transactions for details).

For example, suppose table_A is a checking accounts table and table_B is a savings accounts table. Suppose one transaction, Tx1, must move 400.00 USD from the checking account of a bank customer to the savings account of the same customer. Suppose another concurrently running transaction, Tx2, performs a credit check on the same bank customer.

For the sake of illustration, assume the three states of the two accounts seen in the following table:
State   Checking Account Amount   Savings Account Amount
1       $900.00                   $100.00
2       $500.00                   $100.00
3       $500.00                   $500.00
  • State 1 depicts the initial state of the accounts.
  • State 2 depicts an intermediate condition.

    Tx1 has withdrawn $400.00 from the checking account, but has not yet deposited the funds in the savings account.

  • State 3 depicts the final state of the accounts.

    Tx1 has deposited the $400.00 withdrawn from the checking account into the savings account.

Without two-phase locking, Tx2 can read the two accounts at state 2 and conclude that the customer balance is too low for the purchase whose viability the credit check was evaluating. However, the customer still has $1,000.00 in the two accounts and should have qualified.

This is an example of the dirty read phenomenon.

The condition that 2PL assures is serializability. When serializability is in force for nontemporal transactions, the effect of these concurrently running transactions is the same as if the transactions ran one after the other in series.

The two possibilities are as follows:

  1. Tx1 runs.
  2. Tx2 runs.

    In this scenario, Tx2 only sees state 3, so the customer passes the credit check.

  1. Tx2 runs.
  2. Tx1 runs.

    In this scenario, Tx2 only sees state 1, so the customer passes the credit check.

Either scenario can occur, even in a distributed system, provided the order is the same everywhere, and both scenarios leave the database in a consistent state. Serializability assures only consistent states for the database, not a particular ordering of transactions.
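The two serial orders can be checked with a small simulation. This toy Python version of the example above shows that both orders yield the same $1,000.00 total for Tx2, whereas only a non-serializable interleaving could observe the $600.00 intermediate total of state 2:

```python
def tx1(acct):
    # Tx1: move $400 from checking to savings (state 1 -> 2 -> 3).
    acct["checking"] -= 400
    acct["savings"] += 400

def tx2(acct):
    # Tx2: credit check reads both balances and totals them.
    return acct["checking"] + acct["savings"]

results = {}
for order in ("tx1_first", "tx2_first"):
    acct = {"checking": 900, "savings": 100}  # state 1
    if order == "tx1_first":
        tx1(acct)
        results[order] = tx2(acct)  # Tx2 sees state 3
    else:
        results[order] = tx2(acct)  # Tx2 sees state 1
        tx1(acct)

print(results)  # both serial orders total $1000, so the credit check passes
```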

Two-phase locking of database objects is sufficient, but not necessary, to make sure of serializability of nontemporal transactions.

The term serializable was originally a synonym for the ACID property known as Isolation (see ACID Properties of Transactions and SET SESSION TRANSACTION ISOLATION LEVEL), but the usage of the term isolation in the ANSI SQL standard has been broadened to include the concept of isolation levels, and is no longer synonymous with serializable.

Serializability describes a correct transaction schedule, meaning a schedule whose effect on the database is the same as that of an arbitrary serial schedule.

Changing Transaction Isolation Level for Read-Only Operations

You may be willing to give up a level of transaction isolation in return for better performance. This makes no sense for operations that write data, but permitting dirty read operations can make sense, particularly if you are only interested in gaining a general impression of an aspect of the data rather than getting consistent, reliable, repeatable results.

Examine the overall qualitative workload of the session before deciding whether to use ACCESS-level locking for read-only operations. For example, consider a session in which a MultiLoad IMPORT job is running. Because of the way MultiLoad updates table rows during its application phase (see Teradata® MultiLoad Reference, B035-2409 for details), using ACCESS locks to query the target table of the MultiLoad job during an application phase can produce results that do not provide a reasonable impression of the table data.

Vantage provides methods for permitting dirty reads at two different levels: the individual request and the session.

Level                Method
Individual request   LOCKING request modifier
Session              SET SESSION TRANSACTION ISOLATION LEVEL statement

Retrievals with ACCESS Lock

Because an ACCESS lock is compatible with all locking severities except EXCLUSIVE, a user requesting an ACCESS lock may be allowed to read an object on which a WRITE lock is held, a situation called a dirty read. Data can therefore be retrieved by an application holding an ACCESS lock while that same data is simultaneously being modified by an application holding a WRITE lock, so any query that places an ACCESS lock can return incorrect or inconsistent results.

For example, assume that a SELECT request that uses a secondary index constraint is submitted with a LOCKING FOR ACCESS phrase.

If the ACCESS lock is granted on an object being held for WRITE by another user, the column value can change between the time the secondary index subtable key is located and the time the data row is retrieved (such a change is possible because a satisfied SELECT index constraint is not always double-checked against the base data row). This type of inconsistency may occur even if the data is changed only momentarily by a transaction that is later backed out.

The normal policy for ACCESS-lock read operations is that Vantage guarantees to return to the requestor all rows that were not being updated at the time of the read. Rows that were being updated may be returned, possibly in an inconsistent state, or may not be returned at all.
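The compatibility rule stated above, that ACCESS is compatible with every severity except EXCLUSIVE, can be sketched as a small matrix check. The full matrix below follows conventional shared/exclusive lock semantics and is an illustrative assumption, not an exhaustive statement of Vantage's locking rules:

```python
# Pairs are stored with the alphabetically first severity first.
# Only the ACCESS row is taken from the rule stated in the text;
# the rest is conventional shared/exclusive semantics (an assumption).
COMPATIBLE = {
    ("ACCESS", "ACCESS"): True,
    ("ACCESS", "EXCLUSIVE"): False,
    ("ACCESS", "READ"): True,
    ("ACCESS", "WRITE"): True,
    ("EXCLUSIVE", "EXCLUSIVE"): False,
    ("EXCLUSIVE", "READ"): False,
    ("EXCLUSIVE", "WRITE"): False,
    ("READ", "READ"): True,
    ("READ", "WRITE"): False,
    ("WRITE", "WRITE"): False,
}

def compatible(requested, held):
    # Compatibility is symmetric, so normalize the pair before lookup.
    a, b = sorted((requested, held))
    return COMPATIBLE[(a, b)]

print(compatible("ACCESS", "WRITE"))      # True: the dirty-read situation
print(compatible("ACCESS", "EXCLUSIVE"))  # False
```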

Considerations for Specifying LOCKING FOR ACCESS

A READ lock is typically placed on an object for a SELECT operation, which causes the request to be queued if the object is already locked for WRITE.

If an ad hoc query has no concern for data consistency, the LOCKING request modifier can be used to override the default READ lock with an ACCESS lock. For example:

LOCKING TABLE tablename FOR ACCESS
SELECT ...
FROM tablename ...;

The effect of LOCKING FOR ACCESS is that of reading while writing, so dirty reads can occur with this lock. The best approach to specifying ACCESS locks is to use these locks only when you are interested in a broad, statistical snapshot of the data in question, not when you require precise results.

ACCESS locking can cause incorrect or inconsistent data to be returned to a requestor, as detailed in the following points:
  • A SELECT with an ACCESS lock can retrieve data from the target object even when another request is modifying that same object.

    Therefore, results from a request that applies an ACCESS lock can be inconsistent.

  • The possibility of an inconsistent return is especially high when the request applying the ACCESS lock uses a secondary index value in a conditional expression.

    If the ACCESS lock is granted on an object being held for WRITE, the constraint value can change between the time the secondary index subtable is located and the time the data row is retrieved.

    Such a change is possible because a satisfied SELECT index constraint is not always double-checked against the base data row.

The LOCKING ROW request modifier cannot be used to lock multiple row hashes. If LOCKING ROW FOR ACCESS is specified with multiple row hashes, the declaration implicitly converts to LOCKING TABLE FOR ACCESS.

Example: Using LOCKING Request Modifier

The possibility of an inconsistent return is especially high when an ACCESS request uses a secondary index value in a conditional expression, because satisfied index constraints are not always rechecked against the retrieved data row.

For example, assuming that qualify_accnt is defined as a secondary index, the following request can return the result that follows the request text:

LOCKING TABLE accnt_rec FOR ACCESS
SELECT accnt_no, qualify_accnt
FROM accnt_rec
WHERE qualify_accnt = 1587;
Accnt_No  Qualify_Accnt
--------  ------------
    1761          4214

The value 1587 was found in the secondary index subtable, and the corresponding data row was selected and returned. However, the data for account 1761 had been changed by the other user while this selection was in progress.
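The anomaly in this result can be mimicked with a toy in-memory table and secondary index: the index lookup succeeds on the old value, but the row changes before it is fetched, and the constraint is not rechecked. This is purely illustrative Python, not a model of how Vantage stores index subtables:

```python
# Toy model: a secondary index maps a column value to a row id.
# Under an ACCESS lock, the index hit is not rechecked against the row.
rows = {1761: {"accnt_no": 1761, "qualify_accnt": 1587}}
index = {1587: 1761}  # secondary index subtable snapshot

row_id = index[1587]                  # step 1: index lookup satisfies the constraint
rows[row_id]["qualify_accnt"] = 4214  # concurrent WRITE changes the row mid-request
result = rows[row_id]                 # step 2: fetch without rechecking the constraint

print(result)  # {'accnt_no': 1761, 'qualify_accnt': 4214}
```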

Returns such as this are possible even if the data is changed or deleted only momentarily by a transaction that is subsequently aborted.