Database Locks Defined
A lock is a mechanism, usually implemented as a form of semaphore, that relational database management systems use to manage concurrent access to database objects by interleaved transactions running in multiple parallel sessions. A database lock records, among other things, the identity of the object it locks, the identity of the transaction holding it, and its level and severity.
You can think of the level and severity of a database lock as guarantees made to the transaction, assuring it that the objects for which it has requested locks are isolated to the desired level from changes made by other concurrently running transactions on those objects.
Database locks can be placed on an entire table, on individual row partitions of a row-partitioned table, on individual row hash values, or on individual row hash values within individual row partitions.
Two-Phase Locking Defined
A locking protocol is defined as two-phase if it does not request additional locks for a transaction after it releases the locks it already holds. This locking protocol is the foundation of serializability.
The phases are as follows:
- The growing phase, during which locks on database objects are acquired.
- The shrinking phase, during which the previously acquired locks are released.
This is sometimes called the Two-Phase Rule.
In practice, the shrinking phase occurs at the point that the transaction commits or finishes rolling back; at this point, all the database locks are released.
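The growing/shrinking discipline can be sketched in a few lines of Python. This is an illustration of the protocol only, not of Teradata internals; the class and method names (`TwoPhaseTransaction`, `TwoPhaseError`) are hypothetical, chosen for this example.

```python
class TwoPhaseError(Exception):
    """Raised when a transaction requests a lock after releasing one."""

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()       # objects currently locked
        self.shrinking = False  # True once any lock has been released

    def lock(self, obj):
        # Growing phase: acquiring a lock is legal only before the
        # first release (the Two-Phase Rule).
        if self.shrinking:
            raise TwoPhaseError(
                f"{self.name}: cannot acquire {obj!r} after releasing locks")
        self.held.add(obj)

    def commit(self):
        # Shrinking phase: in practice all locks are released at once,
        # at commit or at the end of rollback.
        self.shrinking = True
        released, self.held = self.held, set()
        return released

tx = TwoPhaseTransaction("Tx1")
tx.lock("checking")
tx.lock("savings")
tx.commit()
try:
    tx.lock("checking")   # violates the two-phase rule
except TwoPhaseError:
    pass                  # the protocol rejects the late acquisition
```

The point of the sketch is the single boolean state transition: once `shrinking` is set, no further `lock` call can succeed, which is exactly the two-phase property.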
Host Utility (HUT) locks are the locks set by the Archive/Recovery utility and the various commands related to it. Unlike database locks, which the system releases automatically, HUT locks are released only when you explicitly specify the RELEASE LOCK option with your Archive/Recovery-related utility command or if your job successfully submits a RELEASE LOCK utility command after the Archive/Recovery-related operation has completed. For more information about HUT locks, see Teradata Archive/Recovery Utility Reference.
The principal concurrency problems that 2PL prevents are the classic problems of transaction processing, usually named as follows:
- The lost update problem.
- The uncommitted dependency problem.
- The inconsistent analysis problem.
- The dirty read problem.
- The deadlock problem (see Deadlock and Proxy Locks).
The following additional problems can also occur:
- Increased system overhead to administer locking.
- Decreased concurrency.
The impact of the lower concurrency that locks introduce is reduced greatly in a system like Teradata Database that supports multiple levels of locking granularity.
Consult any standard transaction processing textbook for more information about these concurrency problems.
Serializability is the property of concurrent database accesses by transactions such that any arbitrary serial execution of those transactions preserves the integrity of the database. The following definition of serializability is equivalent: although a given set of transactions executes concurrently, it appears to each transaction T in the set that the other member transactions executed either before T or after T, but not both (paraphrased slightly from Gray and Reuter, 1993).
Serializability violations can occur for some DML operations on temporal tables. See Temporal Table Support for details and workarounds.
Teradata Database ensures serializability for nontemporal transactions as long as the current isolation level for the session is SERIALIZABLE (see ACID Properties of Transactions and the SET SESSION CHARACTERISTICS statement in SQL Data Definition Language for details).
For example, suppose table_A is a checking accounts table and table_B is a savings accounts table. Suppose one transaction, Tx1, needs to move 400.00 USD from the checking account of a bank customer to the savings account of the same customer, and that another concurrently running transaction, Tx2, performs a credit check on the same customer.
For the sake of illustration, assume the checking and savings account amounts pass through the following three states:
- State 1 depicts the initial state of the accounts.
- State 2 depicts an intermediate condition.
Tx1 has withdrawn $400.00 from the checking account, but has not yet deposited the funds in the savings account.
- State 3 depicts the final state of the accounts.
Tx1 has deposited the $400.00 withdrawn from the checking account into the savings account.
Without two-phase locking, Tx2 can read the two accounts at state 2 and conclude that the customer's balance is too low to justify the purchase for which the credit check was run. In reality, the customer still has $1,000.00 in the two accounts and should have qualified.
This is an example of the dirty read phenomenon.
The condition that 2PL ensures is serializability. When serializability is in force for nontemporal transactions, the effect of these concurrently running transactions is the same as what would occur if they ran one after the other in series.
The two possibilities are as follows:
- Tx1 runs to completion before Tx2.
In this scenario, Tx2 sees only state 3, so the customer passes the credit check.
- Tx2 runs to completion before Tx1.
In this scenario, Tx2 sees only state 1, so the customer passes the credit check.
It makes no difference which scenario actually takes place, even in a distributed system, as long as the order is the same everywhere and both result in a consistent state for the database. The important thing to understand from this is that serializability ensures only consistent states for the database, not some particular ordering of transaction execution.
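The two serial orders, and the interleaving that 2PL rules out, can be sketched in Python. The initial 600/400 split of the customer's $1,000.00 across the two accounts is an assumption made for illustration; the helper function names are hypothetical.

```python
def tx1_withdraw(accounts):
    accounts["checking"] -= 400   # state 2: withdrawn, not yet deposited

def tx1_deposit(accounts):
    accounts["savings"] += 400    # state 3: transfer complete

def tx2_credit_check(accounts):
    # Tx2 sums both balances for the credit decision.
    return accounts["checking"] + accounts["savings"]

# Serial order 1: Tx1 then Tx2 -- Tx2 sees state 3, total 1000.
a = {"checking": 600, "savings": 400}
tx1_withdraw(a); tx1_deposit(a)
assert tx2_credit_check(a) == 1000

# Serial order 2: Tx2 then Tx1 -- Tx2 sees state 1, total 1000.
a = {"checking": 600, "savings": 400}
assert tx2_credit_check(a) == 1000
tx1_withdraw(a); tx1_deposit(a)

# Interleaved without 2PL: Tx2 reads at state 2 and sees only 600,
# the dirty read that would wrongly fail the credit check.
a = {"checking": 600, "savings": 400}
tx1_withdraw(a)
assert tx2_credit_check(a) == 600
tx1_deposit(a)
```

Either serial order yields the correct total of 1000; only the interleaved schedule exposes the transient state 2.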
Two-phase locking of database objects is sufficient, but not necessary, to ensure serializability of nontemporal transactions.
The term serializable was originally a synonym for the ACID property known as Isolation (see ACID Properties of Transactions and the SET SESSION CHARACTERISTICS statement in SQL Data Definition Language), but the usage of the term isolation in the ANSI SQL standard has been broadened to include the concept of isolation levels, so it is no longer synonymous with serializable.
Serializability describes a correct transaction schedule, meaning a schedule whose effect on the database is the same as that of some arbitrary serial schedule.
Levels of Isolation Defined
Isolation level is a concept related to concurrently running transactions and how well their updates are protected from one another as a system processes their respective transactions. The ANSI SQL standard formalizes four isolation levels for transactions. To be precise, the standard defines SERIALIZABLE and three weaker, non-serializable isolation levels that permit certain otherwise prohibited operation sequences to occur.
The standard collectively refers to these prohibited operation sequences as phenomena. Note that the ANSI isolation levels are defined in terms of these phenomena, not in terms of locking, even though most commercial RDBMSs implement transaction isolation using locks.
The defined phenomena are dirty read, non-repeatable read, and phantom read.
The non-serializable isolation levels ANSI defines are as follows:
- Read Uncommitted
- Read Committed
- Repeatable Read
The following table shows the relationship between the different isolation levels and the different kinds of prohibited read phenomena.
| Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible |
| Read Committed | Not possible | Possible | Possible |
| Repeatable Read | Not possible | Not possible | Possible |
| Serializable | Not possible | Not possible | Not possible |
Teradata Database does not support the isolation levels READ COMMITTED and REPEATABLE READ.
The READ UNCOMMITTED isolation level is implemented using an ACCESS level lock (see the "ACCESS" entry in the Lock Severity table in Teradata Database Locking Levels and Severities and “SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL” in SQL Data Definition Language).
Teradata Database does support a form of isolation called load isolation, which makes use of row versioning: Teradata permits concurrent reads of committed rows even while the underlying table is being modified. Load isolation is comparable to the Read Committed isolation level, but it is not bounded by a regular transaction boundary; it persists until the load is committed, and it can occur only on load-isolated tables. A non-repeatable read or a phantom read can occur because a query that uses the LOAD COMMITTED locking modifier acquires only an ACCESS lock and thus allows a concurrent load-isolated write to proceed. If a load commits in a concurrent session, the reader transaction sees the modified rows when the same query is issued again, even within the same transaction; this is the non-repeatable read phenomenon. A phantom read can also occur: repeated execution of the same query with search conditions after a concurrent load commit returns different rows because of the newly committed data.
For information about load isolation, see Load Isolation.
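The row-versioning idea behind load isolation can be sketched as follows. This is a toy model, not Teradata's implementation; the `VersionedTable` name and its two-dictionary structure are assumptions made for this example.

```python
class VersionedTable:
    def __init__(self):
        self.committed = {}   # row id -> last committed value
        self.loading = {}     # row id -> uncommitted value from active load

    def load_write(self, row_id, value):
        # Isolated write: readers of committed data are unaffected.
        self.loading[row_id] = value

    def load_commit(self):
        # The point at which readers begin to see the new data.
        self.committed.update(self.loading)
        self.loading.clear()

    def read_committed(self, row_id):
        # A LOAD COMMITTED-style read: returns only committed versions.
        return self.committed.get(row_id)

t = VersionedTable()
t.committed = {1: "old"}
t.load_write(1, "new")
before = t.read_committed(1)   # concurrent read during the load: "old"
t.load_commit()
after = t.read_committed(1)    # same read after the load commits: "new"
```

Reissuing the same read before and after the load commit returns different values, which is the non-repeatable read behavior described above.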
Changing the Transaction Isolation Level for Read-Only Operations
Sometimes you might be willing to give up a level of transaction isolation in return for better performance. While this makes no sense for operations that write data, it can sometimes make sense to permit dirty read operations, particularly if you are only interested in gaining a general impression of some aspect of the data rather than obtaining consistent, reliable, repeatable results.
This is a very important consideration, and it should not be taken lightly. The overall qualitative workload of the session must be examined carefully before making the determination of whether to use ACCESS-level locking for read-only operations or not. For example, consider a session in which a MultiLoad IMPORT job is running. Because of the way MultiLoad updates table rows during its application phase (see Teradata MultiLoad Reference for details), using ACCESS locks to query the target table of the MultiLoad job during an application phase can produce extremely inaccurate result sets. In this case, the results probably would not provide even a reasonable impression of the table data.
Teradata Database provides methods for permitting dirty reads at two different levels: the individual request and the session.
| To set the default read-only locking severity for this level… | Use this method… |
|---|---|
| individual request | LOCKING request modifier. See SQL Data Manipulation Language for details of the syntax and usage of this request modifier. |
| session | SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL statement. See SQL Data Definition Language for details of the syntax and usage of this statement. |
Note that the global application of ACCESS locking for read operations when SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL is set to READ UNCOMMITTED depends on the setting of the DBS Control field AccessLockForUncomRead. When the field is set FALSE, SELECT operations within INSERT, DELETE, MERGE, and UPDATE requests set READ locks; when it is set TRUE, the same SELECT operations set ACCESS locks. See Utilities and “SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL” in SQL Data Definition Language for details.
Retrievals With an ACCESS Lock
Because an ACCESS lock is compatible with all locking severities except EXCLUSIVE, a user requesting an ACCESS lock might be allowed to read an object on which a WRITE lock is being held, a situation that is referred to as a dirty read. This means that data could be retrieved by an application holding an ACCESS lock while that same data is simultaneously being modified by an application holding a WRITE lock. Therefore, any query that places an ACCESS lock can return incorrect or inconsistent results.
For example, assume that a SELECT request that uses a secondary index constraint is submitted with a LOCKING FOR ACCESS phrase.
If the ACCESS lock is granted on an object being held for WRITE by another user, the column value could change between the time the secondary index subtable key is located and the time the data row is retrieved (such a change is possible because a satisfied SELECT index constraint is not always double-checked against the base data row). This type of inconsistency might occur even if the data is changed only momentarily by a transaction that is later backed out.
The normal policy for ACCESS-lock read operations is that Teradata Database guarantees to return to the requestor all rows that were not being updated at the time of the read. Rows that were being updated may be returned, possibly in an inconsistent state, or not returned at all.
This policy generalizes to ACCESS-lock read operations on a geospatial NUSI as follows: if, while attempting an ACCESS-locked read on a geospatial index, Teradata Database discovers that the current state of the Hilbert R-tree for the index does not permit it to return all rows that are not being updated, it returns a retryable error to the requestor.
Considerations for Specifying LOCKING FOR ACCESS
A READ lock is normally placed on an object for a SELECT operation, which causes the request to be queued if the object is already locked for WRITE.
If an ad hoc query has no concern for data consistency, the LOCKING request modifier can be used to override the default READ lock with an ACCESS lock. For example:
LOCKING TABLE tablename FOR ACCESS
SELECT ... FROM tablename ...;
Be aware that the effect of LOCKING FOR ACCESS is that of reading while writing, so dirty reads can occur with this lock. The best approach to specifying ACCESS locks is to use them only when you are interested in a broad, statistical snapshot of the data in question, not when you require precise results. On load-isolated tables, however, LOCKING FOR LOAD COMMITTED may be used to obtain committed data even while concurrent isolated writes occur on the table.
ACCESS locking can result in incorrect or inconsistent data being returned to a requestor, as detailed in the following points:
- A SELECT with an ACCESS lock can retrieve data from the target object even when another request is modifying that same object.
Therefore, results from a request that applies an ACCESS lock can be inconsistent.
If this occurs while you are using an ACCESS lock to read data from a geospatial NUSI column and the current state of the Hilbert R-tree for that NUSI does not permit all of the unmodified rows to be returned to you, Teradata Database returns a retryable error to you.
- The possibility of an inconsistent return is especially high when the request applying the ACCESS lock uses a secondary index value in a conditional expression.
If the ACCESS lock is granted on an object being held for WRITE, the constraint value could change between the time the secondary index subtable is located and the time the data row is retrieved.
Such a change is possible because a satisfied SELECT index constraint is not always double-checked against the base data row.
The LOCKING ROW request modifier cannot be used to lock multiple row hashes. If LOCKING ROW FOR ACCESS is specified with multiple row hashes, the declaration implicitly converts to LOCKING TABLE FOR ACCESS.
Using the LOCKING Request Modifier: An Example
The possibility of an inconsistent return is especially high when an ACCESS request uses a secondary index value in a conditional expression, because satisfied index constraints are not always rechecked against the retrieved data row.
For example, assuming that qualify_accnt is defined as a secondary index, the following request could return the result that follows the request text:
LOCKING TABLE accnt_rec FOR ACCESS
SELECT accnt_no, qualify_accnt
FROM accnt_rec
WHERE qualify_accnt = 1587;

Accnt_No  Qualify_Accnt
--------  -------------
    1761           4214
In this case, the value 1587 was found in the secondary index subtable, and the corresponding data row was selected and returned. However, the data for account 1761 had been changed by the other user while this selection was in progress.
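The race can be sketched as follows. This is a toy model only: the two dictionaries stand in for the NUSI subtable and the base table, and the interleaving is written out explicitly to show why the returned row no longer satisfies the constraint.

```python
# NUSI subtable: qualify_accnt value -> accnt_no of the qualifying row.
index = {1587: 1761}
# Base table: accnt_no -> row data.
base = {1761: {"accnt_no": 1761, "qualify_accnt": 1587}}

# Step 1: the ACCESS-locked reader locates the subtable entry for
# qualify_accnt = 1587.
hit = index[1587]

# Step 2: a concurrent writer (holding a WRITE lock) changes the
# column value in the base row before the reader fetches it.
base[1761]["qualify_accnt"] = 4214

# Step 3: the reader fetches the base row by the accnt_no found in
# step 1, without rechecking the constraint against the row itself.
row = base[hit]
# row now reports qualify_accnt = 4214 even though the query asked
# for qualify_accnt = 1587 -- the inconsistent result shown above.
```

The stale result arises because the constraint is checked only against the index entry, not re-verified against the base row at fetch time.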
Returns such as this are possible even if the data is changed or deleted only momentarily by a transaction that is subsequently aborted. Note that for load-isolated tables, you can avoid such inconsistency by using the LOCKING FOR LOAD COMMITTED modifier. Refer to The LOCKING FOR LOAD COMMITTED Request Modifier and SQL Data Manipulation Language.