Database Locks Defined
A lock is a mechanism, typically implemented as a form of semaphore, that relational database management systems use to manage concurrent access to database objects by interleaved transactions running in multiple parallel sessions. Among the information contained in a database lock are the identity of the database object being locked, the identity of the transaction holding the lock, and the lock's level and severity.
You can think of the level and severity of a database lock as guarantees made to the transaction, assuring that the objects requested to be locked are isolated to the desired level from changes made by other concurrently running transactions on those objects.
Database locks can be placed on an entire table, on individual row partitions of a row-partitioned table, on individual rowhash values within individual row partitions, or on individual rowhash values.
Two-Phase Locking Defined
A two-phase locking (2PL) protocol never requests an additional lock for a transaction after any of its held locks has been released. This locking protocol is the foundation of serializability. The protocol divides a transaction's lock handling into two phases:
- The growing phase, during which locks on database objects are acquired.
- The shrinking phase, during which the previously acquired locks are released.
This is called the Two-Phase Rule.
In practice, the shrinking phase occurs at the point that the transaction commits or finishes rolling back; at this point, all the database locks are released.
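The Two-Phase Rule can be illustrated with a minimal Python sketch. This is not Vantage internals; the class and method names are hypothetical, chosen only to show a transaction that may acquire locks freely until it releases one, after which any further acquisition violates the rule.

```python
# Illustrative sketch of the Two-Phase Rule (not Teradata internals):
# a transaction may acquire locks only while it has never released one.

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()          # locks currently held
        self.shrinking = False     # True once any lock has been released

    def lock(self, obj):
        if self.shrinking:
            raise RuntimeError("Two-Phase Rule violation: "
                               "cannot acquire a lock after releasing one")
        self.held.add(obj)

    def unlock(self, obj):
        self.held.discard(obj)
        self.shrinking = True      # the growing phase is over

    def commit(self):
        # In practice, all locks are released at commit or rollback.
        self.held.clear()
        self.shrinking = True

tx = TwoPhaseTransaction("Tx1")
tx.lock("checking")
tx.lock("savings")     # still in the growing phase: allowed
tx.unlock("checking")  # shrinking phase begins
try:
    tx.lock("loans")   # violates the Two-Phase Rule
except RuntimeError as e:
    print(e)
```

Because real systems release all locks at commit or rollback, the shrinking phase in practice collapses to a single point at transaction end.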
Two-phase locking prevents the following concurrency problems:

- The lost update problem.
- The uncommitted dependency problem.
- The inconsistent analysis problem.
- The dirty read problem.

Locking also introduces the following problems:

- The deadlock problem (see Deadlock and Proxy Locks).
- Increased system overhead to administer locking.
- Decreased concurrency.
The impact of the lower concurrency that locks introduce is reduced greatly in a system like Vantage that supports multiple levels of locking granularity.
Consult any standard transaction processing textbook for more information about these concurrency problems.
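The first of these problems, the lost update, can be shown with a short Python sketch of one bad interleaving. The account and amounts are hypothetical; the point is that two unlocked transactions each read the same starting balance and the second write silently overwrites the first.

```python
# Hypothetical sketch of the lost update problem with no locking:
# two transactions read the same balance, then both write back stale values.

balance = {"acct": 100}

# One possible interleaved schedule without locks:
t1_read = balance["acct"]        # Tx1 reads 100
t2_read = balance["acct"]        # Tx2 reads 100 (before Tx1 writes)
balance["acct"] = t1_read + 50   # Tx1 deposits 50 -> balance is 150
balance["acct"] = t2_read - 30   # Tx2 withdraws 30 from its stale read -> 70
# Tx1's deposit is lost: the serially correct final balance would be 120.
print(balance["acct"])  # 70, not 120
```

Under two-phase locking, Tx2's read would block until Tx1 released its lock at commit, so Tx2 would see 150 and write 120.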
Serializability Defined
The property of concurrent database accesses by transactions such that any arbitrary serial running of those transactions preserves the integrity of the database is called serializability. In other words, in a set of transactions that run concurrently, each transaction T has the impression that the others ran before or after T, but not at the same time (paraphrased from Gray and Reuter, 1993).
Serializability violations can occur for DML operations on temporal tables. See Teradata Vantage™ - Temporal Table Support, B035-1182 for details and workarounds.
Vantage guarantees serializability for nontemporal transactions if the current isolation level for the session is SERIALIZABLE (see ACID Properties of Transactions and the SET SESSION CHARACTERISTICS statement in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144 for details).
For example, suppose table A is a checking accounts table and table B is a savings accounts table. One transaction, Tx1, needs to move $400.00 from the checking account of a bank customer to the savings account of the same customer. Another concurrently running transaction, Tx2, performs a credit check on the same customer.
State | Checking Account Amount | Savings Account Amount |
---|---|---|
1 | $900.00 | $100.00 |
2 | $500.00 | $100.00 |
3 | $500.00 | $500.00 |
- State 1 depicts the initial state of the accounts.
- State 2 depicts an intermediate condition.
Tx1 has withdrawn $400.00 from the checking account, but has not yet deposited the funds in the savings account.
- State 3 depicts the final state of the accounts.
Tx1 has deposited the $400.00 withdrawn from the checking account into the savings account.
Without two-phase locking, Tx2 can read the two accounts at state 2 and conclude that the customer's combined balance of $600.00 is too low to justify the purchase that the credit check is evaluating. In reality, the customer still has $1,000.00 in the two accounts and should have qualified.
This is an example of the dirty read phenomenon.
Two-phase locking ensures that the transactions are serializable. When serializability is in force for nontemporal transactions, the effect of concurrently running transactions is the same as if the transactions had run one after the other in series.
The two possible serial orders are as follows:

- Tx1 runs, then Tx2 runs. In this scenario, Tx2 sees only state 3, so the customer passes the credit check.
- Tx2 runs, then Tx1 runs. In this scenario, Tx2 sees only state 1, so the customer passes the credit check.
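The account example can be sketched in a few lines of Python. The function names are illustrative, not Teradata APIs; the sketch checks that both serial orders pass the credit check, while observing the intermediate state 2 wrongly rejects the customer.

```python
# Sketch of the credit-check example (names are illustrative, not Teradata APIs).
checking, savings = 900.00, 100.00

def transfer(c, s, amount):
    """Tx1: move funds from checking to savings (two separate writes)."""
    return c - amount, s + amount

def credit_check(c, s, threshold=1000.00):
    """Tx2: approve if the combined balance meets the threshold."""
    return c + s >= threshold

# Serial order Tx1 then Tx2: Tx2 sees state 3.
c3, s3 = transfer(checking, savings, 400.00)
assert credit_check(c3, s3)             # passes: 500 + 500 = 1000

# Serial order Tx2 then Tx1: Tx2 sees state 1.
assert credit_check(checking, savings)  # passes: 900 + 100 = 1000

# Without 2PL, Tx2 can observe state 2, after the withdrawal
# but before the deposit, and wrongly reject the customer.
c2, s2 = checking - 400.00, savings
assert not credit_check(c2, s2)         # 500 + 100 = 600: dirty read rejects
```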
It makes no difference which scenario takes place: both result in a consistent state for the database, provided (even in a distributed system) that the order is the same everywhere. Serializability ensures consistent states for the database, not a specific ordering of transaction execution.
Two-phase locking of database objects is sufficient, but not necessary, to ensure serializability of nontemporal transactions.
The term serializable was originally a synonym for the ACID property known as Isolation (see ACID Properties of Transactions and the SET SESSION CHARACTERISTICS statement in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144), but the usage of the term isolation in the ANSI SQL standard has been broadened to include the concept of isolation levels and is no longer synonymous with serializable.
Serializability describes a transaction schedule, meaning a schedule whose effect on the database is the same as that of an arbitrary serial schedule.
Levels of Isolation Defined
Isolation level is a concept related to concurrently running transactions and how well their updates are protected from one another as a system processes their respective transactions. The ANSI SQL standard formalizes four isolation levels for transactions. To be precise, the standard defines SERIALIZABLE and three weaker, non-serializable isolation levels that permit certain prohibited operation sequences to occur.
The standard collectively refers to these prohibited operation sequences as phenomena. Note that the ANSI isolation levels are defined in terms of these phenomena, not in terms of locking, even though all commercial RDBMSs implement transaction isolation using locks.
The defined phenomena are dirty read, non-repeatable read, and phantom read. The four isolation levels are:

- Read Uncommitted
- Read Committed
- Repeatable Read
- Serializable
The following table shows the relationship between the different isolation levels and the different kinds of prohibited read phenomena.
Isolation Level | Dirty Read | Non-Repeatable Read | Phantom Read |
---|---|---|---|
Read Uncommitted | Possible | Possible | Possible |
Read Committed | Not possible | Possible | Possible |
Repeatable Read | Not possible | Not possible | Possible |
Serializable | Not possible | Not possible | Not possible |
Vantage does not support the isolation levels READ COMMITTED and REPEATABLE READ.
The READ UNCOMMITTED isolation level is implemented using an ACCESS level lock (see the ACCESS entry in the Lock Severity table in Database Locking Levels and Severities and the information about SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144).
Vantage supports load isolation, which uses row versioning to permit concurrent reads of committed rows even while the underlying table is being modified. Load isolation is compatible with the Read Committed isolation level, but is not limited to a regular transaction boundary: it persists until the load is committed, and it can occur only on load-isolated tables.
A non-repeatable read or phantom read can occur, because a query that uses the LOAD COMMITTED locking modifier acquires only an ACCESS lock and thus allows a concurrent load-isolated write. If a load commit happens in a concurrent session, then the reader transaction can see modified rows if the same query is issued again even within the same transaction. This is a non-repeatable read phenomenon.
A phantom read phenomenon can also occur, wherein repeated execution of the same query with search conditions after a concurrent load commit returns different rows due to newly committed data.
For information about load isolation, see Load Isolation.
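The row-versioning idea behind load isolation can be sketched in Python. This is a hypothetical model, not Vantage internals: readers see only the last committed version of the table, and a repeated read after the load commits sees the newly committed rows, which is exactly the non-repeatable/phantom read behavior described above.

```python
# Hypothetical sketch of row versioning for load isolation (not Vantage internals):
# readers see only the last committed version while a load writes a new one.

class LoadIsolatedTable:
    def __init__(self, rows):
        self.committed = list(rows)   # version visible to LOAD COMMITTED readers
        self.loading = None           # in-flight, uncommitted load

    def begin_load(self, new_rows):
        # The load builds a new version without disturbing committed rows.
        self.loading = self.committed + list(new_rows)

    def commit_load(self):
        self.committed = self.loading
        self.loading = None

    def read_committed(self):
        # Concurrent readers never block on the in-flight load.
        return list(self.committed)

t = LoadIsolatedTable([("A", 1)])
t.begin_load([("B", 2)])
first = t.read_committed()    # read during the load: only committed rows
t.commit_load()
second = t.read_committed()   # same query after the load commit: phantom row
print(first, second)
```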
Changing the Transaction Isolation Level for Read-Only Operations
You may be willing to give up a level of transaction isolation in return for better performance. This trade-off makes no sense for operations that write data, but it can make sense to permit dirty read operations, particularly if you are interested only in gaining a general impression of an aspect of the data rather than obtaining consistent, reliable, repeatable results.
This is an important consideration. The overall qualitative workload of the session must be examined carefully before making the determination of whether to use ACCESS-level locking for read-only operations. For example, consider a session in which a MultiLoad IMPORT job is running. Because of the way MultiLoad updates table rows during its application phase (see Teradata® MultiLoad Reference, B035-2409 for details), using ACCESS locks to query the target table of the MultiLoad job during an application phase can produce results too inaccurate to provide even a reasonable impression of the table data.
Vantage provides methods for relaxing the default read locking severity at two different levels: the individual request and the session.
Default Read-Only Locking Severity Level | Method |
---|---|
Individual request | LOCKING request modifier. See Teradata Vantage™ - SQL Data Manipulation Language, B035-1146 for details of the syntax and usage of this request modifier. |
Session | SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL statement. See Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144 for details of the syntax and usage of this statement. |
The global application of ACCESS locking for read operations when SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL is set to READ UNCOMMITTED depends on the setting of the DBS Control field AccessLockForUncomRead.
When the field is set FALSE, SELECT operations within INSERT, DELETE, MERGE, and UPDATE requests set READ locks; when the field is set TRUE, the same SELECT operations set ACCESS locks. For more information, see Teradata Vantage™ - Database Utilities, B035-1102 and the information about SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144.
Retrievals with an ACCESS Lock
Because an ACCESS lock is compatible with all locking severities except EXCLUSIVE, a user requesting an ACCESS lock may be allowed to read an object on which a WRITE lock is being held, a situation that is referred to as a dirty read. This means that data could be retrieved by an application holding an ACCESS lock while that same data is simultaneously being modified by an application holding a WRITE lock. Therefore, any query that places an ACCESS lock can return incorrect or inconsistent results.
For example, assume that a SELECT request that uses a secondary index constraint is submitted with a LOCKING FOR ACCESS phrase.
If the ACCESS lock is granted on an object being held for WRITE by another user, the column value may change between the time the secondary index subtable key is located and the time the data row is retrieved (such a change is possible because a satisfied SELECT index constraint is not always double-checked against the base data row). This type of inconsistency may occur even if the data is changed only momentarily by a transaction that is later backed out.
The normal policy for ACCESS lock read operations is that Vantage guarantees to return to the requestor all rows that were not being updated at the time of the read. Rows that were being updated may be returned, possibly in an inconsistent state, or not returned at all.
Considerations for Specifying LOCKING FOR ACCESS
A READ lock is normally placed on an object for a SELECT operation, which causes the request to be queued if the object is already locked for WRITE.
If an ad hoc query has no concern for data consistency, the LOCKING request modifier can be used to override the default READ lock with an ACCESS lock. For example:
LOCKING TABLE tablename FOR ACCESS
SELECT ... FROM tablename ...;
The effect of LOCKING FOR ACCESS is that of reading while writing, so dirty reads can occur with this lock. Specify ACCESS locks only when you are interested in a broad, statistical snapshot of the data in question, not when you require precise results. On load-isolated tables, however, LOCKING FOR LOAD COMMITTED may be used to obtain committed data even while concurrent isolated writes occur on the table.
- A SELECT with an ACCESS lock can retrieve data from the target object even when another request is modifying that same object. Therefore, results from a request that applies an ACCESS lock can be inconsistent.
- The possibility of an inconsistent return is especially high when the request applying the ACCESS lock uses a secondary index value in a conditional expression. If the ACCESS lock is granted on an object being held for WRITE, the constraint value could change between the time the secondary index subtable is located and the time the data row is retrieved. Such a change is possible because a satisfied SELECT index constraint is not always double-checked against the base data row.
The LOCKING ROW request modifier cannot be used to lock multiple row hashes. If LOCKING ROW FOR ACCESS is specified with multiple row hashes, the specification is implicitly converted to LOCKING TABLE FOR ACCESS.
Using the LOCKING Request Modifier: An Example
The possibility of an inconsistent return is especially high when an ACCESS request uses a secondary index value in a conditional expression, because satisfied index constraints are not always rechecked against the retrieved data row.
For example, assuming that qualify_accnt is defined as a secondary index, the following request could return the result that follows the request text:
LOCKING TABLE accnt_rec FOR ACCESS
SELECT accnt_no, qualify_accnt
FROM accnt_rec
WHERE qualify_accnt = 1587;
Result:
Accnt_No  Qualify_Accnt
--------  -------------
    1761           4214
The value 1587 was found in the secondary index subtable, and the corresponding data row was selected and returned. However, the data for account 1761 had been changed by the other user while this selection was in progress.
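This two-step anomaly (index lookup, then unchecked row fetch) can be modeled with a small Python sketch. The dictionaries standing in for the secondary index subtable and the base table are illustrative only, not Teradata structures.

```python
# Sketch of the stale secondary-index read (illustrative, not Teradata internals).
# The index lookup and the row fetch are two steps; under an ACCESS lock a
# concurrent WRITE can change the row between them, and the index constraint
# is not rechecked against the fetched row.

rows = {7: {"accnt_no": 1761, "qualify_accnt": 1587}}   # rowid -> base row
index = {1587: 7}                                       # qualify_accnt -> rowid

# Step 1: the reader resolves the constraint via the secondary index.
rowid = index[1587]

# A concurrent writer (holding a WRITE lock) updates the row in between.
rows[rowid]["qualify_accnt"] = 4214

# Step 2: the reader fetches the row without rechecking qualify_accnt = 1587.
result = rows[rowid]
print(result)  # {'accnt_no': 1761, 'qualify_accnt': 4214}
```

The fetched row no longer satisfies the original search condition, which is exactly the inconsistent return shown in the example result above.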
Returns such as this are possible even if the data is changed or deleted only momentarily by a transaction that is subsequently aborted. For load-isolated tables, you can avoid this type of inconsistency by using the LOCKING FOR LOAD COMMITTED request modifier. Refer to The LOCKING FOR LOAD COMMITTED Request Modifier and Teradata Vantage™ - SQL Data Manipulation Language, B035-1146.