Usage Notes - Teradata Database

Uses of the SELECT Statement

SELECT is the most frequently used SQL statement. Use SELECT statements to specify any of the following:

The set of result data that is returned
The order in which result sets are returned
How the result sets should be grouped
Whether the result sets should be precisely determined or randomly sampled
The format in which the results set is reported

Specify fully qualified names in SELECT statements that reference objects in multiple databases and users. Name resolution problems can occur when databases and users referenced in a query contain tables or views with identical names, and those objects are not fully qualified. Name resolution problems can occur even if the identically named objects are not referenced in the query.

SELECT Subqueries

A subquery is a SELECT expression that is nested within another SQL statement or expression. You can specify SELECT expressions as subqueries in a main query, an outer query, or another subquery for these DML statements:

ABORT (see “ABORT”)
DELETE (see “DELETE”)
INSERT (see INSERT/INSERT … SELECT )
MERGE (see MERGE)
ROLLBACK (see “ROLLBACK”)
SELECT
UPDATE (see “UPDATE”)

The upsert form of UPDATE does not support subqueries as WHERE clause conditions. For more information about the upsert form of the UPDATE statement, see “UPDATE (Upsert Form)”.

The syntax rules require that if there are any subqueries in the statement, then any tables referenced in the SELECT must be specified in a FROM clause, either of the subquery itself or of some outer query that contains the subquery.

Correlated subqueries are another category of subquery. A subquery is correlated when it references columns of outer tables in an enclosing or containing, outer query. Correlated subqueries provide an implicit loop function within any standard SQL DML statement.

For details on correlated subqueries and their use, see Correlated Subqueries.

For simple examples of correlated subqueries, see Example: SELECT Statements With a Correlated Subquery.

Scalar subqueries are another category of subquery that can be coded either as a correlated subquery or as a noncorrelated subquery. A correlated scalar subquery returns a single value for each row of its correlated outer table set. A noncorrelated scalar subquery returns a single value to its containing query. For information about scalar subqueries, see Scalar Subqueries.

Locks and Concurrency

A DML SELECT sets a default READ lock on the tables or rows referenced in the SELECT statement, depending on session isolation level (refer to the table, below) and the locking level the Lock Manager imposes or the locking level you specify using the LOCKING modifier. See “LOCKING Request Modifier”. If the SELECT statement references a view, then the system places a READ lock on each of its underlying tables.

This topic does not apply to DDL statements such as CREATE JOIN INDEX and CREATE VIEW that nest SELECT subqueries in their object definition statements.

For outer SELECT statements and SELECT subqueries that are not nested within a DML statement that manipulates data using a DELETE, INSERT, MERGE, or UPDATE statement, the default locking severity is always READ whether the session transaction isolation level is SERIALIZABLE.

If the session transaction isolation level is READ UNCOMMITTED and the DBS Control field AccessLockForUncomRead is set to TRUE, then the default locking level for SELECT operations depends on depends on the factors explained in the following paragraph and table.

For SELECT subqueries that are nested within a DML statement that manipulates data using a DELETE, INSERT, MERGE, or UPDATE statement, the default locking severity is ACCESS when the DBS Control field AccessLockForUncomRead is set TRUE and the transaction isolation level for the session is READ UNCOMMITTED, as the following table indicates.

Transaction Isolation Level	DBS Control AccessLockForUncomRead Field Setting	Default Locking Severity for Outer SELECT and Ordinary SELECT Subquery Operations	Default Locking Severity for SELECT Operations Embedded Within DELETE, INSERT, MERGE, or UPDATE Statements
SERIALIZABLE	FALSE	READ	READ
SERIALIZABLE	TRUE		READ
READ UNCOMMITTED	FALSE		READ
READ UNCOMMITTED	TRUE		ACCESS

When a SELECT statement does not specify a LOCKING modifier, but a view it uses to access tables does, Teradata Database places the lock specified by the view and does not comply with the defaults described in the preceding table.

For example, suppose you create this view.

     CREATE VIEW access_view (a, b) AS
     LOCKING TABLE accntrec FOR ACCESS
     SELECT accntno, qualifyaccnt
     FROM accntrec
     WHERE qualifyaccnt = 1587;

This SELECT statement places an ACCESS lock on table accntrec regardless of the specified session isolation level and setting of the AccessLockForUncomRead field.

     SELECT a, b
     FROM access_view;

For details, see “SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL” in SQL Data Definition Language .

Derived Tables

A derived table is a transient table that is created dynamically from one or more tables by evaluating a query expression or table expression specified as a subquery within a SELECT statement. You must specify a correlation name for each derived table you create.

The semantics of derived tables and views are identical. For details, see “CREATE VIEW” in SQL Data Definition Language .

NoPI Tables and SELECT Statements

Because NoPI tables do not have a primary index, single-AMP primary index access is not available. The table must be accessed using a full-table scan, unless:

the table has a secondary or join index that the Optimizer can use for the access plan, or
column or row partition elimination can be applied for a column-partitioned table.

Secondary and join index access paths work the same for NoPI tables as for primary index tables.

SELECT Statements in Embedded SQL

The rules and restrictions are:

SELECT must be performed from within a Selection DECLARE CURSOR construct.
For updatable cursors opened for SELECT statements only, no ORDER BY, GROUP BY, HAVING, or WITH... BY clause is allowed.
The SELECT privilege is required on all tables specified in the FROM clause and in any subquery contained in the query specification, or on the database set containing the tables.
The number of columns in the select list must match the number of host variables.
Teradata Database assigns values to the host variables specified in the INTO clause in the order in which the host variables are specified. The system assigns values to the status parameters SQLSTATE and SQLCODE last.
The main host variable and the corresponding column in the returned data set must be of the same data type group, except that if the main host variable has an approximate type, then the temporary table column data type can have either an approximate numeric or exact numeric type.
If the temporary table is empty, then Teradata Database assigns these values to the status parameters:

Status Parameter	Assigned Value
SQLSTATE	‘02000’
SQLCODE	+100

Values are not assigned to the host variables specified in the INTO clause.

If exactly one row of data is returned, the values from the row are assigned to the corresponding host variables specified in the INTO clause.
If more than one row of data is returned, the system assigns these values to the status parameters:

Status Parameter	Assigned Value
SQLSTATE	‘21000’
SQLCODE	-811

Values are not assigned to the host variables specified in the INTO clause.

If an error occurs in assigning a value to a host variable, the system assigns one of these value sets to the status parameters:

Status Parameter	Assigned Value
SQLSTATE	‘22509’
SQLCODE	-303
SQLSTATE	‘22003’
SQLCODE	-304
SQLSTATE	‘22003’
SQLCODE	-413

Values are not assigned to the host variables specified in the INTO clause.

If a column value in the returned data is NULL and a corresponding indicator host variable is specified, the value -1 is assigned to the indicator host variable and no value is assigned to the main host variable.
If no corresponding indicator host variable is specified, then Teradata Database assigns these values to the status parameters:

Status Parameter	Assigned Value
SQLSTATE	‘22002’
SQLCODE	-305

Values are not assigned to the host variables specified in the INTO clause.

If a column value in the returned data is NOT NULL and a corresponding indicator host variable is specified, the indicator host variable is set as follows:
- If the column and main host variable are of CHARACTER data type and the column value is longer than the main host variable, the indicator host variable is set to the length of the column value.
- In all other cases, the indicator variable is set to zero.
Status parameters have the following default values:

Status Parameter	Assigned Value
SQLSTATE	‘00000’
SQLCODE	0

Column values are set in the corresponding main host variables according to the rules for host variables.
SELECT … INTO cannot be performed as a dynamic request.

DEFAULT Function in SELECT Statements

The rules and restrictions are:

The DEFAULT function accepts a single expression that identifies a relation column by name. The function evaluates to a value equal to the current default value for the column. For cases where the default value of the column is specified as a current built-in system function, the DEFAULT function evaluates to the current value of system variables at the time the request is executed.
The resulting data type of the DEFAULT function is the data type of the constant or built-in function specified as the default unless the default is NULL. If the default is NULL, the resulting date type of the DEFAULT function is the same as the data type of the column or expression for which the default is being requested.
The DEFAULT function can be specified as DEFAULT or DEFAULT( column_name ) . When column name is not specified, the system derives the column based on context. If the column context cannot be derived, the system returns an error.
You can specify a DEFAULT function with a column name in the select list of a SELECT statement. The DEFAULT function evaluates to the default value of the specified column.
The DEFAULT function must include a column name in the expression list.
You can determine the default value for a column by selecting it without specifying a FROM clause for the statement.
When you specify a SELECT statement that does not also specify a FROM clause, the system always returns only a single row with the default value of the column, regardless of how many rows are in the table. This is similar to the behavior of the TYPE function.
When the column passed as an input argument to the DEFAULT function does not have an explicit default value associated with it, the DEFAULT function evaluates to null.

For more information about using DEFAULT within the SELECT statement, see:

Specifying UDTs in an SQL Request

The rules and restrictions are:

A UDT must have its tosql and fromsql transform functionality defined prior to its use as a column data type for any table. See “CREATE TRANSFORM” in SQL Data Definition Language .
You can specify an expression that returns a UDT in a column list only if its transform group has a fromsql routine. The system automatically converts the expression from its UDT value to the external type via the fromsql routine before returning it to a client application.
The only difference between distinct and structured types in this regard is that Teradata Database generates a transform group with both fromsql and tosql routines by default for distinct types, and its external data type is the source data type, while you must explicitly create the transform group for a structured type. For details on how to create a transform group for a UDT, see “CREATE TRANSFORM” in SQL Data Definition Language .
If you submit a SELECT statement with no table references that contains a UDT expression in its select list, and you have not declared transform functionality for it using a UDT literal declared using a NEW expression, then the request aborts and the system returns an error. For information about the NEW expression, see SQL Functions, Operators, Expressions, and Predicates .

Invoking a Scalar UDF From a SELECT Statement

You must use the alias to reference the result of a scalar UDF invocation that has an alias.

Returning a Varying Column Table External UDF Result

The rules and restrictions are:

You can only include table functions in the FROM clause. See FROM Clause.
The system processes table functions like derived tables.
The SELECT statement syntax supports the implicit specification of the number of return columns at runtime in its TABLE (function_name RETURNS … table_name) clause.
When you reference a table function that does not have a variable number of output columns, the system obtains its explicit output column definitions from DBC.TVFields.
However, when you reference a table function that outputs a dynamic number of columns, you must specify either an output column list or the name of an existing table as part of the function invocation using the RETURNS … table_name option in the FROM clause of the SELECT request.

IF you specify …	THEN the system uses the columns …
an outcome column name list	specified in the list as the function output columns.
the name of an existing table	from the definition for that table as the function output columns.

In either case, the number of columns specified cannot exceed the defined maximum number of output columns specified by the VARYING COLUMNS maximum_output_columns specification of its CREATE FUNCTION (Table Form) definition.

See “CREATE FUNCTION (Table Form)” in SQL Data Definition Language .

To invoke a varying column table function, you can:
- Specify a list of return column name-data type pairs.
- Specify the return columns from an existing table.
  For examples of both forms returning the same result, see Example: Dynamic Row Results Returned by Specifying Table Functions.
You cannot specify LOB columns using the BLOB AS LOCATOR or CLOB AS LOCATOR forms.

SELECT Statements and Derived Period Columns

A SELECT * on a table with a derived period column returns the two DATE or TIMESTAMP columns used to define the derived period column, not the derived period column. The two DATE or TIMESTAMP columns appear as regular columns in a nontemporal table.

You can only specify derived period columns with period predicate operators, begin and end bound functions, duration functions, comparison operators, logical predicate operators, and in the EXPAND clause. For example, you cannot specify derived period columns in WITH or HAVING clauses. You cannot specify derived period columns in functions, macros, or stored procedures such as P_INTERSECT, P_NORMALIZE, LDIFF, or RDIFF, except in a WHERE clause.

You cannot specify derived period columns in the select list.

SELECT and INSERT-SELECT Statements with Set Operators

Only the SELECT and INSERT-SELECT SQL statements can use the set operators UNION, INTERSECT, and MINUS/EXCEPT. The set operators enable you to manipulate the answers to two or more queries by combining the results of each query into a single result set.

The set operators can also be used within these operations:

View definitions
Derived tables
Subqueries
INSERT … SELECT statements

You cannot use the set operators with LOB columns.

Joins

A SELECT statement can reference data in one or more sets of tables and views, including a mix of tables and views. The SELECT statement can define a join of tables or views.

You can join tables and views that have row-level security constraints, as long as the tables and views being joined have identical row-level security constraints. Otherwise, the system returns an error.

For more information about join operations, see “Chapter 2 Join Expressions”.

For information about how the Optimizer processes joins, see SQL Request and Transaction Processing .

Activity Count Indicates the Number of Rows Returned

When the result of a SELECT statement is returned, the activity count success response indicates the number of rows returned. If no rows are returned, then the activity count is 0.

SELECT and Queue Tables

A queue table is similar to an ordinary base table, with the additional unique property of being similar to an asynchronous first-in-first-out (FIFO) queue.

The first column of a queue table always contains Queue Insertion TimeStamp (QITS) values. The QITS value of a row indicates the time the row was inserted into the queue table, unless a different, user-supplied value is inserted.

The CREATE TABLE statement for the table must define the first column with the following data type and attributes:

     TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6)

You can use SELECT statements to perform a FIFO peek on a queue table. In this case, data is returned from the queue table without deleting any of its rows.

If no rows are available, no rows are returned. The transaction does not enter a delay state.

To return the queue in FIFO order, specify the first column of the queue table in the ORDER BY clause of the SELECT statement.

To perform a FIFO pop on a queue table, use a SELECT AND CONSUME statement. For more information, see “SELECT AND CONSUME”.

A SELECT AND CONSUME statement cannot specify a scalar subquery.

For more information about how to define a queue table, see “CREATE TABLE (Queue Table Form)” in SQL Data Definition Language .

Logging Problematic Queries

The recommended best practice for logging problematic DML requests (queries) is to specify XMLPLAN logging whenever you submit the queries. The XMLPLAN option captures (logs) the query plan generated by the Optimizer as an XML document that can be used by Teradata support tools to diagnose performance issues with the query. The query plan is stored in the DBC.DBQLXMLTbl system table. For information on how to capture query plans for DML requests using the XMLPLAN option, see "Using BEGIN QUERY LOGGING to Log Query Plan Information" in SQL Data Definition Language Detailed Topics. For information about the BEGIN QUERY LOGGING statement syntax and descriptions of the statement options, see SQL Data Definition Language Syntax and Examples.