Queries and AMPs | SQL Fundamentals | Teradata Vantage - 17.10 - Queries and AMPs - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL Fundamentals

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B035-1141-171K
Language
English (United States)

An SQL query, which includes DELETE, INSERT, MERGE, and UPDATE as well as SELECT, can affect one AMP, several AMPs, or all AMPs in the configuration.

IF a query … THEN …
(involving a single table) uses a unique primary index (UPI) the row hash can be used to identify a single AMP.

At most one row can be returned.

(involving a single table) uses a nonunique primary index (NUPI) the row hash can be used to identify a single AMP.

Any number of rows can be returned.

uses a unique secondary index (USI) one or two AMPs are affected (one AMP if the subtable and base table are on the same AMP).

At most one row can be returned.

uses a nonunique secondary index (NUSI) if the table has a partitioned primary index (PPI) and the NUSI is the same column set as a NUPI, the query affects one AMP.

Otherwise, all AMPs take part in the operation and any number of rows can be returned.

The SELECT statements in subsequent examples reference the following table data.

Employee  
Employee Number Manager Employee Number Dept. Number Job Code Last Name First Name Hire Date Birth Date Salary Amount
PK/UPI FK FK FK
1006 1019 301 312101 Stein John 961005 631015 2945000
1008 1019 301 312102 Kanieski Carol 970201 680517 2925000
1005 0801 403 431100 Ryan Loretta 1061015 650910 3120000
1004 1003 401 412101 Johnson Darlene 1061015 760423 3630000
1007 1005 403 432101 Villegas Arnando 1050102 770131 4970000
1003 0801 401 411100 Trader James 960731 670619 3755000
1016 0801 302 321100 Rogers Nora 980310 690904 5650000
1012 1005 403 432101 Hopkins Paulene 970315 720218 3790000
1019 0801 301 311100 Kubic Ron 980801 721211 5770000
1023 1017 501 512101 Rabbit Peter 1040301 621029 2650000
1083 0801 619 414221 Kimble George 1010312 810330 3620000
1017 0801 501 511100 Runyon Irene 980501 611110 6600000
1001 1003 401 412101 Hoover William 1010818 700114 2552500

The meanings of the abbreviations are as follows.

Abbreviation Meaning
PK Primary Key
FK Foreign Key
UPI Unique Primary Index

Single AMP Request

Assume that a PE receives the following SELECT statement:

SELECT last_name
FROM Employee
WHERE employee_number = 1008;

Because a unique primary index value is used as the search condition (the column employee_number is the primary index for the Employee table), PE1 generates a single AMP step requesting the row for employee 1008. The AMP step, along with the PE identification, is put into a message, and sent via the BYNET to the relevant AMP (processor).

Flow Diagram of a Single AMP Request

Only one BYNET is shown to simplify the illustration.


Single AMP request with one BYNET

Assuming that AMP2 has the row, it accepts the message. As shown in the next diagram, AMP2 retrieves the row from disk, includes the row and the PE identification in a return message, and sends the message back to PE1 via the BYNET. PE1 accepts the message and returns the response row to the requesting application.

Flow Diagram of a Single AMP Response to Requesting PE

The following diagram illustrates a single AMP request with partition elimination.


Single AMP Response to Requesting PE

All AMP Request

Assume PE1 receives a SELECT statement that specifies a range of primary index values as a search condition as shown in the following example:

SELECT last_name, employee_number
FROM employee
WHERE employee_number BETWEEEN 1001 AND 1010
ORDER BY last_name;

In this case, each value hashes differently, and all AMPs must search for the qualifying rows.

PE1 first parses the request and creates the following AMP steps:
  • Retrieve rows between 1001 and 1010
  • Sort ascending on last_name
  • Merge the sorted rows to form the answer set

PE1 then builds a message for each AMP step and puts that message onto the BYNET. Typically, each AMP step is completed before the next one begins; note, however, that some queries can generate parallel steps.

When PE1 puts the message for the first AMP step on the BYNET, that message is broadcast to all processors as illustrated in the following diagram.


Message from PE1 sent to all processors

The process is:

  1. All AMPs accept the message, but the PEs do not.
  2. Each AMP checks for qualifying rows on its disk storage units.
  3. If any qualifying rows are found, the data in the requested columns is converted to the client format and copied to a spool file.
  4. Each AMP completes the step, whether rows were found or not, and puts a completion message on the BYNET.

    The completion messages flow across the BYNET to PE1.

  5. When all AMPs have returned a completion message, PE1 transmits a message containing AMP Step 2 to the BYNET.

    Upon receipt of Step 2, the AMPs sort their individual answer sets into ascending sequence by last_name, as illustrated in the following diagram.

    If partitioned on employee_number, the scan may be limited to a few partitions based on partition elimination.

    Answer sets sorted by last_name
  6. Each AMP sorts its answer set, then puts a completion message on the BYNET.
  7. When PE1 has received all completion messages for Step 2, it sends a message containing AMP Step 3.
  8. Upon receipt of Step 3, each AMP copies the first block from its sorted spool to the BYNET.

    Because there can be multiple AMPs on a single node, each node might be required to handle sort spools from multiple AMPs (see the following diagram).


    Single node handles sort spools from multiple AMPs
  9. Nodes that contain multiple AMPs must first perform an intermediate sort of the spools generated by each of the local AMPs.

    When the local sort is complete on each node, the lowest sorting row from each node is sent over the BYNET to PE1. From this point on, PE1 acts as the Merge coordinator among all the participating nodes.

  10. The Merge continues with PE1 building a globally sorted buffer.

    When this buffer fills, PE1 forwards it to the application and begins building subsequent buffers.

  11. When a participant node has exhausted its sort spool, it sends a Done message to PE1.

    This causes PE1 to prune this node from the set of Merge participants.

    When there are no remaining Merge participants, PE1 sends the final buffer to the application along with an End Of File message.

Partition Elimination

A PPI can increase query efficiency through partition elimination, where partitions can automatically be skipped because they cannot contain qualifying rows.

The database supports several types of partition elimination.

Type Description
Static Based on constant conditions such as equality or inequality on the partitioning columns.
Dynamic The partitions to eliminate cannot be determined until the query is executed and the data is scanned.
Delayed Occurs with conditions comparing a partitioning column to a USING variable or built-in function such as CURRENT_DATE, where the Optimizer builds a somewhat generalized plan for the query but delays partition elimination until specific values of USING variables and built-in functions are known.
The degree of partition elimination depends on the:
  • Partitioning expressions for the primary index of the table
  • Conditions in the query
  • Ability of the Optimizer to detect partition elimination

It is not always required that all values of the partitioning columns be specified in a query to have partition elimination occur.

IF a query … THEN …
specifies values for all the primary index columns the AMP where the rows reside can be determined and only a single AMP is accessed.
  • If conditions are not specified on the partitioning columns, each partition can be probed to find the rows based on the hash value.
  • If conditions are also specified on the partitioning columns, partition elimination may reduce the number of partitions to be probed on that AMP.
does not specify the values for all the primary index columns an all-AMP full file scan is required for a table with an NPPI.

However, with a PPI, if conditions are specified on the partitioning columns, partition elimination may reduce an all-AMP full file scan to an all-AMP scan of only the non-eliminated partitions.

Single AMP Request With Partition Elimination

If a SELECT specifies values for all the primary index columns, the AMP where the rows reside can be determined and only a single AMP is accessed.

If conditions are also specified on the partitioning columns, partition elimination may reduce the number of partitions to be probed on that AMP.

Suppose the Employee table is defined with a single-level PPI where the partitioning column is dept_number.

Assume that a PE receives the following SELECT statement:

SELECT last_name
FROM Employee
WHERE employee_number = 1023
AND dept_number = 501;

The following flow diagram illustrates this process.


Single AMP request with partition elimination

The AMP Step includes the list of partitions (in this case, P3) to access. Partition elimination (in this case, static partition elimination) reduces access to the partitions that satisfy the query requirements. In each partition in the list (in this case, only P3), look for rows with a given row hash value of the PI.

Partition elimination is similar for the Employee table with a multilevel PPI where one partitioning expression uses the dept_number column and another partitioning expression uses the hire_date column.

Assume that a PE receives the following SELECT statement:

SELECT last_name
FROM Employee
WHERE employee_number = 1023
AND dept_number = 501
AND hire_date BETWEEN DATE '2006-01-01' AND DATE '2006-12-31'

The following flow diagram illustrates this process.


Results of SELECT statement at AMP step

No one was hired in department number 401 in 2006, so partition P4 is empty.

Related Information

For more information on partition elimination, see Teradata Vantage™ - Database Design, B035-1094.