Queries and AMPs | SQL Fundamentals | VantageCloud Lake - Queries and AMPs - Teradata VantageCloud Lake

Lake - Working with SQL

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
jbe1714339405530.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
jbe1714339405530

An SQL query, which includes DELETE, INSERT, MERGE, and UPDATE and SELECT, can affect one AMP, multiple AMPs, or all AMPs in the configuration.

Query Effect
Involves a single table and uses a unique primary index (UPI) The row hash can be used to identify a single AMP.

At most one row can be returned.

Involves a single table and uses a nonunique primary index (NUPI) The row hash can be used to identify a single AMP.

Any number of rows can be returned.

Uses a unique secondary index (USI) One or two AMPs are affected (one AMP if the subtable and base table are on the same AMP).

At most one row can be returned.

Uses a nonunique secondary index (NUSI) If the table has a partitioned primary index (PPI) and the NUSI is the same column set as a NUPI, the query affects one AMP.

Otherwise, all AMPs take part in the operation and any number of rows can be returned.

The SELECT statements in subsequent examples reference the following table data.

Employee  
Employee Number Manager Employee Number Dept. Number Job Code Last Name First Name Hire Date Birth Date Salary Amount
PK/UPI FK FK FK
1006 1019 301 312101 Stein John 961005 631015 2945000
1008 1019 301 312102 Kanieski Carol 970201 680517 2925000
1005 0801 403 431100 Ryan Loretta 1061015 650910 3120000
1004 1003 401 412101 Johnson Darlene 1061015 760423 3630000
1007 1005 403 432101 Villegas Arnando 1050102 770131 4970000
1003 0801 401 411100 Trader James 960731 670619 3755000
1016 0801 302 321100 Rogers Nora 980310 690904 5650000
1012 1005 403 432101 Hopkins Paulene 970315 720218 3790000
1019 0801 301 311100 Kubic Ron 980801 721211 5770000
1023 1017 501 512101 Rabbit Peter 1040301 621029 2650000
1083 0801 619 414221 Kimble George 1010312 810330 3620000
1017 0801 501 511100 Runyon Irene 980501 611110 6600000
1001 1003 401 412101 Hoover William 1010818 700114 2552500

The meanings of the abbreviations are as follows.

Abbreviation Meaning
PK Primary Key
FK Foreign Key
UPI Unique Primary Index

Single AMP Request

Assume that a PE gets the following SELECT statement:

SELECT last_name
FROM Employee
WHERE employee_number = 1008;

Because a unique primary index value is used as the search condition (the column employee_number is the primary index for the Employee table), PE1 generates a single AMP step requesting the row for employee 1008. The AMP step, with the PE identification, is put into a message, and sent using the BYNET to the relevant AMP (processor).

Flow Diagram of a Single AMP Request

Only one BYNET is shown to simplify the illustration.


Single AMP request with one BYNET

As shown in the next diagram, AMP2 retrieves the row from disk, includes the row and the PE identification in a return message, and sends the message back to PE1 using the BYNET. PE1 accepts the message and returns the response row to the requesting application.

Flow Diagram of a Single AMP Response to Requesting PE

The following diagram shows a single AMP request with partition elimination.


Single AMP Response to Requesting PE

All AMP Request

Assume PE1 gets a SELECT statement that specifies a range of primary index values as a search condition as shown in the following example:

SELECT last_name, employee_number
FROM employee
WHERE employee_number BETWEEEN 1001 AND 1010
ORDER BY last_name;

Each value hashes differently and all AMPs must search for the qualifying rows.

PE1 first parses the request and creates the following AMP steps:
  • Retrieve rows between 1001 and 1010
  • Sort ascending on last_name
  • Merge the sorted rows to form the answer set

PE1 then builds a message for each AMP step and puts that message onto the BYNET. Typically, each AMP step is completed before the next one begins. However queries may generate parallel steps.

When PE1 puts the message for the first AMP step on the BYNET, that message is broadcast to all processors as the following diagram shows.


Message from PE1 sent to all processors
The process is:
  1. All AMPs accept the message, but the PEs do not.
  2. Each AMP checks for qualifying rows on its disk storage units.
  3. If any qualifying rows are found, the data in the requested columns is converted to the client format and copied to a spool file.
  4. Each AMP completes the step, whether rows were found or not, and puts a completion message on the BYNET.

    The completion messages flow across the BYNET to PE1.

  5. When all AMPs have returned a completion message, PE1 transmits a message containing AMP Step 2 to the BYNET.

    On receipt of Step 2, the AMPs sort their individual answer sets into ascending sequence by last_name, as the following diagram shows.

    If partitioned on employee_number, the scan may be limited based on partition elimination.

    Answer sets sorted by last_name
  6. Each AMP sorts its answer set, then puts a completion message on the BYNET.
  7. After receiving all completion messages for Step 2, PE1 sends a message containing AMP Step 3.
  8. On receipt of Step 3, each AMP copies the first block from its sorted spool to the BYNET.

    Because there can be multiple AMPs on a single node, each node may be required to handle sort spools from multiple AMPs (see the following diagram).


    Single node handles sort spools from multiple AMPs
  9. Nodes with multiple AMPs sort spools generated by local AMPs.

    When the local sort is complete on each node, the lowest sorting row from each node is sent over the BYNET to PE1. From this point on, PE1 acts as the Merge coordinator among all the participating nodes.

  10. The Merge continues with PE1 building a globally sorted buffer.

    When this buffer fills, PE1 forwards the buffer to the application and begins building subsequent buffers.

  11. When a participant node has exhausted its sort spool, the node sends a Done message to PE1.

    This causes PE1 to prune this node from the set of Merge participants.

    When there are no remaining Merge participants, PE1 sends the final buffer to the application with an End Of File message.

Partition Elimination

A PPI can increase query efficiency through partition elimination, where partitions that cannot contain qualifying rows can be skipped.

The database supports the following types of partition elimination:

Type Description
Static Based on constant conditions such as equality or inequality on the partitioning columns.
Dynamic The partitions to eliminate cannot be determined until the query is run and the data is scanned.
Delayed Occurs with conditions comparing a partitioning column to a USING variable or built-in function such as CURRENT_DATE, where the Optimizer builds a generalized plan for the query but delays partition elimination until specific values of USING variables and built-in functions are known.
The degree of partition elimination depends on the:
  • Partitioning expressions for the primary index of the table
  • Conditions in the query
  • Ability of the Optimizer to detect partition elimination
Partition elimination may occur even if a query does not specify all values of the partitioning columns:
  • If a query specifies values for all the primary index columns, the AMP where the rows reside can be determined and only a single AMP is accessed.

    If conditions are not specified on the partitioning columns, each partition can be probed to find the rows based on the hash value.

    If conditions are also specified on the partitioning columns, partition elimination may reduce the number of partitions to be probed on that AMP.

  • If a query does not specify the values for all the primary index columns, an all-AMP full file scan is required for a table with an NPPI.

    However, with a PPI, if conditions are specified on the partitioning columns, partition elimination may reduce an all-AMP full file scan to an all-AMP scan of only the non-eliminated partitions.

Single AMP Request with Partition Elimination

If a SELECT specifies values for all the primary index columns, the AMP where the rows reside can be determined and only a single AMP is accessed.

If conditions are also specified on the partitioning columns, partition elimination may reduce the number of partitions to be probed on that AMP.

Suppose the Employee table is defined with a single-level PPI where the partitioning column is dept_number.

Assume that a PE gets the following SELECT statement:

SELECT last_name
FROM Employee
WHERE employee_number = 1023
AND dept_number = 501;

The following flow diagram shows this process.


Single AMP request with partition elimination

The AMP Step includes the list of partitions (P3) to access. Partition elimination (static partition elimination) reduces access to the partitions that satisfy the query requirements. In each partition in the list (only P3), look for rows with a given row hash value of the PI.

Partition elimination is similar for the Employee table with a multilevel PPI where one partitioning expression uses the dept_number column and another partitioning expression uses the hire_date column.

Assume that a PE gets the following SELECT statement:

SELECT last_name
FROM Employee
WHERE employee_number = 1023
AND dept_number = 501
AND hire_date BETWEEN DATE '2006-01-01' AND DATE '2006-12-31'

The following flow diagram shows this process.


Results of SELECT statement at AMP step

No one was hired in department number 401 in 2006, so partition P4 is empty.