16.10 - Localizing the Work - Teradata Database

Teradata Database Design

Product
Teradata Database
Release Number
16.10
Release Date
June 2017
Content Type
User Guide
Publication ID
B035-1094-161K
Language
English (United States)

After an SQL statement is optimized, the Dispatcher sends the individual steps that make up that query, one at a time, to some number of AMPs in the system. Steps are usually dispatched, and processes started, on all AMPs in the system. Under some conditions, the Optimizer might decide that this query work can be accomplished by accessing only a single AMP in the configuration. This is referred to as a single-AMP operation, or a single-AMP step.

Teradata Database uses single-AMP steps to localize work to the fewest resources necessary when it is beneficial to do so. This is frequently the case for tactical queries. Single-AMP work frees the other AMPs in the system to do other work, thereby increasing the potential for overall system throughput. This is diagrammed in the following figure:



Tactical queries are most efficient and most scalable when designed around single- or few-AMP operations. This is the single most important way to increase throughput and preserve response times for very short tactical queries. This is diagrammed in the following figure:



Single- or few-AMP queries scale well because they engage the same small level of database resources even when the system doubles or triples in size (see Effect of Configuration Expansion on Tactical Query Response Times).

Single-AMP Operations Using a Primary Index

Single-AMP operations use a primary index value to locate a row.



Single-AMP operations can be achieved by any of the following data manipulation operations:

  • Simple single-row inserts
  • Simple selects, updates and deletes qualified with a primary index value
  • Joins between two tables that share one of the following characteristics:
    • the same primary index domain
    • a primary index making up the join constraint
    • a single primary index value is specified for the join

The following EXPLAIN report is for a query that accesses the supplier table using the single primary index value s_suppkey = 583. Only a single AMP is engaged, as demonstrated by the single-AMP RETRIEVE step reported in step 1 of the EXPLAIN text:

     EXPLAIN
     SELECT s_name, s_acctbal
     FROM supplier
     WHERE s_suppkey = 583;

Explanation
------------------------------------------------------------------------
  1) First, we do  a single-AMP RETRIEVE step  from TPCD50G.supplier by
     way of the unique primary index "TPCD50G.supplier.S_SUPPKEY = 583"
     with no residual conditions. The estimated time for this step is
     0.03 seconds.

Notice that there are no references to locking in the EXPLAIN report for this query. That is because the Optimizer has folded the locking activity (in this case a single row hash READ lock) into the same step that retrieves the row. This sort of lock folding is done only with row hash locks.

This special shortcut for handling row hash locks eliminates the need for the Dispatcher to dispatch a separate locking step when only one AMP and one row are involved. This reduces the PE-to-AMP communication effort.

Two-AMP Operations: USI Access and Tactical Queries

When an application specifies a value that can be used to access a table using its USI, a 2-AMP operation results. Note that USI access can be a single-AMP operation if the USI value for a row happens to hash to a subtable on the same AMP as the primary index for the same row, but is never more than a 2-AMP operation.



In the following query, the column s_name is defined as a USI on the original supplier table having s_suppkey as its UPI:

     EXPLAIN
     SELECT s_suppkey, s_acctbal
     FROM supplier
     WHERE s_name = 'Supplier#000038729';

Explanation
------------------------------------------------------------------------
  1) First, we do  a two-AMP RETRIEVE step  from CAB.supplier by way
     of unique index # 8 "CAB.supplier.S_NAME = 'Supplier#000038729'" 
     with no residual conditions. The estimated time for this step is 
     0.07 seconds.

NUSI Access and Tactical Queries

Compare the previous example of USI access to access using a NUSI. With NUSIs, each AMP contains an index structure for the base table rows that it owns. Even if only a single row contains the specified NUSI value, the Optimizer cannot know that a priori, so the NUSI subtables for all AMPs are always searched, making NUSI accesses all-AMPs operations. The only exceptions are when a NUSI is on the same columns as the primary index (such as when the primary index is row-partitioned) or the NUSI is on the same columns as the primary AMP index. In these cases, only a single-AMP search is required.



The following example assumes the s_name column is defined as a NUSI on supplier. Notice the difference between this EXPLAIN report and the previous one for the USI (see Two-AMP Operations: USI Access and Tactical Queries):

     EXPLAIN
     SELECT s_suppkey, s_acctbal
     FROM supplier
     WHERE s_name = 'SUPPLIER#000000647';

     Explanation
     ------------------------------------------------------------------
       1) First, we lock TPCD50G for read on a RowHash (proxy lock)
          to prevent global deadlock for TPCD50G.supplier.
       2) Next,  we lock TPCD50G.supplier for read.
       3) We do  an all-AMPs RETRIEVE step  from TPCD50G.supplier by way
          of index # 8 "TPCD50G.supplier.S_NAME = 'SUPPLIER#000000647'" with
          no residual conditions into Spool 1, which is built locally on the
          AMPs. The size of Spool 1 is estimated with high confidence to be
          1 row. The estimated time for this step is 0.20 seconds.
       4) Finally, we send out an END TRANSACTION step to all AMPs involved
          in processing the request.

With NUSI access, table-level locks are applied and an all-AMPs operation is performed. Value-ordered indexes and covering indexes are variants of a NUSI. Because of that, they also require all-AMPs operations that are similar to NUSI access.

Group AMP Operations

Sometimes less parallelism is best. Tactical queries, for example, are more efficient when they engage fewer resources. On the other hand, short queries often require more than one AMP and cannot be accommodated with single-AMP processing alone. The Optimizer looks for opportunities to transform what would otherwise be all-AMP query plan into a few-AMP plan. This few-AMPs approach is called Group AMP.

The Group AMP approach not only reduces the number of AMPs active in supporting a query, it also reduces the locking level from a table-level lock to one or more rowkey (specifically, partition and rowhash) locks in each AMP in the group. Removing the need for table-level locks eliminates two all-AMP steps from the query plan:

  • A step to place the table-level lock on all AMPs.
  • A step to remove the table-level lock from all AMPs when the query completes.

Determining When the Optimizer Will Consider Group AMP Processing

The Optimizer considers several criteria when it determines whether a query is a candidate for Group AMP processing or not:

  • How many AMPs needed to satisfy the request?

    For most systems this number of participating AMPs must be 50% or fewer of the total number of AMPs configured in the system. Statistics collected on the selection and join columns being referenced in the query help the Optimizer make the correct assessment. If no statistics for these columns exist, the Group AMP option is less likely to be chosen. This makes the regular collection of statistics a critical prerequisite for gaining the advantages of the Group AMP feature.

  • Can the query be driven from a single-AMP or Group AMP step as the first meaningful database access step?

    Because NUSI access is always an all-AMP activity, accessing rows by means of a NUSI as the first step in the query plan eliminates Group AMP considerations later in the plan, even if the NUSI access returns only one row from one AMP.

  • Does a product join based on table duplication appear in any step in the query plan?

    If so, the Group AMP option is not considered. Under this condition, a product join is always an all-AMP operation. The existence of a single all-AMP operation in the plan negates the possibility of using a Group AMP later.

Few-AMP Joins and Tactical Queries

Even when joins must be made to process a tactical query, it is possible to make the join with few-AMP operations. The following picture illustrates the combinations of accessing and joining that can be performed engaging one or few AMPs. In all of these few-AMP examples, the first table is accessed using either a UPI or a USI.

The first join in the following example is made using a merge join. Because the primary index columns of both tables share the same domain, their associated rows reside on the same AMP, with the second table being joined based on a NUPI. The remaining joins use combinations of primary index and USI access and perform few-AMP nested joins between the tables.



One idiosyncrasy of EXPLAIN text is that it sometimes says “we do a two-AMP join step” when a USI is defined on one half of the nested join. Three AMPs are engaged for most such joins, not two.

Tactical Queries Benefit From Nested Joins

All AMPs are engaged for product and hash joins, and also for most merge joins. For this reason these join methods are less desirable for more localized queries. Tactical queries benefit from the few-AMP joins described in All-AMP Queries.

Of these join methods, the most beneficial for tactical queries is the nested join, which under specific conditions involves only a few AMPs.

A few-AMP nested join is possible when the first table can be accessed by a UPI or USI specified by the request. Following that, a foreign key within that first table must be available to drive a nested join into a second table, which contains its associated primary key. For this nested join to involve only one or a few AMPs, the second table must have either a primary index or a USI defined on the column set mapped to by the foreign key in the first table, as demonstrated by the following graphic: