Skewed Query Processing Across AMPs - Teradata Database - Teradata Vantage NewSQL Engine

Teradata Vantage™ - Database Administration

Product
Teradata Database
Teradata Vantage NewSQL Engine
Release Number
16.20
Published
March 2019
Language
English (United States)
Last Update
2019-05-03
dita:mapPath
tgx1512080410608.ditamap
dita:ditavalPath
TD_DBS_16_20_Update1.ditaval
dita:id
ujp1472240543947
Product Category
Software
Teradata Vantage

AMP vprocs always run in parallel, but the way data rows or column values are striped across the disks affect the parallel operation of AMP step processing.

Unbalanced, or skewed, or spiked, disk loads can cause one or a few AMPs to be doing most of the I/Os. For example, when a numeric column allows zeros and/or nulls, the majority of rows might hash to the same AMP.

If your disk loads are poorly balanced, discuss with operations ways to correct the situation. For example:

  • Perhaps queries or views against a column with zeros/nulls could use “WHERE NOT NULL” or “NOT= 0” qualifiers.
  • If the cause is a NUPI, consider redefining the index, especially for a very large table, to achieve a higher percentage of uniqueness.

Common Skew Issues

Issue Cause Results
Same row hash value for an excessive number of rows
  • Rows with same row hash value cannot fit in a single data block.
  • Rows spill over into additional data blocks.
The table is skewed due to a highly nonunique primary index (PI).

As an estimate, more than 1000 occurrences/NUPI value begin to cause performance degradation problems. This figure is based on all the rows for the same NUPI value spilling over into more than five data blocks.

See Identifying and Fixing Skewed Tables.

  • Increased I/Os for updates.
  • Increased compares for inserts and FastLoad (more Central Processing Unit (CPU) and I/Os).
  • Performance degradation in the Restore and Table Rebuild utilities.
The data block size is set too low.

See Managing Data Block Usage.

Data block size should not be a problem on systems using block-level compression because the size should be set to the maximum.
Some AMPs have many more rows of a table than do other AMPs.
  • One or a few NUPI values have many more rows than all the other NUPI values.
  • A highly skewed join field in a join or aggregation, for example a join on a city code in a customer table where a large number of customers may be located in a few cities.
  • A highly skewed referencing column (for instance, a city code) when using referential integrity (RI).
  • Poor CPU parallel efficiency on the AMPs during full table scans and bulk inserts.
  • Increased I/O for updates, and increased compares for inserts (also more I/O).