Primary Index and Skewed Row Distribution - Teradata Database

Teradata Database Administration

Product
Teradata Database
Release Number
15.10
Language
English (United States)
Last Update
2018-10-06
Product Category
Software

Primary Index and Skewed Row Distribution

The hash value of the PI for a row determines where the row is stored on the AMP. In a normal environment, hash values are evenly distributed across the nodes and the AMPs within a node.

The less unique the values are for the index, the less evenly the rows of that table are distributed across the AMPs. If a table has a NUPI with thousands of instances of a single value, the table can become skewed.

Performance Effects of a Primary Index

  • The more unique the PI, the more unique the row hash value.
  • The more unique the row hash value, the more even the data distribution across all the AMP vprocs. Even data distribution enhances parallel efficiency on full table scan operations.
  • UPIs generate the most even distribution of table rows across all AMP vprocs.
  • NUPIs generate even row distribution to the same degree that values for a column or columns are unique. Rows with the same row hash always go to the same AMP, whether they are from the same table or from different tables.
  • To determine the best PI for a table, factor in the:

  • Extent of update activity
  • Number of full table scans
  • Join activity against the PI definition
  • Frequency of PI as a selectivity column and, therefore, a potential access path
  • If a PI causes uneven data distribution, you should re-specify it according to the primary index design best practices shown in Database Design.