Row Distribution | Teradata Vantage - Primary Index and Skewed Row Distribution - Analytics Database - Teradata Vantage

Database Administration

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
ft:locale
en-US
ft:lastEdition
2024-10-04
dita:mapPath
pgf1628096104492.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
ujp1472240543947
lifecycle
latest
Product Category
Teradata Vantageā„¢

The hash value of the PI for a row determines where the row is stored on the AMP. In a normal environment, hash values are evenly distributed across the nodes and the AMPs within a node.

The less unique the values are for the index, the less evenly the rows of that table are distributed across the AMPs. If a table has a NUPI with thousands of instances of a single value, the table can become skewed.

Performance Effects of a Primary Index

  • The more unique the PI, the more unique the row hash value.
  • The more unique the row hash value, the more even the data distribution across all the AMP vprocs. Even data distribution enhances parallel efficiency on full table scan operations.
  • UPIs generate the most even distribution of table rows across all AMP vprocs.
  • NUPIs generate even row distribution to the same degree that values for a column or columns are unique. Rows with the same row hash always go to the same AMP, whether they are from the same table or from different tables.
To determine the best PI for a table, factor in the:
  • Extent of update activity
  • Number of full table scans
  • Join activity against the PI definition
  • Frequency of PI as a selectivity column and, therefore, a potential access path

If a PI causes uneven data distribution, you should re-specify it according to the primary index design best practices shown in Teradata Vantageā„¢ - Database Design, B035-1094.