The hash value of the PI for a row determines where the row is stored on the AMP. In a normal environment, hash values are evenly distributed across the nodes and the AMPs within a node.
The less unique the values are for the index, the less evenly the rows of that table are distributed across the AMPs. If a table has a NUPI with thousands of instances of a single value, the table can become skewed.
Performance Effects of a Primary Index
- The more unique the PI, the more unique the row hash value.
- The more unique the row hash value, the more even the data distribution across all the AMP vprocs. Even data distribution enhances parallel efficiency on full table scan operations.
- UPIs generate the most even distribution of table rows across all AMP vprocs.
- NUPIs generate even row distribution to the same degree that values for a column or columns are unique. Rows with the same row hash always go to the same AMP, whether they are from the same table or from different tables.
- Extent of update activity
- Number of full table scans
- Join activity against the PI definition
- Frequency of PI as a selectivity column and, therefore, a potential access path
If a PI causes uneven data distribution, you should re-specify it according to the primary index design best practices shown in Teradata Vantage™ - Database Design, B035-1094.