Guidelines for Keeping NUPI Duplicates Below 100 Rows Per Value - Teradata Database

Teradata Database Design


As the previous table shows, once more than a very few rows share a NUPI value, the number of logical reads required to insert additional duplicate rows becomes prohibitive when duplicate row checks are performed.
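The growth in logical reads can be sketched with a toy model (the numbers and the model itself are illustrative assumptions, not Teradata internals): if inserting the next duplicate requires re-reading every data block that already holds rows with the same row hash, the total read count grows roughly quadratically with the number of duplicates.

```python
import math

# Toy model (an illustrative assumption, not Teradata's actual algorithm):
# inserting the next duplicate NUPI row re-reads every data block that
# already holds rows with the same row hash, so total reads grow roughly
# quadratically with the duplicate count.

def total_logical_reads(n_rows, rows_per_block=100):
    """Block reads needed to insert n_rows duplicate rows one at a time."""
    total = 0
    for existing in range(n_rows):
        # At least one block is read even when no duplicates exist yet.
        total += max(1, math.ceil(existing / rows_per_block))
    return total

for n in (10, 100, 1000):
    print(n, total_logical_reads(n))
```

With 100 rows per block, the first 100 inserts each cost one read, but by 1,000 duplicates the cumulative read count has grown far faster than linearly.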

Before it can insert a new NUPI duplicate row, the file system must read all the rows on an AMP that have the same row hash in order to determine their uniqueness values. Beyond 100 rows per value, the performance decrement created by that activity becomes significant, so you should always limit NUPI duplicates to fewer than 100 rows per value for any table. This figure depends on data block and row sizes: the larger the row, the fewer rows fit into a single data block; similarly, the larger the data block, the more rows, on average, fit into it. Keep this in mind as you read this section, and realize that the 100 rows per value measure is a relative, not an absolute, criterion.
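Because the threshold is relative to block and row sizes, a quick ceiling-division sketch shows how the same 100 duplicates can occupy very different numbers of data blocks (the byte sizes below are illustrative assumptions, not Teradata defaults):

```python
def blocks_for_duplicates(num_duplicates, block_size_bytes, row_size_bytes):
    """Approximate data blocks occupied by rows sharing one NUPI value."""
    rows_per_block = block_size_bytes // row_size_bytes
    return -(-num_duplicates // rows_per_block)  # ceiling division

# 100 duplicates with small rows fit in a single block...
print(blocks_for_duplicates(100, 65_536, 500))    # -> 1
# ...but with large rows the same 100 duplicates spill across many blocks.
print(blocks_for_duplicates(100, 65_536, 4_000))  # -> 7
```

The larger the rows relative to the block, the fewer duplicates it takes before the rows for one value spill across multiple blocks, all of which must be read on each duplicate check.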

The figure of 100 duplicates per NUPI value is based on the overwhelming likelihood that the duplicate rows will spill over into more than five data blocks. When the number of rows per NUPI value exceeds 100, performance degrades significantly. The specific operations affected are described in the following table.

                             Cause of Performance Degradation

Increased I/O

  • Inserts
  • FastLoads
  • Increased comparisons, resulting in greater I/O activity and CPU usage

Note that performance for the following operations is also degraded significantly when the number of duplicate NUPI row values exceeds 100:

  • Restore operation of the Archive/Recovery utility
  • Table rebuilds using the Table Rebuild utility

These performance problems persist even in the unlikely case where every NUPI value in a table has the same number of duplicates, thus eliminating skew as a factor in the performance decrement.
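To check a table against this guideline, you would typically aggregate on the primary-index column (for example, a standard SQL GROUP BY on the index column with HAVING COUNT(*) > 100). A minimal Python sketch of the same check, with the column values stood in by a plain list (the data and function name are hypothetical, for illustration only):

```python
from collections import Counter

def heavy_nupi_values(pi_values, limit=100):
    """Return the NUPI values whose duplicate count exceeds the guideline."""
    counts = Counter(pi_values)
    return {value: n for value, n in counts.items() if n > limit}

# Example: value 'A' occurs 150 times (over the guideline), 'B' only 3.
sample = ['A'] * 150 + ['B'] * 3
print(heavy_nupi_values(sample))  # -> {'A': 150}
```

Any value reported by such a check is a candidate for redesign, for example by adding a column to the primary index to spread the duplicates across more hash values.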