Workload Characteristics and Partitioning
Consider the following points when you define the partitioning for a table or join
At minimum, you must choose a primary index (or specify no primary index) to achieve
an even distribution of rows to the AMPs.
When appropriate, the primary index column set ought to be constructed from columns
that are often constrained by equality conditions in queries in order to provide fast
access, joins, and aggregations on those columns.
If the partitioning column set is included in the primary index column set, some concerns
are not an issue. However, you would not add the partitioning columns to the primary
index to avoid these concerns anyway, because there is usually little, if any, benefit
in doing so.
If the partitioning columns are included in the set of primary index columns, then
you can specify the primary index to be UNIQUE.
Adding a partitioning column as a primary index column just to make the primary index
unique is not effective, however, because it is the original set of primary index columns that needs to be unique and possibly used for access,
joins, and aggregations.
If the partitioning columns are also included in the set of primary index columns,
it might be a good choice to define many combined partitions for primary index access
and joins. Be aware that plan costing can be affected if there are too many empty
combined partitions if PARTITION statistics are not collected. Other factors can also
lead to having fewer combined partitions.
If all of the partitioning columns are not included in the primary index column set,
the primary index cannot be defined as a UPI.
If the primary index columns must be unique, define a USI on them.
Because MultiLoad and FastLoad do not support USIs, you must use another load strategy
such as any of the following:
Loading the rows using Teradata Parallel Data Pump (see Teradata Parallel Data Pump Reference).
Loading the rows into a staging table followed by an INSERT … SELECT or MERGE into
the target table using error logging (see “CREATE ERROR TABLE” in SQL Data Definition Language and “INSERT … SELECT” and “MERGE” in SQL Data Manipulation Language).
Dropping the USI, loading the rows using MultiLoad or FastLoad, and then recreating
the USI (see Teradata FastLoad Reference and Teradata MultiLoad Reference).
Including the partitioning columns in the PI, noting the previously documented problems
with this approach.
You must evaluate the tradeoffs among these choices carefully.
If the primary index columns need to be an efficient access path and there are many
combined partitions, consider one of these options:
Defining a USI on the primary index columns to improve access time.
Defining a NUSI on the primary index columns to improve access time.
Creating a join index to cover queries made against the table.
Creating a hash index to cover queries made against the table.
Defining a USI or NUSI on the primary index columns to improve access time.
Consider defining fewer combined partitions when a table is also accessed or joined
on the primary index.