Scenario 4 - Teradata VantageCloud Lake

Lake - Database Reference

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ohi1683672393549.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
ohi1683672393549

The previous scenarios show situations where a row-partitioned table is the obvious choice to enhance the performance of a query workload. This scenario examines a more ambiguous situation in which there are more tradeoff considerations and you cannot determine in advance one correct solution for all specific instances of the scenario.

An invoice table contains data about each invoice issued in the past four years. The unique primary index is invoice number. New rows are added nightly, using Teradata Parallel Transporter Update Operator, and the oldest month of data is deleted once each month.

Query Workload

A moderately heavy volume of queries requests information about one invoice. There are also ad hoc analysis queries that examine all invoices for a period of time, typically less than one year. Other tables have invoice number as their primary index, but do not have an invoice date column, so there are frequent joins with those other tables.

Problem Statement

The DBA wonders whether to partition the invoice table on invoice date using one-month ranges.

The primary index is defined as unique, but must be redefined as nonunique if the table is row-partitioned. There is a business requirement to guarantee that invoice numbers are unique, so the DBA must define a uniqueness constraint on the invoice number column. This uniqueness constraint, if added, creates an additional secondary index on the table (other than UPIs, all uniqueness constraints are implemented internally as USIs whether specified explicitly as a UNIQUE constraint, a PRIMARY KEY constraint, or a USI constraint. See Using Unique Secondary Indexes to Enforce Row Uniqueness), which increases processing on insert, delete, and update operations and requiring additional disk capacity to store the resulting secondary index subtable. The base table is also larger by two bytes per row, further increasing the required disk space.

Analysis of Partitioning Benefits

The primary index access queries that were run against the nonpartitioned version of this table must be reformulated to use the USI to access the row. Usually, accessing a row takes two to three times longer using a USI instead of a UPI. However, UPI access is a fast operation, so doubling or tripling the time may barely be noticeable to the users who issue those queries.

Without row partition elimination, direct merge joins require more memory and CPU usage and may be slower than a similar nonpartitioned table. The extent of performance degradation depends on query conditions, number of partitions that can be excluded, and the join plan chosen by the Optimizer. Actual measurement of representative queries is necessary to determine the overall difference in performance.

The nightly inserts benefit in the same way, and for the same reasons, as in Scenario 1. However, the additional index on invoice number partially offsets that benefit. The same considerations apply to the monthly delete operations.

The ad hoc queries examining months of invoices benefit in the same way as in Scenario 1. The benefit is greatest when fewer months are examined.

Does converting the invoice table to a partitioned table improve performance? The DBA must measure the degree of improvement and the extent of degradation in the types of query, and use that analysis to determine how much each query type contributes to the overall workload involving this table. This exercise produces a good estimate of the comparative workload performance against the table with and without partitioning.

If the measured performance difference between the otherwise equivalent partitioned and nonpartitioned tables is substantial, the choice may appear to be obvious. However, you must also weigh the relative importance to the enterprise of the activities in the workload.

For example, consider the following contingencies:
  • If the time required to perform the nightly volume of bulk inserts is beginning to exceed the time allotted for inserting new rows, then even a small improvement in load time may be sufficiently important to offset larger degradations in other aspects of the query workload.
  • Similarly, if the response time of the PI-access queries is critical, even a small performance degradation may be unacceptable, whether net workload performance is improved or not.

Conclusions

The decision whether to implement a table with or without partitioning can often be more of an optimization than a maximization. Careful and considered measurement, analysis, and evaluation are all required to make an optimal decision.