Analysis of Row Partitioning Benefits - Teradata VantageCloud Lake

Lake - Database Reference

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
ohi1683672393549.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
ohi1683672393549

The partitioning definition used for Scenario 1 partitioned by month because queries use months as their basic time unit. Another option is to attempt to optimize access further by changing the granularity of the partitioning scheme by partitioning at a finer level. Suppose the DBA considers partitioning the table by day instead of by month. The table now has about 760 partitions (25 months), with those corresponding to future dates in the current month being empty.

The results of telescoping the partition granularity for this table and query workload look like this:

  • The effort to delete the oldest month of data is the same as for the same table defined with 25 primary index partitions. The run time for that job is also the same. Making the simplifying assumption that all months have 30 days, the DBA must delete 30 smaller partitions with the oldest month of data instead of deleting one larger partition, but the same number of rows are deleted, and the elapsed time to delete the rows is the same.
  • The nightly inserts benefit from the finer partitioning because instead of being concentrated in one or two partitions out of 25, the inserted rows are now directed to three, four, or at most five partitions out of 760; well under 1% of the partitions. Most of the inserts are directed to one partition, the one containing the activity for the day newly completed.
  • The short-running primary index access queries are not impacted either way by having 760 partitions instead of 25.
  • The ad hoc queries that typically analyze two or three months of data are also little changed, now accessing 60 out of 760 partitions rather than 2 out of 25, the same percentage of the defined partitions for the primary index. Queries varying by the time of month, however, may realize performance gain with the larger number of partitions. For example, a query submitted on the fourth day of the month may analyze the current day and the previous 33 days of data, while a query submitted later in the month may restrict the analysis to that calendar month. The 34 day query involves 68 out of 760 partitions, instead of 4 out of 25, a significantly smaller percentage.
  • The analysis queries examining 24 months of data run in about the same time, because the queries touch most of the rows in the table in either case.
  • The joins are not impacted by the number of partitions because there are no direct joins against this table.