15.00 - Guidelines for Queries, Contexts, and Table Maintenance for Column Partitioning - Teradata Database

Teradata Database Design

Teradata Database
User Guide

Guidelines for Queries, Contexts, and Table Maintenance for Column Partitioning

  • The optimal application for column‑partitioned tables is large fact tables or call detail tables that are accessed frequently by analytic workloads.
  • Workloads that rely heavily on joins and aggregations based on a primary index are not suited for column partitioning because column‑partitioned tables do not have a primary index.

    A join index might provide acceptable performance for requests that depend on primary index access, while column partitioning the base table could enable superior performance for other classes of workloads.

    Use column partitioning for tables accessed by analytic queries where the tables are refreshed using bulk loading techniques periodically with possibly some interim modifications.

    Do not use column‑partitioned tables for highly volatile data, for cases in which a table is lightly populated with rows, or if the data is to be used in tactical query workloads.

  • To facilitate query performance, you should make sure that you follow at least one, if not both, of the following guidelines.
  • Ensure that the number of column partitions accessed by a request does not exceed the number of available column partition contexts.
  • Write the request in such a way that it is highly selective.
  • Ideally, you should ensure that both of these guidelines are followed.

    Requests in workloads that heavily access column‑partitioned tables and do not conform to this recommendation should be minimized, because their performance is likely to be degraded.

    To support acceptable performance for queries that requests that have the desired characteristics, you should employee physical database design options like secondary and join indexes when possible. Before you employee indexes, you must understand that additional maintenance costs are always incurred because Teradata Database must update index subtables any time the base table columns they are defined on are updated.

  • Because neither nonpartitioned NoPI tables nor column‑partitioned tables have a primary index, they are not good choices if typical queries run against them would benefit from primary‑indexed access. In this case, consider indexing the table using secondary or join indexes once its initial configuration is determined not to achieve adequate performance for its typical workloads.
  • Always measure the INSERT cost for tables that are candidates for column partitioning.
  • Do not use column partitioning when the increased cost of inserting data is not acceptable or is not offset by improved performance in the query workload.

  • Because you should not use column‑partitioned tables for highly volatile data, apply UPDATE operations to column‑partitioned tables sparingly.
  • Perform DELETE operations on a column‑partitioned table either for the entire table or for entire row partitions.
  • The Optimizer uses the DBS Control parameter PPICacheThrP to determine the number of available file contexts that can be used at one time to access a partitioned table.
  • If the number of available file contexts determined by PPICacheThrP is less than 8, then 8 files contexts are available. If the number of file contexts specified by PPICacheThrP is more than 256, then 256 file contexts are available. For a column‑partitioned database object, Teradata Database uses the number of file contexts as the number of available column partition contexts.

    A file context can be associated with each column partition context for some operations, but in other cases, Teradata Database might allocate a buffer to be associated with each column partition context.

    The ideal number of column partition contexts should be at least equal to the number of column partitions that need to be accessed by a query; otherwise, performance can degrade since not all the needed column partitions can be read at one time. Performance and memory usage can be impacted if PPICacheThrP is set too high, because too high a setting can lead to memory thrashing or even a system crash.

    At the same time, the benefits of partitioning can be lessened if PPICacheThrP is set unnecessarily low, and performance might degrade significantly. The default setting for this parameter is expected to be optimal for most workloads; however, after monitoring performance and memory usage, you might need to adjust the setting to obtain the best balance.

  • You should periodically refresh or append new rows to a column‑partitioned table, or to the row partitions of the column‑partitioned table, using INSERT … SELECT requests that move large quantities of data.
  • Date or timestamp can partitioning may help to improve column‑partitioned table maintenance.

    See “Bulk Loading a Column‑Partitioned Table” on page 318 for more information.