Performance Issues for Column-Partitioned Tables - Analytics Database - Teradata Vantage

SQL Data Definition Language Detailed Topics

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2023-07-11
dita:mapPath
vuk1628111288877.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
B035-1184
lifecycle
latest
Product Category
Teradata Vantage™
Column partitioning is a physical database design choice that can improve the performance of some classes of workloads. Columnar partitioning is expected to have the following performance impacts.
  • You should expect a significant I/O performance improvement for queries that access a variable small subset of the columns either in predicates or as projections, and rows of a column-partitioned table.

    For example, if 20% of the data in rows is needed to satisfy a query, the I/O for a column-partitioned table should be approximately 20% of the I/O for a table without column partitioning.

    I/O might be further reduced depending on the effectiveness of autocompression. Additional I/O might be needed to reconstruct rows if many columns are projected or used in predicates.

  • A potentially significant negative performance impact can occur for requests that access more than a small subset of the columns, rows, or both of a column-partitioned table.
  • CPU usage might increase for handling autocompression, decompression, and containers. With a reduction in I/O and a possible increase in CPU use, workloads can change from being I/O-bound to being CPU-bound.

    The performance of CPU-bound workloads might not improve with column partitioning.

  • A potentially significant negative performance impact on INSERT operations for a column-partitioned table or join index, especially for single-row INSERT operations, but less so for bulk insert operations such as array INSERT and INSERT … SELECT statements.
  • A potentially significant negative performance impact for UPDATE operations that select more than a small subset of rows to be updated, as described by the first 2 bullets. Vantage does update operations as a DELETE operation followed by an INSERT operation; therefore, all the columns of rows selected for UPDATE operations must be accessed.

There is a range of from 1 or more orders of magnitude of improved performance to 1 or more orders of magnitude of degraded performance for accessing and updating column-partitioned tables.

The greatest improvement occurs when there is a highly selective predicate on a column in a single-column partition of a table with hundreds or thousands of columns and only a few columns are projected.

The worst case is when a query is not very selective, most or all of the columns in the table are projected, there are thousands of column partitions, only 8 available column partition contexts, and the table is row-partitioned in a way that there are very few rows in a populated partition.