15.00 - Performance Issues for Column-Partitioned Tables - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

Teradata Database
Programming Reference

Performance Issues for Column‑Partitioned Tables

Column partitioning is a physical database design choice that can improve the performance of some classes of workloads. Columnar partitioning is expected to have the following performance impacts.

  • You should expect a significant I/O performance improvement for queries that access a variable small subset of the columns either in predicates or as projections, and rows of a column-partitioned table.
  • For example, if 20% of the data in rows is needed to satisfy a query, the I/O for a column‑partitioned table should be approximately 20% of the I/O for a table without column partitioning.

    I/O might be further reduced depending on the effectiveness of autocompression. Additional I/O might be needed to reconstruct rows if many columns are projected or used in predicates.

  • A potentially significant negative performance impact can occur for requests that access more than a small subset of the columns, rows, or both of a column-partitioned table.
  • CPU usage might increase for handling autocompression, decompression, and containers. With a reduction in I/O and a possible increase in CPU use, workloads can change from being I/O‑bound to being CPU‑bound.
  • The performance of CPU‑bound workloads might not improve with column partitioning.

  • A potentially significant negative performance impact on INSERT operations for a column-partitioned table or join index, especially for single-row INSERT operations, but less so for bulk insert operations such as array INSERT and INSERT … SELECT requests.
  • A potentially significant negative performance impact for UPDATE operations that select more than a small subset of rows to be updated, as described by the first 2 bullets. Teradata Database does update operations as a DELETE operation followed by an INSERT operation; therefore, all the columns of rows selected for UPDATE operations must be accessed.
  • There is a range of from 1 or more orders of magnitude of improved performance to 1 or more orders of magnitude of degraded performance for accessing and updating column‑partitioned tables.

    The greatest improvement occurs when there is a highly selective predicate on a column in a single‑column partition of a table with hundreds or thousands of columns and only a few columns are projected.

    The worst case is when a query is not very selective, most or all of the columns in the table are projected, there are thousands of column partitions, only 8 available column partition contexts, and the table is row‑partitioned in a way that there are very few rows in a populated partition.