Column partitioning is a physical database design choice that can improve the performance of some classes of workloads. Columnar partitioning is expected to have the following performance impacts.
- You should expect a significant I/O performance improvement for queries that access a variable small subset of the columns either in predicates or as projections, and rows of a column-partitioned table.
For example, if 20% of the data in rows is needed to satisfy a query, the I/O for a column-partitioned table should be approximately 20% of the I/O for a table without column partitioning.
I/O might be further reduced depending on the effectiveness of autocompression. Additional I/O might be needed to reconstruct rows if many columns are projected or used in predicates.
- A potentially significant negative performance impact can occur for requests that access more than a small subset of the columns, rows, or both of a column-partitioned table.
- CPU usage might increase for handling autocompression, decompression, and containers. With a reduction in I/O and a possible increase in CPU use, workloads can change from being I/O-bound to being CPU-bound.
The performance of CPU-bound workloads might not improve with column partitioning.
- A potentially significant negative performance impact on INSERT operations for a column-partitioned table or join index, especially for single-row INSERT operations, but less so for bulk insert operations such as array INSERT and INSERT … SELECT statements.
- A potentially significant negative performance impact for UPDATE operations that select more than a small subset of rows to be updated, as described by the first 2 bullets. Teradata Database does update operations as a DELETE operation followed by an INSERT operation; therefore, all the columns of rows selected for UPDATE operations must be accessed.
There is a range of from 1 or more orders of magnitude of improved performance to 1 or more orders of magnitude of degraded performance for accessing and updating column-partitioned tables.
The greatest improvement occurs when there is a highly selective predicate on a column in a single-column partition of a table with hundreds or thousands of columns and only a few columns are projected.
The worst case is when a query is not very selective, most or all of the columns in the table are projected, there are thousands of column partitions, only 8 available column partition contexts, and the table is row-partitioned in a way that there are very few rows in a populated partition.