- Queries that access a small, but variable, subset of columns run more efficiently against a column-partitioned table when compared to a nonpartitioned table. The majority of queries against a column-partitioned table select a variable subset of columns, and project a variable subset of the columns, where the subset accessed is less than 10% of the partitions for any specific query. The number of partitions that need to be accessed for queries must not exceed the number of available column partition contexts.
- Use column partitioning for tables that have the following properties:
- Accessed using analytic queries.
- Not frequently refreshed using bulk data loading methods.
- Table partitions are seldom modified in the interim.
- Do not use column-partitioned tables for highly volatile data.
- Place the COLUMN partitioning level at the lowest partitioning level possible, ideally at a lower level than any ROW partitioning levels in the partitioning expression. The COLUMN partitioning level must be the first partitioning level you specify.
Considerations that can lead to putting column partitioning at a lower level include the following:
- Potential improvements for cylinder migration.
- Temperature-based block compression effectiveness for both hot and cold data.
- For queries that frequently access columns, but where the specific set of columns accessed varies from request to request, put frequently accessed columns into single-column partitions.
This action enables the Optimizer to use column partition elimination to optimize queries against the table.
- For queries that frequently access the same set of columns across the requests in a workload, group the frequently accessed columns into the same partition.
- If you find that autocompression is effective for a column, consider placing that column in a single-column partition even if that column is not frequently accessed.
- Autocompression is most effective for single-column partitions with COLUMN format, less so for multicolumn partitions, particularly as the number of columns increases, and not effective for column partitions with ROW format.
- Group columns into a column partition for applications where either of the following is true.
- Queries frequently access the columns.
- Queries do not frequently access the columns, and autocompression of the individual columns or subsets of columns is not effective.
- Use COLUMN format for narrow column partitions, especially if you find that autocompression is effective for a partition.
If the system-determined format is not COLUMN for a column partition, but you determine that COLUMN is more appropriate, specify COLUMN explicitly when you create or alter the table.
You may need to specify COLUMN format explicitly for a column partition that has a column with a VARCHAR, CHARACTER SET VARGRAPHIC, or VARBYTE data type defined with a large maximum value, but where the partition values are relatively short in most cases.
This can happen when the system-determined format is ROW because of a large maximum value length.
- Use ROW format for wide column partitions, because this format has less overhead than a container that holds a small number of values.
If the system-determined format is not ROW for a column partition, but you determine that ROW is more appropriate, specify ROW explicitly when you create or alter the table.
- Accept the default DATABLOCKSIZE for a column-partitioned table unless performance analysis indicates otherwise.
For Object File System tables, DATABLOCKSIZE has a default setting that you cannot change.
- If you expect to add table rows incrementally, allocate additional free space if a column-partitioned table has small internal partitions. This can occur if a table is also row-partitioned.
To allocate additional space, you can specify a value for the FREESPACE option that is larger than the system default when you create the table.
For Object File System tables, FREESPACE has a default setting that you cannot change.
If you plan to load the table with rows using a large INSERT … SELECT statement, and the internal partitions for the table are either large or unpopulated, little or no free space is required.
The reserved free space enables table data to expand on current table cylinders, preventing or delaying the need for additional table cylinders to be allocated, which prevents or delays data migration associated with new cylinder allocations.
- Keeping new table data physically close to existing table data, and avoiding data migrations, can improve overall system performance.
The following usage guidelines apply to column-partitioned table applications.