Column Partitioning | Database Design | Teradata Vantage - Column Partitioning - Advanced SQL Engine

Column Partitioning | Database Design | Teradata Vantage - Column Partitioning - Advanced SQL Engine - Teradata Database

Database Design

Product

Advanced SQL Engine

Teradata Database

Release Number

17.05

17.00

Published

June 2020

Language

English (United States)

Last Update

2021-01-22

dita:mapPath

qby1588121512748.ditamap

dita:ditavalPath

lze1555437562152.ditaval

dita:id

B035-1094

lifecycle

Product Category

Teradata Vantage™

This topic describes the structure of column-partitioned tables and join indexes.

When autocompression methods are described, they are described only to indicate how their use affects the content and size of the row header. Because Teradata Database determines whether to use autocompression and which autocompression methods to use for a container, they are not documented any further in the Teradata Database documentation library.

Options for Storing Data on an AMP

The following graphic shows four options for storing a table in an AMP, where the darkened row for each table represents a table header, one on data for each AMP, for a physical table. The table header defines the layout and other information about the physical table.

Moving from left to right, the structures represent the layouts of four different methods of storage.

Traditional Teradata Database row storage without partitioning
A column-partitioned table with single-column containers (COLUMN format), one for each column
A column-partitioned table with single-column containers for columns A and B; multicolumn subrows (ROW format) for columns C and D
A column-partitioned table with single-column containers for columns A and B; multicolumn containers for columns C and D

About Column-Partitioned Tables

If a request requires multiple columns to return the requested result set, the Optimizer query plan includes assembling projected column values from selected table rows together to form the result rows. This can be combined with row partition elimination to further reduce the data that must be accessed to satisfy a query.

Column partitioning has various system-applied compression techniques available to it that can be employed automatically to reduce the storage requirements for a table. These compression methodologies can reduce the I/O required to process queries and, when combined with row partitioning, the number of compression opportunities might increase.

The following CREATE TABLE request creates a table definition with four column partitions, two of which are internal partitions that Teradata Database creates for all column-partitioned tables.

The theoretical maximum number of column partitions for this table is 65,534, including the two column partitions for internal use, and the maximum column partition number is 65,535. The difference is caused by there always being at least one unused column partition number available for altering a column partition.

However, this table can actually have no more than 2,050 column partitions because the total number of columns for a table cannot exceed 2,048. To have 2,050 column partitions, each user-defined column partition could have only one column.

CREATE TABLE sales_1 (
 (storeid        INTEGER NOT NULL,       /* Partition 1 */
  productid      INTEGER NOT NULL,
  salesdate      DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  totalrevenue   DECIMAL(13,2)),
  (totalsold     INTEGER,                /* Partition 2 */
  topsalesperson INTEGER,
  note   VARCHAR(356)))
PRIMARY AMP INDEX (storeid, productid) 
PARTITION BY COLUMN; -- Defines 2 multicolumn partitions plus 2
                     -- additional column partitions for internal
                     -- use.

The following example creates a table definition for a column-partitioned table with row partitioning.

The COLUMN specification in the PARTITION BY clause defines 7 column partitions: 5 column partitions based on the column grouping in the column definition list plus 2 additional column partitions for internal use.

The maximum number of column partitions is 779 including the 2 column partitions for internal use. This means that it is possible to add 770 column partitions to this table. The maximum column partition number is 780.

The RANGE_N function in the partitioning expression defines 48 row partitions with the option to add up to 36 row partitions for a maximum of 84 row partitions. The maximum row partition number is 84.

The maximum combined partition number for this table is 65,520, which is computed from the following product: 780*84. This is the product of the maximum partition number of each partitioning level. The table has 2-byte partitioning because the number of combined partitions is not greater than 65,535.

If you were to create this table using multicolumn partitions, you could add columns up to the maximum of 2,048. You could also specify the ADD option to specify a higher number of column partitions that could be added, but that would result in the table using 8-byte partitioning rather than 2-byte partitioning.

CREATE TABLE sales_2 (
       storeid        INTEGER NOT NULL,
       productid      INTEGER NOT NULL,
       salesdate      DATE FORMAT 'yyyy-mm-dd' NOT NULL,
       totalrevenue   DECIMAL(13,2),
      (totalsold      INTEGER,
       topsalesperson INTEGER,
       note            VARCHAR(256)))
     PRIMARY AMP INDEX (storeid, productid)
     PARTITION BY (COLUMN,  -- Defines 4 single-column partitions and
                            -- 1 multicolumn partition plus 2
                            -- additional column partitions for internal
                            -- use.
                   RANGE_N(salesdate BETWEEN DATE '2007-01-01'
                                     AND     DATE '2010-12-31'
                                     EACH INTERVAL '1' MONTH)
                   ADD 36);

You can use column partitioning to improve query performance through column partition elimination.

You can use row partitioning to improve query performance through row partition elimination, which reduces the need to access all of the rows in a table. Teradata Database uses efficient methods to reconstruct qualifying rows from the projected columns of the column partitions that are not eliminated.

Advantages of column partitioning include a simple CREATE TABLE syntax with default autocompression, the ability of the Optimizer to perform column partition elimination, and to facilitate efficient access data from column partitions. Teradata Database manages the column partitions so that the table appears to you as a single object.