15.00 - Example: Column-Partitioned Table With No Autocompression - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Example: Column‑Partitioned Table With No Autocompression

The following SQL text is the table definition for the orders column‑partitioned table with no autocompression.

     CREATE TABLE orders (
       (order_num   INTEGER NOT NULL),
       (order_date  DATE NOT NULL),
       (item_num    INTEGER COMPRESS NULL),
       (item_desc   CHARACTER(30) COMPRESS NULL CHARACTER SET LATIN) 
                    )
     NO PRIMARY INDEX
     PARTITION BY COLUMN NO AUTO COMPRESS;

NO AUTO COMPRESS can be omitted if the default settings are changed to NO AUTO COMPRESS. Because a column‑partitioned table is also a multiset table, Teradata Database does not do any duplicate row checks. Column‑partitioned tables are also similar to nonpartitioned NoPI tables in that Teradata Database randomly distributes rows or blocks of rows to the AMPs or are copied locally as is done for an INSERT … SELECT request instead of being hash‑distributed as is done for a table with a primary index. Teradata Database appends each column partition value of a row for an AMP to the column partition or to the combined partition if multilevel partitioning is being used on that AMP.

This example examines a column‑partitioned table with four column partitions (plus two for internal use) with one column per user‑specified partition. The user‑specified column partitions are not autocompressed and they all have COLUMN format.

COLUMN format is a series of consecutive column partition values stored in a container, where the rowID of the container is the standard rowID consisting of an internal partition number, a hash bucket number, and a uniqueness value associated with the first column partition value in the container. The internal partition number in this case represents the column and row partition numbers, if any.

The rowIDs associated with subsequent column partition values are implied by their position in the container. The rowIDs associated with the column partition values in the container, which are explicit for the first and implied for the remainder, have the same internal partition number and hash bucket, and the uniqueness increases by one for each column partition value represented by the container.

The file system stores a container based on the rowID in its row header as is currently done for a physical row representing a row in a table that is not column‑partitioned.

A column partition has as many containers as needed to hold all of the column partition values of the table columns included in that partition. Teradata Database stores the column partition values in the inserted row order within a container, and the containers are in row‑insertion order within a combined partition. To simplify the example, only two containers are shown for each of the column partitions, and each container only represents a small number of column-partition values.

Containers for different column partitions can have a different number of values represented per container. In this example, the first container for item_desc represents 5 column values, while the first container for each of the other columns represent 6 column values. The second container for item_desc starts at uniqueness value 6, while the second container for each of the other columns starts at uniqueness value 7.

Because no rows are deleted, the internal delete column partition is empty. Note that the first uniqueness value for a container plus the number of column partition values represented by that container does not equal the first uniqueness value of the next container if the intervening containers are deleted or if the internal partition number or hash bucket changes for the next container.

A null column in the table takes no place in the column value list.

The internal delete column partition, partition number 65,535 is empty in this case because the table is column-partitioned.

The table header and user data for this example would look something like this.

 

where:

 

Table element …

Specifies …

Partition Number

a 2‑byte or 8‑byte internal partition number indicating the column partition and row partitions for a container.

Bits in the flag byte of a physical row indicate the partitioning, which in this case is 2‑byte partitioning, and a non‑0 2‑byte part of the internal partition number exists in a physical row of this column‑partitioned table.

Hash Bucket

a 20‑bit (or 16‑bit) hash bucket value.

n

the lowest numbered hash bucket for this AMP from the NoPI hash map.

This value increases to the next hash bucket for this AMP if the uniqueness overflows.

Teradata Database does not compute n by hashing the values of any columns, but whatever AMP the row is randomly sent initially uses its lowest numbered hash bucket from the NoPI hash map.

Uniqueness

a uniqueness value.

  • The value is 48 bits for a 16‑bit hash bucket.
  • The value is 44 bits for a 20‑bit hash bucket.
  • If the uniqueness value overflows, it resets to 1.

    NAC

    no autocompression.

    N

    a nullable column with COMPRESS NULL.

    Presence Bits

    null compression with a presence indicator bit.

    If the presence bit is 1, a value is present; otherwise, no value is present. This means the field is null, and the column takes no space because COMPRESS NULL is specified.

    No presence bits exist if the column is defined as NOT NULL.

    In the example, only 2 presence bits are shown per row because only 2 columns in the table, item_num and item_desc, are nullable.