15.00 - Example: Column-Partitioned Table With Autocompression - Teradata Database

Teradata Database Design

prodname
Teradata Database
vrm_release
15.00
category
User Guide
featnum
B035-1094-015K

Example: Column‑Partitioned Table With Autocompression

The following SQL text is the table definition for a column‑partitioned table with autocompression.

     CREATE TABLE orders (
       order_num   INTEGER NOT NULL,
       order_date  DATE NOT NULL,
       item_num    INTEGER,
       item_desc   CHARACTER(30) CHARACTER SET LATIN)
     NO PRIMARY INDEX
     PARTITION BY COLUMN AUTO COMPRESS;

AUTO COMPRESS can be omitted as long as the default settings have not been changed to NO AUTO COMPRESS. With the exception of the column partitions being autocompressed in this example, it is identical to the example used in “Example: Column‑Partitioned Table With Autocompression” on page 292.

A column field in the table takes no place in the column value list.

Values in italic typeface are in the local value list dictionary for the container.

The internal delete column partition, partition number 65,535 in this case because the table is only column-partitioned, is empty.

 

where:

 

Table element …

Specifies …

Partition Number

a 2‑byte or 8‑byte internal partition number indicating the column partition and row partitions for a container.

Bits in the flag byte of a physical row indicate the partitioning, which in this case is 2‑byte partitioning, and a non‑0 2‑byte part of the internal partition number exists in a physical row of this column‑partitioned table.

Hash Bucket

a 20‑bit (or 16‑bit) hash bucket value.

n

the lowest numbered hash bucket for this AMP from the NoPI hash map.

This value increases to the next hash bucket for this AMP if the uniqueness overflows.

Teradata Database does not compute n by hashing the values of any columns, but whatever AMP the row is randomly sent initially uses its lowest numbered hash bucket from the NoPI hash map.

Uniqueness

a uniqueness value.

  • The value is 48 bits for a 16‑bit hash bucket.
  • The value is 44 bits for a 20‑bit hash bucket.
  • If the uniqueness value overflows, it resets to 1.

    AC

    autocompression for containers of the column

    N

    a nullable column.

    ACT

    autocompression type.

    The internal delete column partition, column partition 65,535, is empty because the table is column‑partitioned.

    The following bullets explain the autocompression used for each container in this example.

  • First OrderNum container.
  • Teradata Database applied 2 autocompression techniques to this container.

  • The first two bits of each set of 4 autocompression bits are for R 2 (run length) autocompression, where the unsigned value of the two bits is the run length.
  • The second two bits are for TT01 2,0 (trim high-order bytes that are zero) autocompression.
  • The number of sets of bits is the same as the number of values in this container since nulls do not need to be represented by this container.

    This container represents 6 values, with the uniqueness for rowIDs starting at 1 and continuing up to 6. Because the first value, 100, has a run length of 3, the second value, 290, has a run length of 2, and the last value, 450, has a run length of 1.

    The TT01 bits for the first value, 100, means that the value 100 is stored in the container as a single byte, and the 3 high‑order 0 bytes are stripped.

    The TT01 bits for the second value, 290, means that the value is stored as two bytes, and the 2 high-order 0 bytes are stripped.

    The third value, 450, means the value is stored as 2 bytes, and the 2 high-order bytes are stripped, even though the data type is INTEGER which normally requires 4 bytes.

  • Second OrderNum container.
  • Teradata Database applied only TT01 2,0 autocompression to this container because it has no runs.

    There is one set of bits for each value, but in this case, there are only 2 bits per value because run length bits are not included. The container represents a sequence of 5 values, with the uniqueness value for rowIDs starting at 7 as indicated by the rowID of the container and continuing up to 11.

    Each of the 5 values is stored as 2 bytes and the 2 high-order 0 bytes are stripped from each value.

  • First OrderDate container.
  • Teradata Database applied R 2 (run length) autocompression to this container.

    The number of sets of bits is the same as the number of values in this container because it does not need to represent nulls.

    The container represents 6 values, with the uniqueness value for rowIDs starting at 1 and continuing up to 6. Because the first value, 07-29-2009, has a run length of 3, the second value, 08-04-2009, has a run length of 2, and the last value, 09-01-2009, has a run length of 1.

  • Second OrderDate container.
  • Teradata Database applied PV 3,0,1 (single null and local value list) autocompression to this container.

    In this case, there are 3 bits for each value represented, but only 3 values and 1 local value list dictionary value are stored in the container.

    The container represents a sequence of 5 values, with the uniqueness value for rowIDs starting at 7 and continuing up to 11. The autocompression bits for the first 2 values indicate that their values are not present and to use the first entry in the local value list dictionary as their values.

    The autocompression bits for the next 3 values indicate the value is present, not compressed, and an index to the value in the list of values.

  • First ItemNum container.
  • Teradata Database applied P (single null) autocompression to this container.

    There is one bit for each of the 6 values represented, but only 5 values in the container because ItemNum is null for the fifth row of the table and a value is not present.

    Note that a null takes no space even though COMPRESS NULL is not specified because autocompression is applied. It is also possible that space for the column partition value could be in the container even if COMPRESS is specified because the system determines the compression used when autocompression is in effect.

  • Second ItemNum container.
  • Teradata Database applied PV 3,0,1 (single null and local value list) autocompression to this container.

    In this case, there are 3 bits for each of the 5 values represented but only 2 values and 1 local value list dictionary value are stored in the container.

    This container represents a sequence of 5 values (with the uniqueness for rowIDs starting at 7 and continuing to 11). The autocompression bits for the first and third values indicate that the values are present (437 and 815, respectively), not compressed, and that there is an index to the value in the list of values.

    The bits for the second value indicate that it is null and not present. The bits for the fourth and fifth values indicate that their values are not present and that Teradata Database should use the first entry in the local value list dictionary as their values.

  • First ItemDesc container.
  • Teradata Database applied TL01 4,' ' (trim trailing single‑byte space for fixed-length data type) autocompression to this container.

    In this case, there are 4 bits for each of the 5 values represented and 5 values (stripped of trailing spaces) in the container.

    The 4 bits for a value indicate its length after trimming trailing spaces. The uncompressed value can be obtained from the value in the container by adding trailing spaces as needed to a fixed length of 30. In this case, the first container represents only 5 values, but the total number of values represented by both containers of ItemDesc is 11, the same as the total for the other columns.

  • Second ItemDesc container.
  • Teradata Database applied PV 3,0,1 (single null and local value list) and TL01 4,' ' (trim trailing single‑byte space for fixed‑length data type) autocompression to this container.

    In this case, there are 7 bits for each of the 6 values represented, but only 3 values and 1 local value list dictionary value are stored in the container.

    The first 3 bits indicate whether the value is null (00) and not present, is not present and to use the entry in the local value-list dictionary (01) as its value, or present (010, 011, 100, 110, 111 indicating an index in the list of values). The other 4 bits indicate the length of the value, if present, after trimming trailing spaces. The length is 0000 if the field is null, so it is ignored.

    This container represents a sequence of values, with the uniqueness for rowIDs starting at 6 as indicated by the rowID of the container and continuing to 11. The autocompression bits for the first and fourth values indicate that their values (Light Bulb) are not present and to use the entry in the local value list dictionary as their values.

    The bits for the second, third, and sixth values (1-inch Nails, Drill, and Hammer, respectively) indicate that the values are present, but stripped of trailing spaces. The bits for the fifth value indicate that it is null and not present.

    The stripped values can be extended with spaces to their fixed length of 30 when decompressing the values.

    The types of autocompression that Teradata Database chooses to apply to the containers for a table are chosen on a container by container basis, as this example demonstrates. Teradata Database could just as easily apply other autocompression types, depending on the context for each container to which autocompression was applied. Optionally, some column partitions can be set for AUTO COMPRESS and some for NO AUTO COMPRESS by specifying AUTO COMPRESS or NO AUTO COMPRESS as appropriate for each column partition to override the default.