15.00 - Containers and Subrows - Teradata Database

Teradata Database Design

prodname: Teradata Database
vrm_release: 15.00
category: User Guide
featnum: B035-1094-015K

Containers and Subrows

The data in a column‑partitioned table can be stored in containers or subrows. The following table defines these terms.

Containers and Space

If Teradata Database can pack many column partition values into a container, this form of compression, called row header compression, can reduce the space needed for a column‑partitioned table or join index compared to the same object without column partitioning.

If Teradata Database can place only a few column partition values in a container because of their width, there can actually be a small increase in the space needed for a column‑partitioned table or join index compared to the same object without column partitioning. In this case, ROW format may be more appropriate.

If there are only a few column partition values per combined partition (which can happen with row partitioning) and there are many column partitions, there can be a very large increase in the space needed for a column‑partitioned object compared to the same object without column partitioning. In the worst case, the space required for a table can increase by nearly 24 times.

In this case, consider making one of the following changes:

  • Alter the column or row partitioning to allow for more column partition values per combined partition.
  • Remove the column partitioning from the table or join index.
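
The space effects described above come down to how many column partition values share one container row header. The following is a minimal sketch of that amortization, assuming a simplified model that counts only the row header (14 or 20 bytes for a container, as stated in this section) and ignores offsets, free space, and autocompression. The function name is hypothetical.

```python
def header_overhead_per_value(header_bytes: int, values_per_container: int) -> float:
    """Bytes of container row header attributed to each column partition value.

    Simplified model: only the row header is counted.
    """
    if values_per_container < 1:
        raise ValueError("a container holds at least one column partition value")
    return header_bytes / values_per_container


if __name__ == "__main__":
    # Few values per container: the header dominates. Many values: it nearly vanishes.
    for n in (1, 10, 1000):
        print(n, header_overhead_per_value(14, n))
```

With one value per container the full header is paid for every value, which is why sparse combined partitions with many column partitions use so much more space.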

Container Row Contents

    A container row must have the same internal partition number and hash bucket for all the column partition values in that container row.

    Following are the contents of a container:

  • The row header for a container indicates its internal partition number, hash bucket, and uniqueness value for the first column partition value. The row header is the same as for any other physical row, including physical row length, a rowID, flag byte, and first presence byte. The row header for a container is either 14 or 20 bytes long.
  • A container has only one row header, which uses the rowID of the first column partition value as the rowID of the container. There is no row header for each column partition value, as there is for all other row types in Teradata Database. You can determine the rowID of a column partition value by its position within the container.

  • The first presence byte for a container is not used as a presence byte. It indicates whether autocompression types have been determined for the container.
  • If the selected autocompression types, which might include applying user‑specified compression, do not reduce the size of the container row, the AC bit in the first presence byte in the row header is set to 0 and the container row is not autocompressed.

  • The number of column partition values represented by the container, including values for logically deleted rows, and various offsets to the sections of the container.
  • An optional series of column partition values for the local value list compression dictionary. This series is preceded by arrays of offsets if the values are variable length.

  • A series of fixed‑length column partition values or a series of variable‑length column partition values, each prefixed by a 1-byte or 2-byte length. The series can be empty if the autocompression bits do not indicate that any column partition values are present. If this occurs, the column partition values have all been compressed.

  • 0 or more bytes of free space.
  • Optional autocompression presence bits, value length compression bits, algorithmic compression bits, and run length bits, depending on the autocompression techniques used in the container row. These bits are stored in reverse order of the series of values.

    A column container should have thousands of column values for fixed‑length and short variable‑length data types unless the table is overly row‑partitioned. The values and presence bits grow closer to one another as they consume the available free space. If there is insufficient free space, Teradata Database expands the row.
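
The container contents above state that the rowID of a column partition value follows from its position within the container. The following is a minimal sketch of one way that could work. It assumes, and this is an assumption rather than something this section states, that the uniqueness value advances by one for each position while the internal partition number and hash bucket stay the same. The `RowID` type and `rowid_at` function are hypothetical names.

```python
from typing import NamedTuple


class RowID(NamedTuple):
    partition: int    # internal partition number
    hash_bucket: int
    uniqueness: int


def rowid_at(container_rowid: RowID, position: int) -> RowID:
    """RowID of the value at a 0-based position within a container.

    ASSUMPTION: uniqueness increases by one per position; the partition
    number and hash bucket are shared by every value in the container.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    return container_rowid._replace(uniqueness=container_rowid.uniqueness + position)


if __name__ == "__main__":
    first = RowID(partition=3, hash_bucket=7, uniqueness=100)
    print(rowid_at(first, 5))
```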

    For a single‑column partition that has COLUMN format (that is, its physical rows are containers), a column partition value is the same as a column value and can have either a fixed or a variable length.

    For a multicolumn partition with COLUMN format, a column partition value has a structure similar to a regular row, containing presence and compression bits as 0 or more bytes, offsets to variable‑length column values, the values of its fixed‑length columns, the values of its uncompressed columns, and the values of its variable‑length columns.

    A container does not include:

  • A row length because the length of a column partition value is handled separately
  • An internal partition number
  • A hash bucket
  • A uniqueness value
  • Presence or compression bits
  • First presence bit

    Teradata Database applies user‑specified compression within the multicolumn column partition value and might apply autocompression to the column partition value as a whole.

    Column partition values for a single‑column or multicolumn column partition can have either fixed or variable length. If the column partition value has variable length, the length is not part of the column partition value. Instead, the length is specified in the container row by a preceding length field or by using the difference between offsets if the column partition value is in the local value-list dictionary.

    A container includes other information such as offsets to the beginning of the series of column partition values.
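
The two ways of recovering the length of a variable‑length column partition value can be sketched as follows. This is an illustration only: the byte order of a 2-byte length field and the presence of a final end offset in the offsets array are assumptions, and the function names are hypothetical.

```python
from typing import List


def read_length_prefixed(buf: bytes, count: int, prefix_bytes: int = 1) -> List[bytes]:
    """Read `count` values, each preceded by a 1-byte or 2-byte length field."""
    values, pos = [], 0
    for _ in range(count):
        # ASSUMPTION: little-endian length field (irrelevant for 1-byte prefixes).
        length = int.from_bytes(buf[pos:pos + prefix_bytes], "little")
        pos += prefix_bytes
        values.append(buf[pos:pos + length])
        pos += length
    return values


def read_by_offsets(buf: bytes, offsets: List[int]) -> List[bytes]:
    """Read dictionary values whose lengths are differences between offsets.

    ASSUMPTION: `offsets` carries one extra entry marking the end of the
    last value, so value i spans offsets[i] to offsets[i + 1].
    """
    return [buf[start:end] for start, end in zip(offsets, offsets[1:])]


if __name__ == "__main__":
    print(read_length_prefixed(b"\x02ab\x01c", count=2))
    print(read_by_offsets(b"abc", [0, 2, 3]))
```

In both cases the length lives in the container row rather than inside the column partition value itself.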

    Containers and Autocompression

    A container with autocompression includes:

  • 2 bytes are used as an offset to the compression bits.
  • 1 or more bytes indicate the autocompression types and their arguments for the container.
  • 1 or more bytes of autocompression bits, depending on the number of column partition values and the autocompression type.
  • 0 or more bytes are used for a local value-list dictionary.
  • 0 or more bytes are used for present column partition values.

    A container without autocompression uses 0 or more bytes for present column partition values. A container can exist without autocompression because:

  • You specified NO AUTO COMPRESS for the column partition when you created the table or join index.
  • No autocompression types are applicable for the column partition values of the container.

    Note: Whether a column partitioning level defaults to AUTO COMPRESS or NO AUTO COMPRESS depends on the setting of the AutoCompressDefault cost profile. See SQL Request and Transaction Processing for further information about AutoCompressDefault.
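
The conditions under which a container ends up without autocompression can be summarized in a small decision function. This is a sketch of the rules stated in this section, not the actual selection logic: the function name and the idea of passing candidate sizes as a list are assumptions made for illustration.

```python
from typing import List


def store_autocompressed(auto_compress: bool, raw_size: int,
                         candidate_sizes: List[int]) -> bool:
    """Decide whether a container row is stored autocompressed.

    auto_compress   -- False if NO AUTO COMPRESS applies to the column partition
    raw_size        -- container row size without autocompression
    candidate_sizes -- sizes produced by the applicable autocompression
                       choices (empty if none are applicable)
    """
    if not auto_compress:
        return False      # NO AUTO COMPRESS was specified
    if not candidate_sizes:
        return False      # no autocompression types are applicable
    # Autocompression is kept only if it actually reduces the row size.
    return min(candidate_sizes) < raw_size


if __name__ == "__main__":
    print(store_autocompressed(True, 1000, [700, 1200]))
    print(store_autocompressed(True, 1000, [1000]))
```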

    Row Structure for Containers (COLUMN Format)

    Teradata Database packs column partition values into a container up to a system-determined limit and then packs the next set of column partition values into a new container. The column partition values within a container must be in the same combined partition to be packed into that container.
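
The packing rule can be sketched as follows. For simplicity the sketch uses a count of values as the limit, whereas the real limit is system-determined, and it assumes the input is already ordered by combined partition. The function name `pack` is hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def pack(values: Iterable[Tuple[int, str]], limit: int) -> List[Tuple[int, List[str]]]:
    """Pack (combined_partition, value) pairs into containers.

    A container holds values from one combined partition only, and a new
    container is started once the current one reaches `limit` values.
    ASSUMPTION: `values` is ordered by combined partition.
    """
    containers = []
    for partition, group in groupby(values, key=lambda pv: pv[0]):
        current: List[str] = []
        for _, value in group:
            if len(current) == limit:
                containers.append((partition, current))
                current = []
            current.append(value)
        if current:
            containers.append((partition, current))
    return containers


if __name__ == "__main__":
    rows = [(1, "a"), (1, "b"), (1, "c"), (2, "d")]
    print(pack(rows, limit=2))
```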

    A column partition represents one or more columns of a table. A column partition that has COLUMN format is represented as a series of containers that hold the column partition values of the column partition.

    A container consists of a header and column partition values followed by autocompression bits at the end, with free space in between. The free space is allocated in such a way that column partition values and autocompression bits can grow toward each other using the free space without moving the values around or changing the row size.
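
The two-ended layout can be illustrated with a small buffer in which values grow forward from the front and autocompression bits grow backward from the end. This is a simplified sketch: it omits the header, and it stores autocompression bits as whole bytes, which is an assumption made to keep the example short. The class name is hypothetical.

```python
class ContainerBody:
    """Fixed-size area: values at the front, autocompression bits at the end."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.values_end = 0      # offset of the first byte of free space
        self.bits_start = size   # offset of the first byte of autocompression bits

    @property
    def free(self) -> int:
        return self.bits_start - self.values_end

    def append(self, value: bytes, ac_bits: bytes = b"") -> bool:
        """Add a value and its bits; return False if free space is insufficient."""
        if len(value) + len(ac_bits) > self.free:
            return False
        self.buf[self.values_end:self.values_end + len(value)] = value
        self.values_end += len(value)
        # Bits are written in reverse order of the values, growing toward them.
        self.bits_start -= len(ac_bits)
        self.buf[self.bits_start:self.bits_start + len(ac_bits)] = ac_bits
        return True


if __name__ == "__main__":
    body = ContainerBody(8)
    body.append(b"ab", b"\x01")
    body.append(b"cdef", b"\x02")
    print(bytes(body.buf), body.free)
```

Neither append moves existing values or changes the buffer size, which is the property the free space in the middle is meant to provide.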

    A newly constructed container in memory starts at the maximum allowed size and therefore has a large amount of free space. Once it either fills up or there are no more column partition values to add for the current DML request, the container can be reduced in size as described later, and it is then written to disk. If there are still more column partition values to add for this request, Teradata Database starts a new container in memory.

    When a subsequent DML request appends new column partition values to a combined partition that already has a last container, that container might not have enough free space for the next new column partition value. If the container was reduced in size when it was last written, as described later, Teradata Database expands the container in memory to its maximum allowable size if doing so enables a column partition value to be added. If the container becomes full, it can be reduced in size as described later, and it is then written to disk. The process begins again with a new container.

    A container row is at its maximum size in memory either because it is a new container or because it is a last container that was read from disk and expanded to the maximum size when it ran out of free space. Before writing such a container when the DML request has no more column partition values to add to it, Teradata Database checks its free space. If the container still has enough free space to hold more column partition values, but that free space exceeds a system‑determined percentage of the container's size without the free space, Teradata Database reduces the free space to be within this percentage.
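
The free space reduction can be expressed as a simple cap. This is a minimal sketch: the percentage is system-determined and not given in this section, so the value used in the usage lines is purely hypothetical, as is the function name.

```python
def trimmed_free_space(used_bytes: int, free_bytes: int, max_free_pct: int) -> int:
    """Free space to keep when writing a container that is not yet full.

    The cap is a percentage of the container size without its free space.
    """
    cap = used_bytes * max_free_pct // 100
    return min(free_bytes, cap)


if __name__ == "__main__":
    # Hypothetical 10 percent cap: 500 free bytes are cut back to 100.
    print(trimmed_free_space(used_bytes=1000, free_bytes=500, max_free_pct=10))
    # Free space already within the cap is left alone.
    print(trimmed_free_space(used_bytes=1000, free_bytes=50, max_free_pct=10))
```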

    Free space can occur in the last container of each combined partition. If there are many combined partitions, the total free space can add up to a great deal of unused disk space, so it is desirable to keep this total to a small percentage of the table or join index size. However, a reasonable amount of free space in the last container minimizes how often a container must be copied to a larger memory area to accommodate new incoming column partition values. This matters for requests that insert a small number of column partition values into a combined partition at a time, such as an array INSERT, a small INSERT … SELECT request, or a large INSERT … SELECT request into a row‑partitioned, column‑partitioned table or join index. Most often, the container can then be written with the same physical row size after the insert operation, which is more efficient than if the physical row size changes.

    When writing a container row at the point where it can no longer hold additional column partition values, necessitating that a new container be started, the free space for the container is reduced to zero or nearly zero. The last container does not necessarily need to be written if it was just read and there is insufficient free space to add another column partition value but the remaining free space is small. However, if a container has too much remaining free space, Teradata Database must remove the free space, and the last container row must be written.

    This does not apply to a container that is for the delete column partition (see “Container Row for the Delete Column Partition”) and is autocompressed.

    Container Row for the Delete Column Partition

    A container row for the delete column partition has fixed-length, single-column, BYTEINT NOT NULL column partition values. This data is constrained to have the values 0 or 1. The container row has the same layout as a fixed‑length container (see “Row Structure for Fixed-Length Containers”).

    Row Structure for Fixed-Length Containers

    This row structure applies to both fixed‑length single‑column and multicolumn column partitions unless autocompression compresses the data as variable length.

    Offsets in the container row are relative to its beginning.

     

                   Term

                                                             Description

    Row Length

    Length of the row in bytes.

    H0

    First row hash field value for the container.

    H1

    Second row hash field value for the container.

    U0

    First uniqueness field value for the container.

    U1

    Second uniqueness field value for the container.

    Flag Bits

    Two bits indicate whether this container requires 2 bytes or 8 bytes to store its maximum number of combined partitions.

    Presence Bits Array

    Contains autocompression flags in a container.

    Internal Partition Number

    Internal partition number for the container.

  • The field is 2 bytes long if the maximum combined partition number for the container is ≤ 65,535.
  • The field is 8 bytes long if the maximum combined partition number for the container is > 65,535.

    CP Values Count

    Number of column partition values represented by this container.

    Offset to Free Space

    Offset to the first byte of free space.

    Offset to Last AC Bits

    Offset to the last byte of autocompression bits.

    ACT Count

    Count of the number of types of autocompression applied to the container.

    If the autocompression bit is not set in the first presence byte, this field is omitted from the row header.

    ACT1

    First autocompression type applied to this container.

    If the autocompression bit is not set in the first presence byte, this field is omitted from the row header.

    Arg11

    First argument for this autocompression type.

    If the autocompression bit is not set in the first presence byte, this field is omitted from the row header.

    Arg1n

    nth argument for autocompression type 1.

    Arg1a

    ath argument for autocompression type 1.

    ACTt

    tth autocompression type applied to this container.

    Argt1

    First argument for autocompression type t.

    Argta

    ath argument for autocompression type t.

    Offset to LVLC1

    Length of LVLC1 is offset in the next two bytes.

    Offset to LVLC1 if the LVLC bit is set in the first presence byte.

    Offset to LVLCn

    Length of LVLCn is offset in the next two bytes.

    Offset to LVLCn if the LVLC bit is set in the first presence byte.

    Offset to LVLCo

    Length of LVLCo is offset in the next two bytes.

    Offset to LVLCo

    An ACT argument indicates how many column partition values are in the local value list compression dictionary if the LVLC bit is set in the first presence byte.

    o is an argument to an autocompression type that specifies LVLC.

    Offset to CP Values

    Offset to the first column partition value.

    The offset to this field is 2*o + size of the autocompression types and their arguments + the offset to OffsetToLastACBits + 2, where o is an argument of an autocompression type indicating that there is LVLC.

    LVLC1

    First column partition value for the local value list compression dictionary if the LVLC bit is set in the first presence byte.

    LVLCn

    nth column partition value for the local value list compression dictionary if the LVLC bit is set in the first presence byte.

    LVLCo

    Last column partition value for the local value list compression dictionary if the LVLC bit is set in the first presence byte.

    Length of CPValue1

    If a single‑column partition, the maximum length is in the field5 field descriptor.

    If a multicolumn partition, the maximum length is in the multicolumn partition descriptor.

  • The field is 1 byte long if the maximum length is ≤ 255.
  • The field is 2 bytes long if the maximum length is > 255.

    CP Value1

    First present column partition value.

    The field is n bytes long, where n is the number of bytes in CPValue1.

    Length of CPValuen

    If a single‑column partition, the maximum length is in the field5 field descriptor.

    If a multicolumn partition, the maximum length is in the multicolumn partition descriptor.

  • The field is 1 byte long if the maximum length is ≤ 255.
  • The field is 2 bytes long if the maximum length is > 255.

    CP Valuen

    nth present column partition value.

    The field is n bytes long, where n is the number of bytes in CPValuen.

    Length of CPValuev

    If a single‑column partition, the maximum length is in the field5 field descriptor.

    If a multicolumn partition, the maximum length is in the multicolumn partition descriptor.

  • The field is 1 byte long if the maximum length is ≤ 255.
  • The field is 2 bytes long if the maximum length is > 255.

    CP Valuev

    Last present column partition value.

    Free Space

  • 1 if AC bit is set or ACTBD is set and it is a nullable single‑column partition.
  • 2 otherwise.

    Free AC Bits

  • Only included if the autocompression bit is set in the first presence byte.
  • Otherwise set to 0.

    AC Bitsj

    Last set of autocompression bits.

    k bits, where k is the number of bits needed for compressing a column partition value per the autocompression types and their arguments.

  • Last set of autocompression bits.
  • Only included if the autocompression bit is set in the first presence byte.

  • Otherwise set to 0.

    AC Bitsk

    kth set of autocompression bits.

    k bits, where k is the number of bits needed for compressing a column partition value per the autocompression types and their arguments.

  • More sets of autocompression bits.
  • Only included if the autocompression bit is set in the first presence byte.

  • Otherwise set to 0.

    AC Bits1

    First set of autocompression bits.

    k bits, where k is the number of bits needed for compressing a column partition value per the autocompression types and their arguments.

  • First set of autocompression bits.
  • Only included if the autocompression bit is set in the first presence byte.

  • Otherwise set to 0.
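
Two of the field widths in the table above depend on a maximum value. The following sketch captures those rules, reading "65,535" and "255" as upper bounds for the shorter width, which is an interpretation of the table rather than a quotation from it. The function names are hypothetical.

```python
def partition_number_bytes(max_combined_partition: int) -> int:
    """Width of the Internal Partition Number field in a container."""
    return 2 if max_combined_partition <= 65535 else 8


def length_prefix_bytes(max_value_length: int) -> int:
    """Width of a 'Length of CPValue' field."""
    return 1 if max_value_length <= 255 else 2


if __name__ == "__main__":
    print(partition_number_bytes(65535), partition_number_bytes(65536))
    print(length_prefix_bytes(255), length_prefix_bytes(256))
```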

    Row Structure for Variable-Length Containers

    This row structure applies to both variable‑length single‑column and variable‑length multicolumn column partitions. The same structure is also used for fixed‑length single‑column and multicolumn column partitions if autocompression compresses the data as variable length for a container. Offsets in a container are relative to its beginning.

    where:

     

                   Term

                                                             Description

    Row Length

    Length of the row in bytes.

    H0

    First row hash field value for the container.

    H1

    Second row hash field value for the container.

    U0

    First uniqueness field value for the container.

    U1

    Second uniqueness field value for the container.

    Flag bits

    Two bits indicate whether this container requires 2 bytes or 8 bytes to store its maximum number of partitions.

    Presence bits array

    Used for autocompression flags in a container row.

    Internal Partition Number

    Internal partition number for the container.

  • The field is 2 bytes long if the maximum combined partition number for the container is ≤ 65,535.
  • The field is 8 bytes long if the maximum combined partition number for the container is > 65,535.

    CP Values Count

    Number of column partition values represented by this container.

    Offset to Free Space

    Offset to the first byte of free space.

    Offset to Last AC Bits

    Offset to the last byte of autocompression bits.

    ACT Count

    Count of the number of types of autocompression applied to the row.

    If the autocompression bit is not set in the first presence byte, this field is omitted from the row header.

    ACT1

    First autocompression type applied to this container.

    If the autocompression bit is not set in the first presence byte, this field is omitted from the row header.

    Arg11

    First argument for this autocompression type.

    If the autocompression type has no arguments, this field is omitted from the row header.

    Arg1n

    nth argument for autocompression type 1.

    Arg1a

    ath argument for autocompression type 1.

    ACTt

    tth autocompression type applied to this container.

    Argt1

    First argument for autocompression type t.

    Argta

    ath argument for autocompression type t.

    Offset to LVLC1

    Length of LVLC1 is offset in next two bytes.

    Offset to LVLC1 if LVLC bit is set in first presence byte.

    Offset to LVLCn

    Length of LVLCn is offset in next two bytes.

    Offset to LVLCn if LVLC bit is set in first presence byte.

    Offset to LVLCo

    Length of LVLCo is offset in next two bytes.

    Offset to LVLCo.

    An ACT argument indicates how many CPValues are in the local VLC dictionary if the LVLC bit is set in 1st presence byte. o is an arg to an ACT specifying LVLC.

    Offset to CP Values

    Offset to first CPValue.

    The offset to this field is 2*o + size of the ACTs and their arguments + offset to OffsetToLastACBits + 2, where o is an argument of an ACT indicating there is LVLC.

    LVLC1

    First CPValue for local VLC dictionary if LVLC bit is set in the first presence byte.

    LVLCn

    nth CPValue for local VLC dictionary if LVLC bit is set in the first presence byte.

    LVLCo

    Last CPValue for local VLC if LVLC bit is set in the first presence byte.

    o is an argument to an ACT specifying LVLC.

    Length of CPValue1

    If a single‑column partition, the maximum length is in the field5 field descriptor.

    If a multicolumn partition, the maximum length is in the multicolumn partition descriptor.

  • The field is 1 byte long if the maximum length is ≤ 255.
  • The field is 2 bytes long if the maximum length is > 255.

    CP Value1

    First present column partition value.

    The field is n bytes long, where n is the number of bytes in CPValue1.

    Length of CPValuen

    If a single‑column partition, the maximum length is in the field5 field descriptor.

    If a multicolumn partition, the maximum length is in the multicolumn partition descriptor.

  • The field is 1 byte long if the maximum length is ≤ 255.
  • The field is 2 bytes long if the maximum length is > 255.

    CP Valuen

    nth present column partition value.

    The field is n bytes long, where n is the number of bytes in CPValuen.

    Length of CPValuev

    If a single‑column partition, the maximum length is in the field5 field descriptor.

    If a multicolumn partition, the maximum length is in the multicolumn partition descriptor.

  • The field is 1 byte long if the maximum length is ≤ 255.
  • The field is 2 bytes long if the maximum length is > 255.

    CP Valuev

    Last present column partition value.

    Free Space

  • 1 if AC bit is set or ACTBD is set and it is a nullable single‑column partition.
  • 2 otherwise.

    Free AC Bits

  • Only included if the autocompression bit is set in the first presence byte.
  • Otherwise set to 0.

    AC Bitsj

    Last set of autocompression bits.

    k bits, where k is the number of bits needed for compressing a column partition value per the autocompression types and their arguments.

  • Last set of autocompression bits.
  • Only included if the autocompression bit is set in the first presence byte.

  • Otherwise set to 0.

    AC Bitsk

    kth set of autocompression bits.

    k bits, where k is the number of bits needed for compressing a column partition value per the autocompression types and their arguments.

  • More sets of autocompression bits.
  • Only included if the autocompression bit is set in the first presence byte.

  • Otherwise set to 0.

    AC Bits1

    First set of autocompression bits.

    k bits, where k is the number of bits needed for compressing a column partition value per the autocompression types and their arguments.

  • First set of autocompression bits.
  • Only included if the autocompression bit is set in the first presence byte.

  • Otherwise set to 0.
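
The formula given for the position of the Offset to CP Values field can be written out directly. In this sketch the parameter names are hypothetical: `offset_of_last_ac_bits_field` stands for the offset to OffsetToLastACBits, `act_bytes` for the size of the autocompression types and their arguments, and `lvlc_count` for o.

```python
def offset_of_cp_values_field(offset_of_last_ac_bits_field: int,
                              act_bytes: int,
                              lvlc_count: int) -> int:
    """Offset of the 'Offset to CP Values' field within a container row.

    Implements: 2*o + size of the autocompression types and their
    arguments + offset to OffsetToLastACBits + 2.
    """
    return 2 * lvlc_count + act_bytes + offset_of_last_ac_bits_field + 2


if __name__ == "__main__":
    # Hypothetical numbers: field at offset 20, 3 bytes of ACT data, 4 LVLC offsets.
    print(offset_of_cp_values_field(20, 3, 4))
```

Read this way, the formula skips the 2-byte Offset to Last AC Bits field, then the autocompression types and arguments, then one 2-byte offset per dictionary entry.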

    Row Structure for Subrows (ROW Format)

    A column partition that has ROW format is represented as a series of subrows, where each subrow contains a single column partition value of the column partition. Subrows have the same format as regular rows with the following exceptions:

  • A regular row contains all the column values of a table row, while a subrow contains only a subset of the column values of a table row (that is, a column value for each of the columns in the column partition).
  • The internal partition number of the row ID of a regular row does not indicate a column partition because regular rows are not column‑partitioned.
  • The internal partition number of the row ID of a subrow indicates its column partition number.

    Teradata Database applies any user‑specified compression within the subrow for the column partition value.

    Alignment of Containers and Subrows

    Containers and subrows are always packed, even on aligned format systems. The only differences between packed and aligned systems are:

  • The length of a container or subrow is a multiple of 8 on a 64-bit aligned system.
  • The length of a container or subrow is a multiple of 2 on a packed system.

    For a container, any extra bytes that are required to make the length of the row even are added by the file system to the free space, not to the end of the row.

    On a packed system, the length of a subrow can be odd. If so, the file system adds a byte to the end of the subrow.
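
The length rules above amount to rounding the row length up to a multiple of 8 or 2. The following is a minimal sketch of that rounding; the function name is hypothetical, and the sketch returns only the padded length without modeling where the padding bytes are placed.

```python
def padded_length(length: int, aligned_64bit: bool) -> int:
    """Round a container or subrow length up to the required multiple."""
    multiple = 8 if aligned_64bit else 2
    return -(-length // multiple) * multiple  # ceiling division, then scale


if __name__ == "__main__":
    print(padded_length(13, aligned_64bit=False))
    print(padded_length(13, aligned_64bit=True))
```

For a container the added bytes would go into the free space, and for a subrow they would go at the end of the row, as described above.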