16.10 - Base Table Row Format - Teradata Database

Teradata Database Design

Product
Teradata Database
Release Number
16.10
Release Date
June 2017
Content Type
User Guide
Publication ID
B035-1094-161K
Language
English (United States)
The row format diagrams and the examples in this chapter portray a system with 64 KB rows. A system with larger, 1 MB rows is not shown.

The structure of Teradata Database base table rows is slightly different for rows in the following rectangular conditions.

  • A table having an nonpartitioned primary index versus a table having a partitioned primary index.
  • A system using byte-packed format referred to as packed64 format versus a system using 64-bit byte-aligned referred to as aligned row format.
  • A table with load isolation versus a table without load isolation. Load isolation allows committed reads during table loads on tables that are created or altered to be load isolated. For more information, see Database Administration.

Row format information is used to make fine-grained row size estimates for capacity planning.

General Row Structure

The following topics describe the structure of a Teradata Database row in systems using packed64 format and systems using aligned row format. Whether a system stores its data in packed64 or aligned row format depends on the settings on several factors your Teradata support personnel can modify. The size of tables on a system that stores data in packed64 format is generally between 3% and 9% smaller than the size of the same tables on a system that stores data in aligned row format (the average difference is roughly 7% smaller for the packe64d row format). Storing data in packed64 format reduces the number of I/O operations required to access and write rows in addition to saving disk space.

Either 12 or 16 bytes of the row header are devoted to overhead, depending on whether the table has a non-partitioned or a partitioned primary index and whether the row format of your system is 1 MB or 64 KB rows. Systems with 64 KB rows have 12 bytes devoted to overhead, and systems with 1 MB rows have 16 bytes devoted to overhead. The maximum row length is 1 MB. This limit is the same for both packed64 and aligned row formats. This maximum length does not include BLOB, CLOB, or XML columns, which are stored in special subtables outside the base table row. See Sizing a LOB or XML Subtable for information about BLOB, CLOB, and XML subtables.

The number of characters that can be represented by this number of bytes varies depending on whether characters are represented by one byte, two bytes, or a combination of single-byte and multibyte representations.

There are three general categories of table columns, which are stored in the row in the order listed:

  1. Fixed length.

    Storage is always allocated in a row for these columns.

  2. Compressible.

    Includes both value-compressed and algorithmically compressed data and can be stored in any order.

    FOR this type of column … Storage is allocated …
    Multivalue-compressible in a row when needed.

    No disk storage space is required when the column contains a compressed value as specified in the CREATE TABLE DDL text.

    If a column value is not compressed, then Teradata Database allocates storage space for it.

    algorithmically-compressible in a row when needed.

    If a column is not multivalue compressed or NULL, Teradata Database allocates storage space for it.

    Algorithmically-compressed data requires less space.

  3. Variable length.

    Storage space is allocated in the row depending on the size of the column.

The structure of Teradata Database base table rows varies slightly for tables having an nonpartitioned primary index versus a partitioned primary index (see Row Structure for Packed64 Systems and Row Structure for Aligned Row Format Systems for details).

For a system using the packed64 format, columns are stored in field ID order, with the fixed length fields first followed by the compressed fields, and last by the variable length fields.

In aligned row format, fixed length columns are stored first, followed by compressed fields, and then by variable length fields. This ordering is the same as the ordering for packed64 format storage with the exception that the columns within each category are stored in decreasing alignment constraint order.

For example, if a row contains fixed length columns with the types CHARACTER, INTEGER and FLOAT, the floating point numbers are stored first, followed by the INTEGER numbers, and then by the CHARACTER columns regardless of the order in which the columns were defined in the CREATE TABLE request defining the table.

Teradata Database uses a space optimization routine for aligned format rows that can store a maximum of 7 bytes of data in the potentially unused space that can occur when a column is aligned on a 0(mod 8) boundary. After the routine fills the available space with candidate date, Teradata Database follows the rules outlined in the preceding paragraphs to complete the remainder of the row data in an aligned format row.

Sector Alignment

The Teradata File System supports devices that use either a native 512 byte sector size or a native 4KB sector size.

The new 4KB disk drives can be separated into 2 classes: those that support I/O on non-4KB aligned disk boundaries and those that do not support non-4KB aligned I/O. When operating on system with 4KB drives, Teradata Database only performs I/Os on blocks that are full 4KB aligned (in size and length), regardless of the class of the drive.

Teradata Database only supports a homogeneous alignment configuration throughout the entire system. On a given system, data on all storage devices is aligned or unaligned, and cannot be both. This also applies across cliques.

General Row Structure When Compressing Variable Length Columns

The following information describes aspects of general row structure when one or more variable length columns in a table are compressed. The description applies to columns having any of the following data types:

  • VARBYTE
  • VARCHAR
  • VARCHAR(n) CHARACTER SET GRAPHIC
  • UDT
  • CLOB/BLOB

The description assumes that all variable length compressible columns are treated as an extension of fixed length compressible columns except that decompressed variable length columns store both a length and the actual column data. There is an additional presence bit for an algorithmically compressed column to indicate whether the column data is compressed or not. Teradata Database sets this bit only when data in a column is compressed using algorithmic compression.

When column data is compressed algorithmically, Teradata Database stores it as length: data pairs interleaved with the other compressible columns in the table. When column data is null, Teradata Database does not store a length, so there is no overhead in that case.

The following summary information applies to general row structure elements for the storage of variable length column data for multivalue compression.

  • When compression does not apply to a value in a column that specifies multivalue compression, Teradata Database stores the column data in the row as a length and data value pair.

    If the length of the column data is 255 bytes, then Teradata Database stores the length in 1 byte; otherwise it stores the length in 2 bytes.

    A variable length column (without compression) always has a 2-byte offset associated with it, whereas a compressed variable length column can have its length stored in one or two bytes, depending on whether the column length is 255 or not, respectively.

  • Teradata Database stores uncompressed variable length data in the row as a length and data value pair.

    The stored length for multivalue compressed data is the length of the uncompressed column value.

  • If a variable or fixed length multivalue compressed column is compressed or is null, then its values are not stored in the row.

    Instead, Teradata Database stores one instance of each compressed value within a column in the table header and references it using the presence bits array for the row.

  • If a fixed length multivalue compressed column is not compressed, then Teradata Database stores its values in the row as data, but without an accompanying length. No length is needed because Teradata Database pads fixed length data values to the maximum length defined for their containing column.

The following summary information applies to general row structure elements for the storage of variable length column data for algorithmic compression.

  • If a variable length column is not compressed, then Teradata Database stores its values in the row as a length and data value pair.
  • If a fixed length column is not compressed, then Teradata Database stores its values in the row as data, but without an accompanying length.

    Teradata Database pads fixed length data values to the maximum length defined for their containing column.

  • For an algorithmically-compressed column with a variable length, Teradata Database stores its values in the row as a length and data value pair.

    The stored length for algorithmically-compressed data is the length of the compressed column value, not its original, uncompressed length.

    If the length of the column data is 255 bytes, then Teradata Database stores the length in 1 byte; otherwise, it stores the length in 2 bytes.

  • For an algorithmically-compressed column with either a variable length or a fixed length, Teradata Database does not store a representation of its data in the row if the column is null, but does indicate its existence in the data using the presence bit array for the row.

The following sets of examples demonstrate the specific effects on row structure for multivalue compression alone (cases 1, 2, and 3 of example 1), algorithmic compression alone (cases 1, 2, and 3 of example 2), and combined multivalue and algorithmic compression (cases 1, 2, and 3 of example 3). In each case, the row structure diagram is based on the structure for a partitioned table (seePacked64 Row Structure for a Partitioned Table ) even though the examples are all for nonpartitioned primary index tables.

Example 1: Algorithmic Compression But No Multivalue Compression

Suppose you define the following table to use Huffman encoding algorithms to compress and decompress two of its columns algorithmically. This table specifies algorithmic compression on columns alc1 and alc2, but no multivalue compression. Particularly relevant data for each case is highlighted in boldface type.

     CREATE TABLE t1 (
       fc1  INTEGER,
       alc1 VARCHAR(10) COMPRESS ALGCOMPRESS huffcomp
                                 ALGDECOMPRESS huffdecomp,
       vc1  VARCHAR(20),
       alc2 VARCHAR(10) COMPRESS ALGCOMPRESS huffcomp
                                 ALGDECOMPRESS huffdecomp,
       vc2  VARCHAR(20) );

Field 5 of the table header for this table has the following field descriptors.

                         UDT or
              Num-  Next Com-                   Pres-
In- Fld  Off- Comp- Fld  press  Calc   Sto-     ence     Sort
dex id   set  ress  Ind  offset offset rage     Byte Bit Desc   EvlRepr
--- ---- ---- ----- ---  ------ ------ -------- ---- --- ------ ------------------
 1  1025  20         2     0    Offset Nullable   0   1  AscKey DBC INTEGER
 2  1026  -          4    196   Comprs alc+Null1  0   2  NonKey DBC VARCHAR(10) LATIN
 3  1027  12         5     0    Var    Nullable   0   4  NonKey DBC VARCHAR(20) LATIN
 4  1028  -          3    206   Comprs alc+Null1  0   5  NonKey DBC VARCHAR(10) LATIN
 5  1029  14         0     0    Var    Nullable   0   7  NonKey DBC VARCHAR(20) LATIN

To be as general as possible, each row structure diagram indicates the configuration of fields as they would be if the table had a partitioned primary index, though none of the actual table creation SQL text specifies a PPI. The diagrams also assume the system has a Packed64 row structure.

Case 1

Suppose you have the same table, but without any algorithmic compression being defined. The row structure for this table looks something like the following diagram.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed length data type.
len1 Length of the data in column alc1, an algorithmically-compressed variable length data type.
alc1 Algorithmically-compressed data for column alc1, a variable length data type.
len2 Length of the data in column alc2, an algorithmically- compressed variable length data type.
alc2 Algorithmically-compressed data for column alc2, a variable length data type.
vc1 Uncompressed data for column vc1, a variable-length data type.
vc2 Uncompressed data for column vc2, a variable-length data type.

In this case, the data values for columns alc1 and alc2 are both stored in the row and are not compressed.

Teradata Database has placed the variable length compressible columns after the fixed length columns in the compressible columns area of the row. This is also reflected in the Field5 descriptor. Because the columns are not compressed, Teradata Database stores lengths with the two alc columns. len1 and len2 contain the uncompressed lengths of those columns, and the ALC bit is not set.

Case 2

In the next case, algorithmic compression has been defined, and the data for columns alc1 and alc2 is not null. alc1 and alc2 are present in the row and compressed. len1 and len2 contain the new compressed length. The ALC bit has been set for this case to indicate that the data in columns alc1 and alc2 is compressed.

For this case, the row structure for the table superficially looks exactly like the row structure diagram for the previous case.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed length data type.
len1 Length of the data in column alc1, an algorithmically-compressed variable length data type.
alc1 Algorithmically-compressed data for column alc1, a variable length data type.
len2 Length of the data in column alc2, an algorithmically-compressed variable length data type.
alc2 Algorithmically-compressed data for column alc2, a variable length data type.
vc1 Uncompressed data for column vc1, a variable-length data type.
vc2 Uncompressed data for column vc2, a variable-length data type.

Case 3

In the next case, algorithmic compression has been defined, but the data for columns alc1 and alc2 is null. alc1 and alc2 are present in the row and compressed. len1 and len2 contain the new compressed length. The ALC bit is not set for this case because column alc1 is null.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed-length data type.
len1 Not represented.

This column is null, so no length for column alc2 is stored in the row.

alc1 Not represented.

This column is null, so no value for alc1 is stored in the row.

len2 Length of the data in column alc2, an algorithmically-compressed variable length data type.
alc2 Algorithmically-compressed data for column alc2, a variable length data type.
vc1 Uncompressed data for column vc1, a variable-length data type.
vc2 Uncompressed data for column vc2, a variable-length data type.

Example 2: Multivalue Compression But No Algorithmic Compression

This example presents row structure cases based on the following table definition. This table has multivalue compression defined on columns mvc1 and mvc2, but no algorithmic compression. Particularly relevant data for each case is highlighted in boldface type.

     CREATE TABLE t1 (
       fc1  INTEGER,
       mvc1 VARCHAR(10) COMPRESS (‘mars’,’saturn’,’jupiter’),
       vc1  VARCHAR(20),
       mvc2 VARCHAR(10) COMPRESS (‘Germany’,’France’,’England’),
       vc2  VARCHAR(20) );

Field 5 of the table header for this table has the following field descriptors.

                         UDT or
              Num-  Next Com-                   Pres-
In- Fld  Off- Comp- Fld  press  Calc   Sto-     ence     Sort
dex id   set  ress  Ind  offset offset rage     Byte Bit Desc   EvlRepr
--- ---- ---- ----- ---  ------ ------ -------- ---- --- ------ ------------------
 1  1025  20         2     0    Offset Nullable   0   1  AscKey DBC INTEGER
 2  1026  -          4    196   Comprs cmp+Null2  0   2  NonKey DBC VARCHAR(10) LATIN
 3  1027  14         5     0    Var    Nullable   0   5  NonKey DBC VARCHAR(20) LATIN
 4  1028  -          3    206   Comprs cmp+Null1  0   6  NonKey DBC VARCHAR(10) LATIN
 5  1029  16         0     0    Var    Nullable   0   1  NonKey DBC VARCHAR(20) LATIN

Case 1

In this case, the data values for columns mvc1 and mvc2 are both stored in the row because they are not compressed for the particular data values they contain.

Teradata Database has placed the variable length compressible columns after the fixed length columns in the compressible columns area of the row. This is also reflected in the Field5 descriptor. len1 and len2 contain the uncompressed lengths of those columns.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed-length data type.
len1 Compressed length of the data in column mvc1, a multivalue compressed variable length column.
mvc1 Multivalue compressed data for column mvc1.
len2 Compressed length of the data in column mvc2, a multivalue-compressed variable length column.
mvc2 Multivalue compressed data for column mvc2.
vc1 Uncompressed data for column vc1, a variable-length data type.
vc2 Uncompressed data for column vc2, a variable-length data type.

Case 2

The next case demonstrates row storage when the column data in the row is to be multivalue compressed. In this case, the value for column mvc1 is not stored in the row because it is compressed. The value is instead stored in the table header and referenced by the presence bits array for the row.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation                                                                Definition
fc1 Uncompressed data for column fc1, a fixed-length data type.
len1 Not represented.

This column is null, so no length for column mvc1 is stored in the row.

mvc1 Not represented.

This column is null, so no value for mvc1 is stored in the row.

len2 Compressed length of the data in column mvc2, a multivalue-compressed variable-length column.
mvc2 Multivalue compressed data for column mvc2.
vc1 Uncompressed data for column vc1, a variable-length data type.
vc2 Uncompressed data for column vc2, an uncompressed variable-length data type.

Case 3

The next case demonstrates row storage when the column data for mvc1 is null. In this case, the value for column mvc1 is not stored in the row because nulls are always compressed. This demonstrates that nulls never incur a cost overhead for compression.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed-length data type.
len1 Not represented.

This column is null, so no value for mvc1 is stored in the row.

mvc1 Not represented.

This column is null, so no value for mvc1 is stored in the row.

len2 Compressed length of the data in column mvc2, a multivalue compressed variable-length column.
mvc2 Multivalue compressed data for column mvc2.
vc1 Uncompressed data for column vc1, a variable-length data type.
vc2 Uncompressed data for column vc2, a variable-length data type.

Example 3: Mix of Multivalue and Algorithmic Compression

This example presents row structure cases based on the following table definition. This table has multivalue compression defined on columns mvc_f1 and mvc_v1, and algorithmic compression defined on column alc1. Particularly relevant data for each case is highlighted in boldface type.

     CREATE TABLE t1(
       fc1    INTEGER,
       alc1   VARCHAR(10) COMPRESS ALGCOMPRESS huffcomp
                                   ALGDECOMPRESS huffdecomp,
       vc1    VARCHAR(20),
       mvc_f1 CHARACTER(10) COMPRESS (‘Germany’,’France’,’England’),
       mvc_v1 VARCHAR(20) COMPRESS (‘Nike’,’Reebok’,’Adidas’) );

Field 5 of the table header for this table has the following field descriptors.

                         UDT or
              Num-  Next Com-                   Pres-
In- Fld  Off- Comp- Fld  press  Calc   Sto-     ence     Sort
dex id   set  ress  Ind  offset offset rage     Byte Bit Desc   EvlRepr
--- ---- ---- ----- ---  ------ ------ -------- ---- --- ------ ------------------
 1  1025  20         2     0    Offset Nullable   0   1  AscKey DBC INTEGER
 2  1026  -          4    196   Comprs alc+Null1  0   2  NonKey DBC VARCHAR(10) LATIN
 3  1027  14         0     0    Var    Nullable   0   5  NonKey DBC VARCHAR(20) LATIN
 4  1028  -          5    226   Comprs cmp+Null2  0   6  NonKey DBC VARCHAR(10) LATIN
 5  1029  -          3    256   Comprs cmp+Null2  1   1  NonKey DBC VARCHAR(20) LATIN

Case 1

In this case, no compression has been specified for the particular data values, so data for columns alc1, mvc_f1, and mvc_v1 is stored in the row.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed length data type.
len1 Compressed length of the data in column alc1, an algorithmically-compressed variable length data type.
alc1 Algorithmically-compressed data for the variable length column alc1.
mvc_f1 Multivalue compressed data for the fixed length column mvc_f1.
len2 Length of the data in column mvc_f1, a multivalue compressed variable length data type.
mvc_v1 Multivalue compressed data for the variable length column mvc_v1.
vc1 Uncompressed data for column vc1, a variable length data type.

Teradata Database has placed the variable compressible columns after the fixed field columns in the compressible columns area. This is also reflected in the Field5 descriptor. The compressed column data is interleaved with other compressible fields. len1 and len2 contain the uncompressed lengths of alc1 and mvc_v1, respectively.

Case 2

In this case, Teradata Database compresses the data for columns mvc_v1 and alc1 because the values for the row are specified in the multivalue for those columns.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation                                                              Definition
fc1 Uncompressed data for column fc1, a fixed length data type.
len1 Compressed length of the data in column alc1, an algorithmically-compressed variable length data type.
alc1 Algorithmically-compressed data for column alc1, a variable length data type.
mvc_f1 Multivalue compressed data for column mvc_f1, a fixed length data type.
mvc_v1 Not represented.

This column is compressed, so no value for mvc1 is stored in the row.

vc1 Uncompressed data for column vc1, a variable length data type.

In this case, Teradata Database does not store the value for mvc_v1 in the row because it is compressed for the particular data value, so the value is stored in the table header. len1 contains the compressed length for alc1.

Case 3

In this case, the data for the compressible columns mvc_v1 and alc1 is null, so Teradata Database does not store values for those columns in the row. This example demonstrates that nulls never incur a cost overhead for compression.



The following abbreviations are used for the various fields of the diagram for this case.

Abbreviation Definition
fc1 Uncompressed data for column fc1, a fixed length data type.
len1 Not represented.

This column is null, so no value for alc1 is stored in the row.

alc1 Not represented.

This column is null, so no value for alc1 is stored in the row.

mvc_f1 Multivalue compressed data for column mvc_f1, a fixed length data type.
len2 Not represented.

This column is null, so no value for mvc_v1 is stored in the row.

mvc_v1 Not represented.

This column is null, so no value for mvc_v1 is stored in the row.

vc1 Uncompressed data for column vc1, a variable length data type.

Row Structure for Packed64 Systems

Base table rows are stored in packed format on packed64 format systems, so they need not align on 8-byte boundaries. Because of this, their row structure is simpler than that of equivalent base table rows on aligned row format systems (see Row Structure for Aligned Row Format Systems) below.

Note that the Row Hash value is 4 bytes wide irrespective of the number of hash buckets the system has.

Packed64 Row Structure for a Nonpartitioned Primary Index Table with Load Isolation

The following graphic illustrates the basic structure of a Teradata Database row from a table on a Packed64 format system with an nonpartitioned, or traditional, primary index. The table is load isolated, so the RowLoadID field is present.



Packed64 Row Structure for a Partitioned Table

The following graphic illustrates the basic structure of a Teradata Database row from a table on a packed64 format system with a partitioned primary index:



The difference between this and the format of a nonpartitioned primary index row is the presence of a an additional 2-byte or 8-byte partition number field, which is also a component of the RowID (partitioned table rows are an additional 4 bytes wider if they also specify multivalue compression). It is this field that generates the need for a BYTE(10) data type specification for a RowID. For nonpartitioned primary index tables, the partition number is assumed to be 0, so the rowID of an nonpartitioned primary index table is also logically BYTE(10) (see ROWID Columns).

Row Structure for Aligned Row Format Systems

The row structure diagrams for aligned row format systems differs from those of packed64 systems by having five additional pad byte fields to ensure row alignment on an 8-byte boundary. The following table lists each of these pad fields and explains their purpose:

Pad Byte Field Name Purpose
VARCHAR Offsets Array Alignment Pad Bytes Aligns VARCHAR offsets array at a 2-byte boundary.
Fixed Length Column Alignment Pad Bytes Aligns fixed length columns.
Compressible Length Column Alignment Pad Bytes Aligns value compressible length columns.
VARCHAR Column Alignment Pad Bytes Aligns variable length columns.
Trailing Pad Bytes Aligns entire row on an 8-byte boundary.

Note that the Row Hash value is 4 bytes wide irrespective of the number of hash buckets the system has.

Aligned Row Structure for a Nonpartitioned Table with Load Isolation

The following graphic illustrates the basic structure of a Teradata Database row from a table on an aligned row format system with a nonpartitioned, or traditional, primary index. The table is load isolated, so it has the RowLoadID field.



Aligned Row Structure for a Partitioned Table With 65,535 or Fewer Combined Partitions

The following graphic illustrates the basic structure of a Teradata Database row from a table on an aligned row format system with a partitioned primary index that has 65,535 or fewer combined partitions:



The difference between this and the format of an nonpartitioned table row is the presence of a 2-byte or 8-byte partition number field, which is also a component of the RowID (partitioned table rows are an additional 4 bytes wider if they also specify multivalue compression. It is this field that generates the need for a BYTE(10) data type specification for a RowID. For nonpartitioned primary index tables, the partition number is assumed to be 0, so the rowID of an nonpartitioned primary index table is also logically BYTE(10) (see ROWID Columns).

Containers and Subrows

The data in a column-partitioned table can be stored in containers or subrows. The following table defines these terms.

Containers and Space

If Teradata Database can pack many column partition values into a container, this form of compression, called row header compression, can reduce the space needed for a column-partitioned table or join index compared to the same object without column partitioning.

If Teradata Database can place only a few column partition values in a container because of their width, there can actually be a small increase in the space needed for a column-partitioned table or join index compared to the same object without column partitioning. In this case, ROW format may be more appropriate.

If there are only a few column partition values (because it is possible with row partitioning that only a few column partition values occur for each combined partition) and there are many column partitions, there can be a very large increase in the space needed for a column-partitioned object compared to the same object without column partitioning. In the worst case, the space required for a table can increase by nearly 24 times.

In this case, consider making one of the following changes:

  • Alter the column or row partitioning to allow for more column partition values per combined partition.
  • Remove the column partitioning from the table or join index.

Container Contents

A container must have the same internal partition number and hash bucket for all the column partition values in that container row.

Following are the contents of a container:

  • The row header for a container indicates its internal partition number, hash bucket, and uniqueness value for the first column partition value. The row header is the same as for any other physical row, including physical row length, a rowID, flag byte, and first presence byte. The row header for a container is either 14 or 20 bytes long.

    There is only one row header for a container, using the rowID of the first column partition value as the rowID of the container instead of there being a row header for each column partition value as is the case for all other row types in Teradata Database. You can determine the rowID of a column partition value by its position within the container.

    Note that the first presence byte for a container is not used as a presence byte.

  • The first presence byte for a container is not used as a presence byte. It indicates whether autocompression types have been determined for the container.

    If the selected autocompression types, which might include applying user-specified compression, do not reduce the size of the container row, the AC bit in the first presence byte in the row header is set to 0 and the container row is not autocompressed.

  • The number of column partition values, including values for logically deleted rows, is represented by the container and various offsets to its sections.
  • An optional series of column partition values for the local value list compression dictionary.

    This is preceded by arrays of offsets if the values are variable length.

  • A series of fixed-length column partition values or a series of variable-length column partition values, each prefixed by a 1-byte or 2-byte length.

    The series can be empty if the autocompression bits do not indicate that any column partition values are present. If this occurs, the column partition values have all been compressed.

  • 0 or more bytes of free space.
  • Optional autocompression presence bits, value length compression bits, algorithmic compression bits, and run length bits, depending on the autocompression techniques used in the container row in reverse order of the series of values.

    A column container should have thousands of column values for fixed length and short variable-length data types unless the table is overly row-partitioned. The values and presence bits grow closer to one another as they consume the available free space. If there is insufficient free space, Teradata Database expands the row.

For a single-column partition that has COLUMN format (that is, its physical rows are containers), a column partition value is the same as a column value and can have either a fixed or a variable length.

For a multicolumn partition with COLUMN format, a column partition value has a structure similar to a regular row, containing presence and compression bits as 0 or more bytes, offsets to variable-length column values, the values of its fixed-length columns, the values of its uncompressed columns, and the values of its variable-length columns.

A container does not include:

  • A row length because the length of a column partition value is handled separately
  • An internal partition number
  • A hash bucket
  • A uniqueness value
  • Presence or compression bits
  • First presence bit

Teradata Database applies user-specified compression within the multicolumn column partition value and might apply autocompression to the column partition value as a whole.

Column partition values for a single-column or multicolumn column partition can have either fixed or variable length. If the column partition value has variable length, the length is not part of the column partition value. Instead, the length is specified in the container row by a preceding length field or by using the difference between offsets if the column partition value is in the local value-list dictionary.

A container includes other information such as offsets to the beginning of the series of column partition values.

Containers and Autocompression

A container with autocompression includes:

  • 2 bytes are used as an offset to the compression bits.
  • 1 or more bytes indicate the autocompression types and their arguments for the container.
  • 1 or more bytes of autocompression bits, depending on the number of column partition values and the autocompression type.
  • 0 or more bytes are used for a local value-list dictionary.
  • 0 or more bytes are used for present column partition values.

A container without autocompression uses 0 or more bytes for present column partition values. A container can exist without autocompression because:

  • You specified NO AUTO COMPRESS for the column partition when you created the table or join index.
  • No autocompression types are applicable for the column partition values of the container.
Whether a column partitioning level defaults to AUTO COMPRESS or NO AUTO COMPRESS depends on the setting of the AutoCompressDefault cost profile. See SQL Request and Transaction Processing for further information about AutoCompressDefault.

Row Structure for Containers (COLUMN Format)

Teradata Database packs column partition values into a container up to a system-determined limit and then packs the next set of column partition values into a new container. The column partition values within a container must be in the same combined partition to be packed into that container.

A column partition represents one or more table columns of a table. A column partition that has COLUMN format is represented as a series of containers that hold the column partition values of the column partition.

A container consists of a header and column partition values followed by autocompression bits at the end, with free space in between.The free space is allocated in such a way that column partition values and autocompression bits can grow toward each other using the free space without moving around the values and changing the row size.

A newly constructed container in memory starts at the maximum allowed size and, therefore, a large free space. Once it either fills up or there are no more column partition values to add for the current DML request, the container can be reduced in size as described later, and it is then written to disk. If there are still more column partition values to add for this request, Teradata Database starts a new container in memory.

When new column partition values arrive to be appended for a subsequent DML request and there is a last container for the combined partition in which the column partition values are to be appended, and subsequently insufficient free space is available for appending the next new column partition value in this last container, and the container was reduced in size when last written as described later, Teradata Database expands the container in memory to its maximum allowable size if that would enable a column partition value to be added. If the container becomes full, it can be reduced in size as described later, and it is then written to disk. The process begins again with a new container.

Before writing a container that has reached its maximum size, either because it is a new container or because it is a last container read from disk that was expanded to the maximum size because it ran out free space, when there are no more column partition values to be added to it by a DML request, there is sufficient free space to add more column partition values, but its free space exceeds a system-determined percentage of its size without the free space, Teradata Database reduces its free space to be within this percentage.

It is possible that free space could occur in the last container for each combined partition. If there are many combined partitions, the sum of the free space could add up to a great deal of unused disk space. Therefore, it is desirable to keep this total unused space to a small percentage of the table or join index size. However, having a reasonable amount of free space in the last container minimizes the copying of a container to a larger memory area in order to accommodate new incoming column partition values such as, for example, by an array INSERT, a small INSERT … SELECT request, or a large INSERT … SELECT request to a row partitioned column-partitioned table or join index such that a small number of column partition values are inserted into a combined partition at a time, and most often the container can be written with the same physical row size after the insert operation, which is more efficient than if the physical row size changes.

When writing a container at the point where it can no longer hold additional column partition values, necessitating that a new container be started, the free space for the container is reduced to 0 or near zero. The last container does not necessarily need to be written if it were just read and there is insufficient free space to add another column partition value but the remaining free space is small. However, if a container has too much remaining free space, Teradata Database must remove the free space, and the last container row must be written.

This does not apply to a container for the delete column partition and is autocompressed.

Container for the Delete Column Partition

A container for the delete column partition has fixed-length, single-column, BYTEINT NOT NULL column partition values. This data is constrained to have the values 0 or 1. The container row has the layout as a fixed-length container.

Row Structure for Subrows (ROW Format)

A column partition that has ROW format is represented as a series of subrows, where each subrow contains a single column partition value of the column partition. Subrows have the same format as regular rows with the following exceptions:

  • A regular row contains all the column values of a table row, while a subrow contains only a subset of the column values of a table row (that is, a column value for each of the columns in the column partition).
  • The internal partition number of the row ID of a regular row does not indicate a column partition because regular rows are not column-partitioned.

    The internal partition number of the row ID of a subrow indicates its column partition number.

Teradata Database applies any user-specified compression within the subrow for the column partition value.

Alignment of Containers and Subrows

Containers and subrows are always packed, even on aligned format systems. The only differences between packed and aligned systems are:

  • The length of a container or subrow is a multiple of 8 on 64-bit aligned system.
  • The length of a container or subrow is a multiple of 2 on a packed system.

For a container, the file system adds any extra bytes that are required to make the length of the row even to the freespace, not to the end of the row.

On a packed system, the length of a subrow can be odd. If so, the file system adds a byte to the end of the subrow.

Column Partitioning

Column partitioning is a form of partitioning for multiset tables and for single-table, non-aggregate, non-compressed join indexes. Columnar storage stores the data into a series of containers with usually many values of the column partition packed into each container; alternatively, the values of a column partition can be stored into a series of subrows with one value of the column partition per subrow.

Column partitioning enables sets of table or join index columns to be stored in separate partitions. Row partitioning of primary-indexed tables also enables sets of rows to be stored in separate partitions. Teradata Columnar makes it possible for a table or join index to be column-partitioned, or both column-partitioned and row-partitioned by using multilevel partitioning.

Column partitioning enables the Optimizer to devise efficient searches by using column and row partition elimination based on the columns that are needed by a query. If a table or index column is not needed by a request, the column partition with that column need not be read. If multiple columns are needed for a request, the query plan devised by the Optimizer includes putting projected column values from selected table rows together to form result rows. This can be combined with row partition elimination to further reduce the data that must be accessed to satisfy a request.

Teradata Database can apply various compression techniques to column-partitioned data that can reduce the storage requirements for a table or join index, which can then reduce the I/O requirements for DML requests. When column partitioning is combined with row partitioning, the number of compression opportunities available to the system can increase.

Consumption of Disk Space by Populated and Empty Partitions

With the large number of partitions that can be defined for a table or join index, it is very likely that a high percentage of those partitions are empty at any given time. For example, a table on a 200 AMP system that defines 100,000 combined partitions with 100 rows per unaligned (127.5 KB) data block and 100 data blocks per each combined partition per AMP has 200 billion rows. This is a relatively small number of combined partitions when you consider that the maximum for a table or join index is 9,223,372,036,854,775,807 combined partitions.

If each row were 100 bytes in length, the primary data alone consumes 20 petabytes of disk. That is 20 x 1015 bytes. It is highly unlikely that every combined partition would be populated. When you consider a multidimensional use of multilevel partitioning, you can easily deduce that not all combinations of dimension values actually occur.

In general, a populated combined partition should have either many data blocks per AMP or no data blocks. For the example proposed in the first paragraph, if there is actually only 200 gigabytes of data and each populated combined partition had 100 data blocks per AMP, about 99% of the combined partitions are empty.

Byte Alignment

Rows on packed64 format systems are always aligned on even-byte boundaries. In other words, rows are never stored with an odd number of bytes. As a result, if a row has an odd byte length, the system adds a filler byte to the end of the row to make its length even. Filler bytes are included in the CurrentPerm total for each table column in DBC.DatabaseSpace.

During a SysInit operation, Teradata Database reads several flags to determine whether a system should be configured to format its data in a packed64 format or in an aligned row format. Cons