Hash and Join Index Row Structures | Teradata Vantage - Hash and Join Index Row Structures - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
kko1591750222108.ditamap
dita:ditavalPath
kko1591750222108.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantage™

The row structure for non-row compressed hash and join indexes is the same as that for a standard data table (with the exception that you cannot compress individual column values for hash indexes (which means that the presence bit array of a hash index indicates only the nullability of a column, not its compressibility), while the row structure for row compressed hash and join indexes is more like that of a secondary index (see Sizing a Unique Secondary Index Subtable and Sizing a Nonunique Secondary Index Subtable).

Other than that exception, the only difference is the specific columns included in the base table data columns of the row. For all hash indexes, Vantage automatically adds the primary index columns of the base table (if not explicitly specified in the hash index definition) and its row uniqueness value. The system does not automatically add this information to join indexes.

Hash indexes are typically row compressed by default (see Compression of Hash Index Rows for details), while you must explicitly specify row compression in a CREATE JOIN INDEX request to compress rows in the join index it creates.

Note that you cannot row compress either of the following hash and join index types:
  • A hash index defined with a PPI.

    You cannot define any hash index with a partitioned primary index.

  • A join index defined with a PPI or with column partitioning.

Hash and Join Index Row Structure for Packed64 Format Systems

Hash and join index rows on packed64 format systems do not have to align on 8-byte boundaries. Because of this, their row structure is simpler than that of equivalent base table rows on aligned row format systems (see Row Structure for Aligned Row Format Systems).

Packed64 Row Structure for an Uncompressed Hash or Join Index With an Nonpartitioned Primary Index

The following graphic illustrates the basic structure of a non-row compressed hash or join index row from an index defined with an nonpartitioned primary index.


Basic structure of non-row compressed index row

Packed64 Row Structure for an Uncompressed Load-Isolated Join Index With a Partitioned Primary Index

The following graphic illustrates the basic structure of an non-row compressed join index row (hash indexes cannot have a partitioned primary index) from an index defined with a partitioned primary index.


Structure of non-row compressed join index row

The difference between this and the format of a non-partitioned primary index row is the presence of a 2-byte partition number field at the beginning of the RowID. It is this field that generates the need for a BYTE(10) data type specification for a RowID. For nonpartitioned primary index indexes, the partition number is assumed by default to be 0, and is not stored.

For more information on join indexes and load isolation, see Rules for Using Join Indexes on Load-Isolated Tables.

Packed64 Row Structure for a Row Compressed Hash or Join Index With an Nonpartitioned Primary Index

The region labeled as Index Value in the graphic of the row format is the column_1 value set for the row. It is an abbreviation for the various data type orderings and offsets shown in detail in the row format diagrams for non-row compressed hash and join index rows.

For example, consider the following example CREATE JOIN INDEX text:

CREATE JOIN INDEX test AS
  SELECT (a,b), (x,y)
  FROM table1, table2;

The system packs columns a and b in the last subfield of Field1 of a row compressed join index row the same way that index keys (an index key is the set of values stored as “the index” in a secondary index) are packed in the Field1 of a secondary index row (columns a and b are stored in the area labeled as Index Value in the row format graphic).

The system also stores the uncompressed index values (each instance of those (x,y) values that has the same (a,b) values as a minirow in Field2) in the area labeled as Index Minirow List in the row format graphic.

Each minirow consists of a BYTE length field, an optional presence bits array field, and a column_2_name value set as described above in Packed64 Row Structure for an Uncompressed Hash or Join Index With an Nonpartioned Primary Index, or Packed64 Row Structure for an Uncompressed Load-Isolated Join Index With a Partitioned Primary Index.


Uncompressed index values

Packed64 Row Structure for a Hash or Compressed Join Index With a Partitioned Primary Index

Vantage does not support PPIs for hash indexes, compressed join indexes, or column-partitioned join indexes.

Hash and Join Index Row Structure for Aligned Row Format Systems

The hash and join index row structure diagrams for aligned row format systems differ from those of packed64 format systems by having as many as five additional pad byte fields (depending on the data types of the columns defined for Index Value) to ensure row alignment on an 8-byte boundary. The following table lists each of these pad fields and explains their purpose.

                     Pad Byte Field Name                                  Purpose
VARCHAR Offsets Array Alignment Pad Bytes Aligns VARCHAR offsets array at a 2-byte boundary.
Fixed Length Column Alignment Pad Bytes Aligns fixed length columns.
Nullable Length Column Alignment Pad Bytes Aligns nullable length columns.
VARCHAR Column Alignment Pad Bytes Aligns variable length columns.
Trailing Pad Bytes Aligns entire row on an 8-byte boundary.

The index columns in a row compressed hash or join index row are stored in packed64 format, adjusted for aligned row format systems by a field of alignment bytes trailing field1 and another field of alignment bytes trailing field2 (if necessary to make the entire row align on a modulo(8) boundary). When you row compress a hash or join index, the column_2_name value set for each row in the index is stored as a minirow. Each minirow consists of a byte length field, an optional presence bits array field, and the column_2_name value set for the minirow.

Vantage does not support PPIs for hash indexes, row-compressed join indexes, or column-partitioned join indexes.

Aligned Row Format Structure for an Uncompressed Hash or Join Index With an Nonpartitioned Primary Index

The following graphic illustrates the basic structure of a database row from a non-row compressed hash or join index with an nonpartitioned, or standard, primary index:


""

Aligned Row Format Structure for an Uncompressed Load-Isolated Join Index With a Row-Partitioned Primary Index and 65,535 or Fewer Combined Partitions

Vantage does not support PPIs for hash indexes.

The following graphic illustrates the basic structure of a database row from a non-row compressed load-isolated join index (hash indexes cannot have partitioned primary indexes) with a row-partitioned primary index (and no column partitioning):


""

The difference between this and the format of a non-row compressed nonpartitioned primary index row is the presence of a 2-byte partition number field at the beginning of the RowID. It is this field that generates the need for a BYTE(10) data type specification for a RowID. For nonpartitioned primary index tables, the partition number is assumed to be 0, so the rowID of an nonpartitioned primary index table is also logically BYTE(10) (see ROWID Columns).

For more information on join indexes and load isolation, see Rules for Using Join Indexes on Load-Isolated Tables.

Aligned Row Format Structure for an Uncompressed Join Index With a Partitioned Primary Index and More Than 65,535 Combined Partitions

Vantage does not support PPIs for hash indexes.

The following graphic illustrates the basic structure of a database row from a non-row compressed join index (hash indexes cannot have partitioned primary indexes) with a row-partitioned primary index (without column partitioning):


""

The difference between this and the format of a non-row compressed non-partitioned primary index row is the presence of an 8-byte partition number field at the beginning of the RowID. It is this field that generates the need for a BYTE(16) data type specification for a RowID.

The figure shows a row in a join index table without load isolation. If the join index is load isolated, an 8-byte RowLoadID is added after the VARCHAR offsets.

Aligned Row Format Join Index Row Layout for a Compressed Hash or Join Index With an Nonpartitioned Primary Index

Consider the following example CREATE JOIN INDEX text:

CREATE JOIN INDEX test AS
  SELECT (a,b), (x,y)
  FROM table1, table2;

Vantage packs columns a and b, labeled as the column_1_name list in the CREATE JOIN INDEX syntax diagram, in the last subfield of Field1 of a row compressed join index row the same way that index keys (an index key is the set of values stored as “the index” in a secondary index) are packed in the Field1 of a secondary index row (columns a and b are stored in the area labeled as Index Value in the row format graphic).

The system also stores the non-row compressed index values (each instance of those (x,y) values that has the same (a,b) values as a minirow in Field2) in the area labeled as Index Minirow List in the row format graphic. These columns are labeled as the column_2_name list in the CREATE JOIN INDEX syntax diagram.

The region labeled as Index Value in the following row format graphic is the column_1_name value set for the row and should be interpreted as containing the requisite pad bytes depicted in Aligned Row Format Structure for an Uncompressed Hash or Join Index With an Nonpartitioned Primary Index). It is an abbreviation for the various data type orderings and offsets shown in detail in the row format diagrams for non-row compressed hash and join index rows. The number of trailing pad bytes required by a given row is the number of bytes it takes to make the row end on a modulo(8) boundary.


""

When you row compress a hash or join index, the column_2_name value set for each row in the index is stored as a minirow. Each minirow consists of a BYTE length field, an optional presence bits array field, and a column_2_name value set (see Aligned Row Format Structure for an Uncompressed Hash or Join Index With an Nonpartitioned Primary Index or Packed64 Row Structure for a Row Compressed Hash or Join Index With an Nonpartitioned Primary Index).

Aligned Row Format Join Index Row Layout for Compressed Hash and Join Indexes With a Partitioned Primary Index

Vantage does not support PPIs for hash indexes, row-compressed join indexes, or column-partitioned join indexes.