15.00 - Process for Locating a Row Using Its Unique Primary Index - Teradata Database

Teradata Database Design

prodname: Teradata Database
vrm_release: 15.00
category: User Guide
featnum: B035-1094-015K

Process for Locating a Row Using Its Unique Primary Index

The following process describes how Teradata Database locates the row having the unique primary index value column_value. Teradata Database follows the same process if the primary index is non‑unique. The only difference is that the query might retrieve more than a single row in that case because multiple rows could have the same value for column_value.

1 The Resolver does a data dictionary lookup to determine the table ID for table_1, then adds it to the parse tree for later use by the Generator.

2 The Generator hashes the primary index value column_value, computing its 32-bit row hash value.

3 The Generator builds a 3-part message from the following information.

  • Table ID for table_1
  • Row hash value for column_value

    The row hash value is always 32 bits wide, but the number of bits used to represent the hash bucket number and the remainder varies with the number of hash buckets defined for the system.

    | IF the system has this many hash buckets … | THEN the 32-bit row hash for column_value is divided between the hash bucket number and remainder as follows … |
    |---|---|
    | 65,536 | 16‑bit hash bucket number, 16‑bit remainder |
    | 1,048,576 | 20‑bit hash bucket number, 12‑bit remainder |

    See “Teradata Database Hashing Algorithm” on page 225 for details.

  • Data value for column_value
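The division of the 32-bit row hash described in step 3 can be illustrated with a short Python sketch. It assumes that the hash bucket number occupies the high-order bits and the remainder the low-order bits, which this section does not state explicitly; the function name `split_row_hash` is hypothetical and is not part of any Teradata interface.

```python
from typing import Tuple

# Bucket-number widths taken from the table in step 3.
BUCKET_BITS = {65_536: 16, 1_048_576: 20}


def split_row_hash(row_hash: int, bucket_count: int) -> Tuple[int, int]:
    """Split a 32-bit row hash into (hash bucket number, remainder).

    Assumption: the bucket number is the high-order part of the hash.
    """
    if not 0 <= row_hash < 2**32:
        raise ValueError("row hash must fit in 32 bits")
    bucket_bits = BUCKET_BITS[bucket_count]
    remainder_bits = 32 - bucket_bits
    bucket = row_hash >> remainder_bits                 # high-order bits
    remainder = row_hash & ((1 << remainder_bits) - 1)  # low-order bits
    return bucket, remainder


if __name__ == "__main__":
    example_hash = 0xABCD1234  # arbitrary illustrative value, not a real row hash
    for count in (65_536, 1_048_576):
        bucket, remainder = split_row_hash(example_hash, count)
        print(count, hex(bucket), hex(remainder))
```

The same 32-bit value yields a different bucket number on a 65,536-bucket system than on a 1,048,576-bucket system, because the boundary between the two fields moves by four bits.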
4 The Generator scans the hash map to determine which AMP owns the hash bucket the row belongs to.
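As a rough illustration of the hash map lookup in step 4, the sketch below models the hash map as a list indexed by hash bucket number whose entries are AMP numbers. The round-robin assignment in `build_hash_map`, the function names, and the assumption that the bucket number is the high-order part of the row hash are all illustrative; an actual hash map is defined by the system configuration.

```python
from typing import List


def build_hash_map(bucket_count: int, amp_count: int) -> List[int]:
    """Return a toy hash map: entry i is the AMP that owns hash bucket i.

    Hypothetical round-robin assignment, used only for illustration.
    """
    return [bucket % amp_count for bucket in range(bucket_count)]


def owning_amp(row_hash: int, hash_map: List[int]) -> int:
    """Look up the AMP that owns the hash bucket of a 32-bit row hash."""
    # 65,536 buckets -> 16 bits, 1,048,576 buckets -> 20 bits.
    bucket_bits = (len(hash_map) - 1).bit_length()
    bucket = row_hash >> (32 - bucket_bits)  # assumed high-order bits
    return hash_map[bucket]


if __name__ == "__main__":
    toy_map = build_hash_map(bucket_count=65_536, amp_count=4)
    print(owning_amp(0xABCD1234, toy_map))
```

Because every row with the same row hash falls into the same bucket, the lookup always routes a given primary index value to the same AMP for as long as the hash map is unchanged.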

5 The message is inserted into an AMP step, and the Dispatcher places it on the BYNET and sends it point-to-point to the AMP identified by the hash map.

6 The file system on the receiving AMP uses the table ID and row hash value as a key to scan its master index for the cylinder number that contains the data block in which the row is stored.

7 The file system first determines whether the cylinder index is cached.

    | WHEN the cylinder index is … | THEN the file system … |
    |---|---|
    | cached | scans it for the data block. |
    | not cached | retrieves it from disk and scans it for the data block (see stage 8). |

8 The file system uses the table ID, row hash value, and cylinder number as a key to scan the cylinder index for the data block address known to contain the row being retrieved.

9 The file system determines whether the data block is cached.

    | WHEN the data block is … | THEN the file system … |
    |---|---|
    | cached | scans it for the row. |
    | not cached | retrieves it from disk and scans it for the row (see stage 10). |

10 The AMP uses the row hash value and primary index value as a key to scan the data block for the row being retrieved. The AMP checks whether the row contains the desired values and, if it does, returns the values for column_1 and column_2 to the requestor.
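The AMP-side portion of the process (stages 6 through 10) can be sketched as a toy Python model. Everything here is a simplifying assumption made for illustration: the class `ToyAmp`, the use of plain dictionaries for the master index and cylinder indexes, and the `disk_reads` counter are hypothetical and do not reflect the actual file system structures. The sketch only shows the order of the lookups and the cached/not cached decision.

```python
class ToyAmp:
    """Toy model of one AMP's lookup path: master index, cylinder index, data block."""

    def __init__(self, master_index, disk_cylinder_indexes, disk_blocks):
        self.master_index = master_index                    # (table_id, row_hash) -> cylinder number
        self.disk_cylinder_indexes = disk_cylinder_indexes  # cylinder number -> {(table_id, row_hash): block address}
        self.disk_blocks = disk_blocks                      # block address -> [(row_hash, pi_value, row), ...]
        self.cylinder_index_cache = {}
        self.block_cache = {}
        self.disk_reads = 0

    def _cached_or_read(self, cache, disk, key):
        # Stages 7 and 9: use the cached copy if present, else read it from "disk".
        if key not in cache:
            cache[key] = disk[key]
            self.disk_reads += 1
        return cache[key]

    def locate(self, table_id, row_hash, pi_value):
        # Stage 6: master index gives the cylinder number.
        cylinder = self.master_index.get((table_id, row_hash))
        if cylinder is None:
            return None
        # Stages 7 and 8: cylinder index gives the data block address.
        cylinder_index = self._cached_or_read(
            self.cylinder_index_cache, self.disk_cylinder_indexes, cylinder)
        block_address = cylinder_index.get((table_id, row_hash))
        if block_address is None:
            return None
        # Stages 9 and 10: scan the data block on row hash and primary index value.
        block = self._cached_or_read(self.block_cache, self.disk_blocks, block_address)
        for stored_hash, stored_pi, row in block:
            if stored_hash == row_hash and stored_pi == pi_value:
                return row
        return None


if __name__ == "__main__":
    amp = ToyAmp(
        master_index={(1, 0xABCD1234): 7},
        disk_cylinder_indexes={7: {(1, 0xABCD1234): 42}},
        disk_blocks={42: [(0xABCD1234, "other", {"column_1": 0, "column_2": 0}),
                          (0xABCD1234, "wanted", {"column_1": 10, "column_2": 20})]},
    )
    print(amp.locate(1, 0xABCD1234, "wanted"), amp.disk_reads)
    print(amp.locate(1, 0xABCD1234, "wanted"), amp.disk_reads)
```

The final scan compares the primary index value as well as the row hash because different values can hash to the same row hash; in the example, two rows share a hash and only the one with the matching value is returned. A second request for the same row finds both the cylinder index and the data block cached, so the read counter does not increase.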