15.00 - Equation For Sizing a Structured UDT - Teradata Database

Teradata Database Design

prodname: Teradata Database
vrm_release: 15.00
category: User Guide
featnum: B035-1094-015K

Equation For Sizing a Structured UDT

Because the size of a structured UDT value is not an arithmetic sum of the size of its individual attributes, it presents a special problem for capacity planning.

The storage footprint for a structured UDT is composed of the following components in the order given:

1. The TVMID type identifier for the UDT.

2. A variably sized attribute presence bit array with one bit per attribute, rounded up to the next 8‑bit boundary.

3. A serialized list of the non-null attribute value sizes.

Equation: Structured UDT Size for Packed64 Format System

The equation for determining the size of a structured UDT column for a packed64 format system is as follows:
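Working from the element definitions listed under "where:" below, the packed64 size can be written as the following sum. This is a reconstruction from those definitions, not a verbatim copy of the equation. The ceiling term reflects the rule that the presence bit array is rounded up to a whole number of bytes, and nested structured attributes contribute their own TVMID and presence bits through the attribute terms.

```latex
\[
\begin{aligned}
\text{UDT\_size} ={}& \text{TVMID\_size}
  + \left\lceil \frac{\text{number\_of\_attributes}}{8} \right\rceil \\
 &+ \text{fixed\_size\_non-null\_attributes}
  + \text{variable\_size\_non-null\_attributes} \\
 &+ \text{non-null\_LOB\_attribute\_OIDs}
\end{aligned}
\]
```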

Equation: Structured UDT Size for an Aligned Row System

The equation for determining the size of a structured UDT column for an aligned row system is as follows:
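Read together with the MOD(8) definition below, the aligned row size appears to be the packed64 size rounded up to the next multiple of 8 bytes. The form below is a reconstruction under that reading. The subscripted names are labels introduced here for the two results and do not appear in the original.

```latex
\[
\text{UDT\_size}_{\text{aligned}}
  = 8 \left\lceil \frac{\text{UDT\_size}_{\text{packed64}}}{8} \right\rceil
\]
```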

where:

 

Equation element …

Represents the …

UDT_size

total size in bytes of the structured UDT.

MOD(8)

modulo(8) factor required to align the structured UDT on an 8‑byte boundary.

TVMID_size

size in bytes of the DBC.TVMID for structured UDTs.

This is fixed at 6 bytes.

There is a DBC.TVMID size for each nested level of a structured UDT.

number_of_attributes

number of attributes in the structured UDT.

The value is rounded up to the next multiple of 8 so that the Presence Bits Array occupies a whole number of 8‑bit bytes.

This factor, which accounts for the Presence Bits Array, applies to each nested level of a structured UDT.

fixed_size_non‑null_attributes

total size in bytes of all the fixed‑size non‑null attribute values in the structured UDT.

The set of fixed size attributes is composed of the following elements:

  • Fixed‑size predefined data types such as INTEGER, DECIMAL, and CHARACTER.
  • Fixed‑size UDT types based on fixed‑size predefined data types.
You must also account for nested attributes in this calculation, each level of which carries its own DBC.TVMID_size and Presence Bits Array.

variable_size_non-null_attributes

    total size in bytes of all the variable‑length non‑null attribute values in the structured UDT.

    Note that the calculation includes a 2‑byte length indicator for each variable‑length data type attribute in the UDT.

    For example, suppose you store the following character string in an attribute defined with the VARCHAR(20) predefined data type:

       sorry about that

    This character string is stored in the following format:

       { 2‑byte length indicator, attribute value }

    Given this information, you can see that the example string ‘sorry about that’ would be stored as follows:

       { 16, ‘sorry about that’ }

    where:

  • 16 is the length in bytes of the stored value of the variable‑length string: 14 lowercase Latin letters plus 2 space characters.
  • sorry about that is the value of that string.

    The set of variable‑size attributes is composed of the following elements:

  • Variably‑sized predefined data types such as VARCHAR and VARBYTE.
  • Variably‑sized UDT types based on variably‑sized predefined data types.
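The length-indicator rule described for variable-length attributes can be sketched in Python. This is a minimal illustration that assumes one byte per character (as with a single-byte Latin character set). The function name `varchar_stored_size` is hypothetical and is not part of any Teradata API.

```python
LENGTH_INDICATOR_BYTES = 2  # 2-byte length indicator per variable-length attribute


def varchar_stored_size(value: str, bytes_per_char: int = 1) -> int:
    """Bytes taken inside a structured UDT by one non-null variable-length value.

    bytes_per_char=1 is an assumption (single-byte Latin characters).
    """
    return LENGTH_INDICATOR_BYTES + len(value) * bytes_per_char


value = "sorry about that"
print(len(value))                  # length recorded in the indicator: 16
print(varchar_stored_size(value))  # indicator plus value bytes: 18
```

Only the bytes actually stored count toward the size, so the declared maximum of VARCHAR(20) does not enter the calculation.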
    non‑null_LOB_attribute_OIDs

    total size in bytes of all the non‑null OID references to values stored in BLOB, CLOB, and XML subtables.

    BLOB, CLOB, and XML values are never stored in the row. Instead, the system stores a 40‑byte pointer to each BLOB, CLOB, or XML column value called an object identifier (OID). The LOB value pointed to is stored within a BLOB, CLOB, or XML subtable associated with that column of the table. See “Object Identifier Columns” on page 820 and “Sizing a LOB or XML Subtable” on page 861 for more information about OIDs and BLOB, CLOB, and XML subtables.

    The set of BLOB, CLOB, and XML attribute OIDs is composed of the following elements:

  • OIDs for attribute values having BLOB, CLOB, or XML types.
  • OIDs for attribute values having UDT types based on BLOB, CLOB, or XML data types.
    Each OID for a BLOB, CLOB, or XML column has a size of 40 bytes.

    Because the metadata within a structured UDT is byte‑packed (that is, not aligned on an 8‑byte boundary) on both packed64 and aligned row format systems, it has no effect on the stored size of a given structured UDT. Teradata Database automatically copies structured UDT values into properly aligned storage locations in memory whenever they are used by a system with a 64‑bit byte‑aligned format architecture. The entire structured UDT, however, is aligned on an 8‑byte boundary on aligned row format systems.
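The element definitions above can be combined into a small recursive calculator. The following Python sketch is one possible reading of those definitions, not Teradata code: the class `StructuredUdt` and the functions `packed64_size` and `aligned_size` are hypothetical names, and the caller is assumed to supply the byte sizes of the non-null attribute values.

```python
from dataclasses import dataclass, field
from typing import List

TVMID_SIZE = 6        # bytes, once per nesting level
LENGTH_INDICATOR = 2  # bytes per non-null variable-length attribute
OID_SIZE = 40         # bytes per non-null BLOB, CLOB, or XML attribute


@dataclass
class StructuredUdt:
    """Hypothetical description of one structured UDT value (one nesting level)."""
    number_of_attributes: int
    fixed_sizes: List[int] = field(default_factory=list)       # bytes of each non-null fixed-size value
    variable_lengths: List[int] = field(default_factory=list)  # data bytes of each non-null variable-length value
    lob_count: int = 0                                          # non-null LOB or XML attributes
    nested: List["StructuredUdt"] = field(default_factory=list) # non-null structured UDT attributes


def packed64_size(udt: StructuredUdt) -> int:
    """Size in bytes on a packed64 format system."""
    presence_bytes = (udt.number_of_attributes + 7) // 8  # one bit per attribute, rounded up to a byte
    size = TVMID_SIZE + presence_bytes
    size += sum(udt.fixed_sizes)
    size += sum(LENGTH_INDICATOR + n for n in udt.variable_lengths)
    size += OID_SIZE * udt.lob_count
    # Each nested level carries its own TVMID and Presence Bits Array.
    size += sum(packed64_size(child) for child in udt.nested)
    return size


def aligned_size(udt: StructuredUdt) -> int:
    """Size on an aligned row system: the whole UDT rounded up to 8 bytes."""
    return (packed64_size(udt) + 7) // 8 * 8


# One INTEGER plus one VARCHAR holding 10 bytes of data.
example = StructuredUdt(number_of_attributes=2, fixed_sizes=[4], variable_lengths=[10])
print(packed64_size(example), aligned_size(example))
```

Null attributes are simply left out of the size lists, because only their presence bit is stored. The rounding to an 8-byte boundary is applied once, to the outermost value, since the metadata inside the UDT is byte-packed.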

    Example  

    What follows is a simple example of calculating the storage requirements of a two‑level structured UDT on a packed64 format system:

    Suppose you have the following nested structured type.

     

    Level   Number of Attributes at the Level   Data Types of the Attributes
    -----   ---------------------------------   ----------------------------
    0       5                                   INTEGER, INTEGER, INTEGER, INTEGER, Structured UDT
    1       3                                   INTEGER, INTEGER, INTEGER

    Nothing is stored to represent a null attribute other than a bit in the Presence Bits Array, so there is no difference between the storage of a compressed null and an uncompressed null. With respect to compression, there is therefore only one storage state for nulls, and that state is referred to as compressed. Because the representation of the two states is identical, it is often said that nulls are compressed by default, but this is somewhat misleading.

    Assuming there are no null attributes, here is what is stored for a value having this structured type:

  • 6 bytes for the TVMID for level 0, stored in INTEGER format.
  • 1 byte (8 bits) for the Presence Bits Array for level 0, which contains 5 presence bits for the 5 attributes at level 0 (all set to 1) and 3 unused presence bits (all set to 0).
  • 4 bytes for INTEGER value 1 at level 0.
  • 4 bytes for INTEGER value 2 at level 0.
  • 4 bytes for INTEGER value 3 at level 0.
  • 4 bytes for INTEGER value 4 at level 0.
  • The size in bytes of the level 1 structured UDT, which is composed of the following:
      • 6 bytes for the TVMID for level 1, stored in INTEGER format.
      • 1 byte for the Presence Bits Array for level 1, which contains 3 presence bits for the 3 attributes at level 1 (all set to 1) and 5 unused presence bits (all set to 0).
      • 4 bytes for INTEGER value 1 at level 1.
      • 4 bytes for INTEGER value 2 at level 1.
      • 4 bytes for INTEGER value 3 at level 1.

    Summing these components gives 19 bytes for the level 1 UDT and 42 bytes for the entire value.

    Note that there is overhead of a TVMID value and a Presence Bits Array for each nesting level in a structured UDT.

    For an aligned row format system, round the size calculated for a packed64 format system up to the next multiple of 8 bytes to align the column on an 8‑byte boundary.
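The tally for this two-level example can be checked with a few lines of Python. The constant names are illustrative only. The byte counts are the ones listed in the example, and the last step rounds the packed64 result up to an 8-byte boundary for an aligned row system.

```python
TVMID = 6     # bytes per nesting level
PRESENCE = 1  # bytes: up to 8 attributes fit in one presence byte
INTEGER = 4   # bytes per INTEGER value

level_1 = TVMID + PRESENCE + 3 * INTEGER            # nested structured UDT
level_0 = TVMID + PRESENCE + 4 * INTEGER + level_1  # outer UDT, including the nested value

aligned = (level_0 + 7) // 8 * 8  # round up to the next multiple of 8

print(level_1, level_0, aligned)  # 19 42 48
```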

    Example  

    Now consider the following more concrete example, again for a packed64 system.

    Suppose you create two structured types, one of which is an attribute of the other, as follows:

         CREATE TYPE name_udt AS (
           first_name VARCHAR(20),
           last_name  VARCHAR(20));

         CREATE TYPE address_udt AS (
           street  VARCHAR(20),
           city    VARCHAR(20),
           zipcode INTEGER,
           name    name_udt);

    For the sake of this example, assume the TypeID for name_udt is 33 and the TypeID for address_udt is 999.

    Suppose you insert the following data into a column typed as address_udt:

         INSERT INTO test_table
         VALUES (NEW address_udt().street('Apple Tree Way')
                                 .city('Washington D.C.')
                                 .zipcode(10776)
                                 .name(NEW name_udt().first_name('Abraham')
                                                     .last_name('Lincoln')));

    The nested structured type address_udt, which nests the UDT name_udt as one of its attributes, is stored as indicated by the following graphic.

    Following “Equation: Structured UDT Size for Packed64 Format System” on page 795, the size of any value having the address_udt type is calculated as follows:

    For an aligned row system, round the size calculated for a packed64 format system up to the next multiple of 8 bytes to align the column on an 8‑byte boundary.
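The calculation for the address_udt value inserted above can be sketched as follows. This is a worked tally under stated assumptions, not a reproduction of the original calculation: it assumes one byte per character for the VARCHAR values and uses the 6-byte TVMID, 2-byte length indicator, and 4-byte INTEGER sizes given earlier. The helper names `varchar` and `presence_bytes` are hypothetical. The TypeID values 33 and 999 determine what is stored in each TVMID field, not how large it is.

```python
TVMID = 6             # bytes per nesting level
LENGTH_INDICATOR = 2  # bytes per non-null variable-length attribute
INTEGER = 4           # bytes per INTEGER value


def varchar(value: str) -> int:
    # Assumption: one byte per character (single-byte Latin characters).
    return LENGTH_INDICATOR + len(value)


def presence_bytes(number_of_attributes: int) -> int:
    # One bit per attribute, rounded up to a whole byte.
    return (number_of_attributes + 7) // 8


# Level 1: name_udt has 2 attributes.
name_udt = TVMID + presence_bytes(2) + varchar("Abraham") + varchar("Lincoln")

# Level 0: address_udt has 4 attributes, one of which is the nested name_udt.
address_udt = (TVMID + presence_bytes(4)
               + varchar("Apple Tree Way")
               + varchar("Washington D.C.")
               + INTEGER
               + name_udt)

aligned = (address_udt + 7) // 8 * 8  # aligned row system: next multiple of 8

print(name_udt, address_udt, aligned)  # 25 69 72
```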