Sizing Structured UDT Columns | Database Design | Teradata Vantage - Sizing Structured UDT Columns - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
17.00
Published
June 2020
Language
English (United States)
Last Update
2021-01-22
dita:mapPath
qby1588121512748.ditamap
dita:ditavalPath
lze1555437562152.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantage™

For a distinct UDT, there is always a 1:1 correspondence between the type and its underlying predefined data type. As a result, to determine the size of a distinct UDT column, just use the size of the underlying predefined data type. Because of alignment differences for packed64 and aligned row systems, be sure to consult the size differences for the various data types on the two system types before making your row size estimates (see Data Type Size Differences For Packed64 and Aligned Row Format Architectures).

Unlike predefined data types, the size of each structured UDT you create is different, and you must calculate that size to account for the amount of storage a column with that type requires.

You cannot use either multivalue compression or algorithmic compression for structured UDT column data.

This topic first describes how a structured UDT is stored in a row, then describes how to calculate the number of bytes required to store a given structured UDT in persistent form.

The dynamic UDT data type is a structured UDT with a fixed data type name of VARIANT_TYPE, so you can calculate the sizes of VARIANT_TYPE instances in the same way you would any other structured UDT.

Storage Structure of a Structured UDT Data Type

The following graphic illustrates the basic morphology of a structured UDT as it is stored in the row of a table in the database:


Basic morphology of a structured UDT

where:

UDT element … Specifies …
Type ID the TVMID type identifier for the structured UDT.

Its data type is INTEGER (4 bytes).

Presence Bits Array the octet array of presence bits for the UDT at level m of the structured UDT.

This is a variably-sized bit array, rounded to the higher modulo(8) boundary, whose size depends on the number of attributes stored for a particular structured UDT. Its size increases in octet (8-bit byte) increments.

Attribute n Value attribute number n for the structured UDT.

There is a variable number of attribute value fields. The exact number depends on two factors:

  • The number of attributes a particular structured type has.
  • Whether or not an attribute is null.

The exact format of each attribute depends on the attribute type and is defined by the designer of the given structured type.

Storage Structure of a Nested Structured Data Type

A structured UDT can be built from attributes that are themselves structured UDTs. This topic explains the persistent storage format for a column having such a type.

The following graphic illustrates the basic morphology of a structured UDT that contains nested structured UDT attributes as it is stored in the row of a table in the database:


Structured UDT with nested structured UDT attributes

where:

UDT element … Specifies …
Type m ID the TVMID type identifier for the level m structured UDT.

The value of m is 0 for the highest level attribute set in a structured UDT, ranging to 511 for the lowest possible level attribute set in a multilevel nested structured UDT.

Structured UDT attributes can be nested through multiple levels. The lowest level is numbered 511 because there can be 512 attribute nesting levels in a structured UDT, ranging from 0 - 511.

Presence Bits Array m the octet array of presence bits for the UDT at level m of the structured UDT.

This is a variably-sized bit array, rounded to the higher modulo(8) boundary, whose size depends on the number of attributes stored for a particular structured UDT. Its size increases in octet (8-bit byte) increments.

Attribute n m Value attribute number n for the level m structured UDT component.

The number of attribute value columns is variable. The exact number depends on two factors:

  • The number of attributes a particular structured type has.
  • Whether or not an attribute is null.

The exact format of each attribute depends on the attribute type and is defined by the designer of the given structured type.

About Sizing a Structured UDT

Because the size of a structured UDT value is not an arithmetic sum of the size of its individual attributes, it presents a special problem for capacity planning.

The storage footprint for a structured UDT is composed of the following components in the order given:

  1. The TVMId Type Identifier for the UDT.
  2. A variably sized attribute presence bit array with one bit per attribute, rounded up to the nearest 8-bit boundary.
  3. A serialized list of the non-null attribute value sizes.

Example 1: Calculating the Storage Requirements of a Two-Level Structured UDT on a Packed64 Format System

What follows is a simple example of calculating the storage requirements of a two-level structured UDT on a packed64 format system:

Suppose you have the following nested structured type.

Level Number of Attributes At The Level Data Types of the Attributes
0 5
  • INTEGER
  • INTEGER
  • INTEGER
  • INTEGER
  • Structured UDT
1 3
  • INTEGER
  • INTEGER
  • INTEGER
Here is what is stored for a value having this structured type, assuming there are no null attributes. Nothing is stored to represent a null attribute other than a bit in the Presence Bits Array, which means that there is no difference in the storage of a compressed null and an uncompressed null. This means that with respect to compression, there is only one storage state for nulls, and that state is referred to as compressed. Because the representation of the states is identical, it is often said that nulls are compressed by default, but this is somewhat misleading.
  • 6 bytes for the TVMID for level 0, stored in INTEGER format.
  • 1 octet (8-bit byte) byte for the Presence Bits Array for level 0, which contains 5 presence bits for the 5 attributes at level 0 (all set to 1) and 3 unused presence bits (all set to 0).
  • 4 bytes for INTEGER value 1 at level 0.
  • 4 bytes for INTEGER value 2 at level 0.
  • 4 bytes for INTEGER value 3 at level 0.
  • 4 bytes for INTEGER value 4 at level 0.
  • Size in bytes of the level 1 structured UDT, which is:
    • 6 bytes for the TVMID for level 1, stored in INTEGER format.
    • 1 byte for the Presence Bits Array for level 1, which contains 3 presence bits for the 3 attributes at level 1 (all set to 1) and 5 unused presence bits (all set to 0).
    • 4 bytes for INTEGER value 1 at level 1.
    • 4 bytes for INTEGER value 2 at level 1.
    • 4 bytes for INTEGER value 3 at level 1.

      Note that there is overhead of a TVMID value and a Presence Bits Array for each nesting level in a structured UDT.

      For an aligned row format system, you just take the calculated size for a packed64 format system modulo(8) to align the column on an 8-byte boundary.

Example 2: Packed64 System

Now consider the following more concrete example, again for a packed64 system.

Suppose you create two structured types, one of which is an attribute of the other, as follows:

CREATE TYPE name_udt AS (
  first_name VARCHAR(20),
  last_name  VARCHAR(20));

CREATE TYPE address_udt AS (
  street  VARCHAR(20),
  city    VARCHAR(20),
  zipcode INTEGER,
  name    NameUdt);

For the sake of this example, assume the TypeID for name_udt is 33 and the TypeID for address_udt is 999.

Suppose you insert the following data into a column typed as address_udt:

INSERT INTO test_table
VALUES (NEW address_udt().street(‘Apple Tree Way’)
                        .city(‘Washington D.C.’)
                        .zipcode(10776)
                        .name
       (NEW name_udt()   .first_name(‘Abraham’)
                        .last_name(‘Lincoln’)));

The nested structured type address_udt, which nests the UDT name_udt as one of its attributes, is stored as indicated by the following graphic.


Nested structure with attributes