Database Design for an Object File System - Teradata VantageCloud Lake

Lake - Database Reference

Deployment: VantageCloud
Edition: Lake
Product: Teradata VantageCloud Lake
Published: February 2025
Last edition: 2025-11-21

When designing a table for the Object File System on a VantageCloud Lake system, keep in mind the considerations and tradeoffs described in Object File System Considerations, and pay attention to the following items:

  • An Object File System table has no primary index (NoPI), does not support a partitioned primary index, and is always defined as multiset.

    The Object File System DDL supports an ORDER BY clause that provides functionality in the file system similar to that of a partitioned primary index.

  • There is less need for secondary indexes:
    • Use of Parquet (a columnar format) in the Object File System:

      Columnar format offers the ability to partition a table by column. In its simplest form, each column in the table becomes its own column partition. Values from the same column partition, taken from multiple logical rows, are packed together into a physical space where they can be easily accessed together. A column partition in a columnar table offers advantages similar to those of a secondary index.

    • Internal indexes:

      When a table is loaded into the Object File System, metadata structures similar to indexes are generated to capture the range of values for specified columns in each object. There may be many such metadata structures for a given Object File System table. These system-defined indexes can be used to speed up access, replacing the need for additional user-defined indexes.

  • Although multiple-table join indexes are not supported on Object File System tables, single-table join indexes (STJI) can be created on an Object File System table. See Join Index.
  • No need for compression.
  • See Data Storage for information about the data placement options (Block Storage on the primary cluster or the Object File System) and how to balance performance and cost across storage types.
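
The interaction between ordered loading and per-object value ranges can be illustrated with a small Python model. This is a conceptual sketch only, not the Object File System implementation: the names `StoredObject`, `load`, and `scan`, the object size, and the sample data are all hypothetical. It assumes each stored object records the minimum and maximum value of one column, and that a scan skips any object whose range cannot contain qualifying rows.

```python
from dataclasses import dataclass

@dataclass
class StoredObject:
    """Hypothetical stand-in for one stored object plus its range metadata."""
    rows: list
    min_val: int
    max_val: int

def load(rows, column, rows_per_object, order_by=False):
    """Split rows into objects, recording the min/max of `column` per object."""
    if order_by:
        # Assumption: ordering on load clusters similar values together.
        rows = sorted(rows, key=lambda r: r[column])
    objects = []
    for i in range(0, len(rows), rows_per_object):
        chunk = rows[i:i + rows_per_object]
        values = [r[column] for r in chunk]
        objects.append(StoredObject(chunk, min(values), max(values)))
    return objects

def scan(objects, column, low, high):
    """Return matching rows and the number of objects that had to be read."""
    objects_read = 0
    hits = []
    for obj in objects:
        # Skip objects whose recorded range cannot overlap the predicate.
        if obj.max_val < low or obj.min_val > high:
            continue
        objects_read += 1
        hits.extend(r for r in obj.rows if low <= r[column] <= high)
    return hits, objects_read

# Sample data: 20 rows whose "day" values arrive in scattered order.
rows = [{"order_id": i, "day": (i * 7) % 20} for i in range(20)]

unordered_hits, unordered_read = scan(load(rows, "day", 5), "day", 5, 7)
ordered_hits, ordered_read = scan(load(rows, "day", 5, order_by=True), "day", 5, 7)

print("objects read without ordering:", unordered_read)
print("objects read with ordering:   ", ordered_read)
```

In this toy model both scans return the same rows, but the ordered load lets the range metadata exclude most objects, which is the kind of benefit a partitioned primary index or a secondary index would otherwise provide.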

Row-Based Format and Column-Based Format

Consider the following tradeoffs when choosing between row-based format (Teradata binary row format) and column-based format (Parquet-based):
  • Compression: columnar format typically achieves greater compression than row format.
  • Partitioning: more granular partitioning is possible with columnar format, leading to more effective partition elimination.
  • Updates: row format is a better choice for row-at-a-time updates.
  • Tactical (PI access) applications: these applications perform more efficiently when the table is in row format.
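
The compression and update tradeoffs can be made concrete with a small Python model. This is an illustrative sketch under stated assumptions, not a description of how Teradata binary row format or Parquet is implemented: the sample table, the helper names `to_columns` and `run_length_encode`, and the use of simple run-length encoding as a stand-in for compression are all hypothetical choices.

```python
from itertools import groupby

# Hypothetical sample table: (order_id, region, status).
rows = [
    (1, "EU", "open"),
    (2, "EU", "open"),
    (3, "EU", "closed"),
    (4, "US", "closed"),
    (5, "US", "closed"),
    (6, "US", "closed"),
]

def to_columns(rows):
    """Pivot row tuples into one list per column (one 'column partition' each)."""
    return [list(col) for col in zip(*rows)]

def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

columns = to_columns(rows)

# Column layout: values of one column sit together, so repeats form long runs.
region_runs = run_length_encode(columns[1])
status_runs = run_length_encode(columns[2])

# Row layout: values of different columns are interleaved, so runs rarely form.
row_stream = [value for row in rows for value in row]
row_runs = run_length_encode(row_stream)

# Updating one logical row touches one stored row in the row layout,
# but one position in every column partition in the column layout.
row_layout_writes = 1
column_layout_writes = len(columns)

print("region runs:", region_runs)
print("status runs:", status_runs)
print("runs in row layout:", len(row_runs), "of", len(row_stream), "values")
print("writes per row update (row vs column):", row_layout_writes, column_layout_writes)
```

The model shows why packing values from the same column together tends to compress well, and why a single-row update or lookup is cheaper when all of a row's values are stored in one place.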