Overview | CollectStats | Teradata OTF - Overview - Teradata Vantage

Teradata® Open Table Format for Apache Iceberg and Delta Lake User Guide

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Lake
Product
Teradata Vantage
Release Number
20.00
Published
October 2025
ft:locale
en-US
ft:lastEdition
2025-10-25
dita:mapPath
qrj1749167830193.ditamap
dita:ditavalPath
lli1749584660955.ditaval
dita:id
bsr1702324250454

In OTF, statistics on a dataset of an OTF column are stored in a Metadata JSON file for query optimization.

However, this standard approach has key limitations compared to Teradata’s statistics mechanisms:

  • No Histogram Support: Only scalar stats (for example, null count, distinct values) are collected, whereas Teradata supports histograms.
  • No Expression-Based Stats: Stats can be collected on columns or column combinations, but not on expressions (for example, (temperature-32)/1.8), which Teradata supports.
  • Limited Stat Types: The data sketch only provides unique value counts. Teradata collects additional stats like high mode frequency, partial null counts, and unique values on partial-null entries.

This feature builds on Teradata’s existing infrastructure, aligning with industry standards. The key focus areas include:

  • Teradata Syntax Support: Commands like COLLECT STATS, DROP STATS, HELP STATS, and SHOW STATS now support OTF stats. However, using the AS clause to name stats entries is not allowed.
  • Histogram Collection: Supports column-based histograms and stats like high mode frequency, similar to Teradata’s native capabilities. Expression-based stats are not supported due to naming and API limitations.
  • Optimizer Behavior: Optimizer first checks StatsTbl, then falls back to Metadata JSON - allowing use of vendor-generated OTF stats.