Schema Evolution - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Last edition: 2024-12-11

The Native Object Store (NOS) Schema Evolution feature lets a foreign table schema evolve along with structural changes to the underlying Parquet files. You can add, update, or remove columns at any position in the file without rebuilding the foreign table's Data Definition Language (DDL) from scratch. To carry a Parquet schema change into the DDL, use ALTER FOREIGN TABLE.

The feature applies to both Parquet files and foreign table DDLs. It supports column sequencing changes (re-ordering, re-positioning, and removing columns) when data is added to existing Parquet files.

Behavior Allowed in Parquet Files with and without Schema Evolution

| Behavior | Index-Based Approach (without Schema Evolution) | Column Name-Based Approach (with Schema Evolution) |
| --- | --- | --- |
| Re-order columns | Re-ordering columns does not give correct results. | Re-ordering columns gives correct results. |
| Add columns at the beginning or middle of the file | Not allowed. Columns can be added only at the end. | Columns can be added at any position. |
| Remove columns from the beginning or middle of the file | Not allowed. Columns can be removed only from the end. | Columns can be removed from any position. |
| Case-sensitive column names, e.g., EMPID, empid, EmpId | Any kind of duplicate is allowed. | Only duplicates that differ in case are allowed. |
| Exact duplicate names | Any kind of duplicate is allowed. | Not supported, because processing exact duplicate names would be ambiguous. |
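
The difference between the two approaches in the table above can be illustrated with a small conceptual model. This is a sketch only, not Teradata's implementation: the helper names `read_by_index` and `read_by_name` and the sample columns are hypothetical. It shows why re-ordering columns in the file breaks positional matching but not name matching.

```python
def read_by_index(ddl_columns, row):
    """Index-based model: the i-th DDL column takes the i-th value in the file row."""
    return {name: row[i] for i, name in enumerate(ddl_columns)}


def read_by_name(ddl_columns, file_columns, row):
    """Name-based model: each DDL column is looked up by its exact (case-sensitive) name."""
    position = {name: i for i, name in enumerate(file_columns)}
    return {name: row[position[name]] for name in ddl_columns}


# Hypothetical foreign table definition and a Parquet file whose columns were re-ordered.
ddl_columns = ["empid", "name", "salary"]
file_columns = ["salary", "empid", "name"]
row = [5000, 101, "Ana"]

by_index = read_by_index(ddl_columns, row)
by_name = read_by_name(ddl_columns, file_columns, row)

print(by_index)  # values land in the wrong columns
print(by_name)   # values follow their column names
```

In this model the index-based read assigns the salary value to `empid`, while the name-based read returns each value under its own column regardless of position.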

Behavior Allowed in Foreign Table Definition with and without Schema Evolution

| Behavior | Index-Based Approach (without Schema Evolution) | Column Name-Based Approach (with Schema Evolution) |
| --- | --- | --- |
| Add new column introduced in the Parquet file | Only columns added at the end of the Parquet file can be added. | The column can be at any position in the Parquet file. In the DDL, the column is added at the end. |
| Remove unwanted column from the Parquet file | Only columns at the end of the Parquet file can be removed. | Columns can be removed from any position in the Parquet file. |
| User-created column | You can create a number of columns equal to or less than the number of columns in the actual Parquet file. Each column must have a valid data type that matches the corresponding column in the Parquet file; otherwise an error is thrown. | SELECT returns output for columns with valid names and data types. You can create a maximum of 2048 columns in the DDL. No validation of the column name or data type is performed. |
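
The name-based rows of the table above can be sketched as a simple model of a foreign table's column list. This is an illustrative assumption, not Teradata syntax or behavior: `add_column`, `drop_column`, and `select_row` are hypothetical helpers, and only the 2048-column limit is taken from the text. The sketch shows a column that appears in the middle of the file being appended at the end of the DDL, and a column being dropped from any position.

```python
MAX_DDL_COLUMNS = 2048  # maximum number of DDL columns stated in the documentation


def add_column(ddl_columns, new_column):
    """Return a new DDL column list with new_column appended at the end."""
    if len(ddl_columns) >= MAX_DDL_COLUMNS:
        raise ValueError("DDL cannot have more than 2048 columns")
    return ddl_columns + [new_column]


def drop_column(ddl_columns, column):
    """Return a new DDL column list without the given column, wherever it was."""
    return [c for c in ddl_columns if c != column]


def select_row(ddl_columns, file_columns, row):
    """Name-based lookup: fetch each DDL column from the file row by exact name."""
    position = {name: i for i, name in enumerate(file_columns)}
    return {name: row[position[name]] for name in ddl_columns}


# Hypothetical case: the file gains a "dept" column in the middle.
file_columns = ["empid", "dept", "name"]
row = [101, "R&D", "Ana"]

evolved_ddl = add_column(["empid", "name"], "dept")   # "dept" goes to the end of the DDL
selected = select_row(evolved_ddl, file_columns, row)
trimmed_ddl = drop_column(evolved_ddl, "name")        # remove a column from the middle

print(evolved_ddl)
print(selected)
print(trimmed_ddl)
```

The model does not cover what SELECT returns for a DDL column whose name or data type does not match the file, because the text does not specify it.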

Limitation

Parquet files cannot contain exact duplicate column names. Column names are case-sensitive, so names that differ only in case are treated as distinct columns. For example, a file can contain the columns EMPID, empid, and EmpID, but not two columns both named EMPID.
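
A file's column names can be checked against this limitation before the file is exposed through a foreign table. The following is a minimal sketch that assumes the column names are already available as a list of strings; `exact_duplicates` is a hypothetical helper, not part of any Teradata tool.

```python
from collections import Counter


def exact_duplicates(column_names):
    """Return the names that occur more than once, comparing case-sensitively."""
    counts = Counter(column_names)
    return sorted(name for name, count in counts.items() if count > 1)


allowed = exact_duplicates(["EMPID", "empid", "EmpID"])       # differ only in case
rejected = exact_duplicates(["EMPID", "EMPID", "empid"])      # EMPID appears twice

print(allowed)
print(rejected)
```

An empty result means the names satisfy the limitation; any name in the result is an exact duplicate that would be ambiguous under the column name-based approach.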