This example shows how to create a foreign table for accessing Parquet data. The table definition includes an authorization object for accessing the remote repository containing the river flow Parquet files.
This statement creates an authorization object called DefAuth_S3 with definer trusted credentials.
CREATE AUTHORIZATION DefAuth_S3 AS DEFINER TRUSTED USER 'YOUR-ACCESS-KEY-ID' PASSWORD 'YOUR-SECRET-ACCESS-KEY';
The schema for the Parquet data looks like this:
message schema { optional double GageHeight2; optional double Flow; optional int64 site_no; optional binary datetime (UTF8); optional double Precipitation; optional double GageHeight; }
This statement creates the foreign table and specifies the authorization DefAuth_S3. This table definition defines a column with the corresponding Teradata data type for each Parquet logical data type in the river flow data. The column definitions must match the data formats and positions in the Parquet file. Parquet foreign tables are partitioned by column.
CREATE FOREIGN TABLE riverflow_parquet , EXTERNAL SECURITY DEFINER TRUSTED DefAuth_S3 ( Location VARCHAR(2048) CHARACTER SET UNICODE CASESPECIFIC , GageHeight2 DOUBLE PRECISION , Flow DOUBLE PRECISION , site_no BIGINT , datetime VARCHAR(16) CHARACTER SET UNICODE CASESPECIFIC , Precipitation DOUBLE PRECISION , GageHeight DOUBLE PRECISION ) USING ( LOCATION ('/S3/td-usgs.s3.amazonaws.com/PARQUETDATA') STOREDAS ('PARQUET') ) NO PRIMARY INDEX PARTITION BY COLUMN ;
This statement retrieves a row count, maximum, and minimum values from riverflow_parquet .
SEL COUNT(Flow) AS count_col , MAX(Flow) ( FORMAT '-ZZZZ9.99') AS max_col , MIN(Flow) ( FORMAT '-ZZZZ9.99') AS min_col FROM riverflow_parquet;
Result:
count_col max_col min_col ----------- --------- --------- 15406 1570.00 0.00