Write All Data from a Vantage Table to External Object Storage with Partitioning - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-04-03
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

The following example writes all the data from a native relational database table to external object storage. The data is partitioned (sorted) by the Analytics Database before being written, and the objects containing the data in external object storage are named to reflect their contents.

SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount
FROM WRITE_NOS (
 ON  ( SELECT * FROM RiverFlowPerm )
  PARTITION BY SiteNo ORDER BY SiteNo
 USING
  AUTHORIZATION(authorization_object)
  LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/')
  STOREDAS('PARQUET')
  NAMING('DISCRETE')
  MANIFESTFILE('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/flowdata_manifest.json')
  INCLUDE_ORDERING('TRUE')
  MAXOBJECTSIZE('4MB')
  COMPRESSION('GZIP')
) AS d 
ORDER BY AmpId;

Four rows from an example result set are presented here in a vertical layout (column names on the left) for a more efficient presentation:

     NodeId          33
      AmpId           0
   Sequence           1
 ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09497500/object_33_0_1.parquet
 ObjectSize       27682
RecordCount        2944

     NodeId          33
      AmpId           0
   Sequence           1
 ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09513780/object_33_0_1.parquet
 ObjectSize       19136
RecordCount        2941

     NodeId          33
      AmpId           0
   Sequence           1
 ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09424900/object_33_0_1.parquet
 ObjectSize       23174
RecordCount        2946

     NodeId          33
      AmpId           0
   Sequence           1
ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/flowdata_manifest.json
ObjectSize          513
RecordCount           ?

[...]
Important things in this example:
  • Each row of the result set describes one object written to external object storage that contains data from Vantage.
  • Each data object contains data from one partition created according to the PARTITION BY clause. There is one partition for each data collection site, that is, for each SiteNo column value.
  • NAMING('DISCRETE') causes the object names to include the partition name, the object names include the values for SiteNo. In the object name,YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09497500/object_33_0_1.parquet, "09497500" is the SiteNo of the data contained by this object.
  • Because the query included INCLUDE_ORDERING('TRUE'), the object data includes data from the ORDER BY column, the site number from which the data in that row was collected.
  • The MANIFESTFILE option causes a manifest file to be written to external object storage. The manifest file lists all objects that WRITE_NOS wrote.