Write All Data from a Vantage Table to External Storage with Partitioning - Analytics Database - Teradata Vantage

SQL Data Manipulation Language

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-04-05
dita:mapPath
pon1628111750298.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
esx1472246586715
lifecycle
latest
Product Category
Teradata Vantage™

The following example writes all the data from a native relational database table to external object store. The data is partitioned (sorted) by the Vantage Analytics Database before being written, and the objects containing the data in external storage are named to reflect the sorted data they contain.

SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount
FROM WRITE_NOS (
 ON  ( SELECT * FROM RiverFlowPerm )
  PARTITION BY SiteNo ORDER BY SiteNo
 USING
  AUTHORIZATION(authorization_object
  LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/')
  STOREDAS('PARQUET')
  NAMING('DISCRETE')
  MANIFESTFILE('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/flowdata_manifest.json')
  INCLUDE_ORDERING('TRUE')
  MAXOBJECTSIZE('4MB')
  COMPRESSION('GZIP')
) AS d 
ORDER BY AmpId;

Four rows from an example result set are presented here in a vertical layout (column names on the left) for a more efficient presentation:

     NodeId          33
      AmpId           0
   Sequence           1
 ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09497500/object_33_0_1.parquet
 ObjectSize       27682
RecordCount        2944

     NodeId          33
      AmpId           0
   Sequence           1
 ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09513780/object_33_0_1.parquet
 ObjectSize       19136
RecordCount        2941

     NodeId          33
      AmpId           0
   Sequence           1
 ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09424900/object_33_0_1.parquet
 ObjectSize       23174
RecordCount        2946

     NodeId          33
      AmpId           0
   Sequence           1
ObjectName YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/flowdata_manifest.json
ObjectSize          513
RecordCount           ?

[...]
Things to note in this example:
  • Each row of the result set describes one object written to external storage that contains data from Vantage.
  • Each data object contains data from one partition created according to the PARTITION BY clause. In this case there is one partition for each data collection site, that is, for each SiteNo column value.
  • NAMING('DISCRETE') causes the object names to include the partition name, in this case the object names include the values for SiteNo. In the object name,YOUR-STORAGE-ACCOUNT/RiverFlowPerm_PartitionBy/09497500/object_33_0_1.parquet, "09497500" is the SiteNo of the data contained by this object.
  • Because the query included INCLUDE_ORDERING('TRUE'), the object data includes data from the ORDER BY column, in this case the site number from which the data in that row was collected.
  • The MANIFESTFILE option causes a manifest file to be written to external storage. The manifest file lists all objects that WRITE_NOS wrote.