Write All Data to External Store with Partitioning | NOS | Teradata Vantage - Examples: Write All Rows from a Table to External Object Storage Using Partitioning - Analytics Database - Teradata Vantage

Teradata Vantage™ - Native Object Store Getting Started Guide - 17.20

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
ft:locale
en-US
ft:lastEdition
2024-10-04
dita:mapPath
tsq1628112323282.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
jjn1567647976698
lifecycle
latest
Product Category
Teradata Vantage
The following examples write all the data stored in a database table to external object storage. The data is partitioned (sorted) by Analytics Database before being written. The objects containing the data in external storage are named to reflect the sorted data they contain.

    Prerequisites

  1. If not already done, set up or obtain access to the external object storage where you want to write data. Follow the instructions from your external storage vendor.
  2. Example: Write All Rows to External Object Storage, Use PARTITION BY and NAMING('DISCRETE')

  3. Write all the rows from a database table to external object storage:
    SELECT * FROM WRITE_NOS (
    ON ( SELECT * FROM RiverFlowPerm )
    PARTITION BY site_no ORDER BY site_no
    USING
    LOCATION('YOUR-OBJECT-STORE-URI/RiverFlowPerm_Discrete/')
    AUTHORIZATION(MyAuthObj)
    STOREDAS('PARQUET')
    COMPRESSION('SNAPPY')
    NAMING('DISCRETE')
    INCLUDE_ORDERING('TRUE')
    ) AS d;

    Replace LOCATION with the URI to the external object storage where you want to write the data.

    Things to note in this example:
    • Each row of the result set describes one object written to external storage that contains data from Vantage.
    • Each object contains data from one partition created according to the PARTITION BY clause. In this case there is one partition for each data collection site, that is, for each SiteNo column value.
    • NAMING('DISCRETE') causes the object names to include the partition name, in this case the object names include the values for SiteNo. In the object name, /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09497500/object_33_0_1.parquet, "09497500" is the SiteNo of the data contained by this object.
    • Because the query included INCLUDE_ORDERING('TRUE'), the objects include data from the ORDER BY column(s), in this case the object data includes the site number from which the data in that row was collected.

    Your result will be similar to the following:

         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09497500/object_33_0_1.parquet
     ObjectSize                27682
    RecordCount                 2944
    
         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09513780/object_33_0_1.parquet
     ObjectSize                19136
    RecordCount                 2941
    
         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09424900/object_33_0_1.parquet
     ObjectSize                23174
    RecordCount                 2946
    
    [...]

    The output is displayed vertically for readability.