Write All Data to External Storage with Partitioning | NOS | Teradata Vantage - Examples: Write All Rows from a Table to External Storage Using Partitioning - Advanced SQL Engine - Teradata Database

Teradata Vantageā„¢ - Native Object Store Getting Started Guide

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
Published
January 2021
Language
English (United States)
Last Update
2021-01-22
dita:mapPath
zws1595641486108.ditamap
dita:ditavalPath
hoy1596145193032.ditaval
dita:id
B035-1214
lifecycle
previous
Product Category
Software
Teradata Vantage
The following examples write all the data stored in a database table to external object storage. The data is partitioned (sorted) by Advanced SQL Engine before being written. The objects containing the data in external storage are named to reflect the sorted data they contain.

    Prerequisites

  1. If not already done, set up an authorization to access external object store. See Setting Up Authorization for WRITE_NOS Examples.
  2. If not already done, set up a database table to extract data from. You can use one of your existing tables or set up a test table to use with the examples. See Setting Up a Test Table.
  3. If not already done, set up or obtain access to the external object store where you want to write data. Follow the instructions from your external storage vendor.
  4. Example: Write All Rows to External Object Store, Use PARTITION BY and NAMING('RANGE')

  5. Write all the rows from a database table to external object store:
    SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount
    FROM WRITE_NOS_FM (
         ON  ( SELECT * FROM RiverFlowPerm )
            PARTITION BY SiteNo ORDER BY SiteNo
         USING
            LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm/')
            STOREDAS('PARQUET')
            COMPRESSION('GZIP')
            NAMING('RANGE')
            INCLUDE_ORDERING('TRUE')
            MAXOBJECTSIZE('4MB')
    ) AS d 
    ORDER BY AmpId;

    Replace LOCATION with the URI to the external object store where you are writing the data.

    Things to note in this example:
    • Each row of the result set describes one object written to external storage that contains data from Advanced SQL Engine.
    • Each object contains data from one partition created according to the PARTITION BY clause. In this case there is one partition for each data collection site, that is, for each SiteNo column value.
    • NAMING('RANGE') causes the object names to include the upper and lower values in the range. In this case the object names include the values for SiteNo. For example, in the object name, /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm/09497500/09497500/object_33_0_1.parquet, "09497500" is the SiteNo of the data contained by this object. This object contains data from multiple site numbers between sites 09497500 and 09497500, inclusive.
    • Because the query included INCLUDE_ORDERING('TRUE'), the objects include data from the ORDER BY column(s), in this case the object data includes the site number from which the data in that row was collected.

    Your result will be similar to the following:

         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm/09497500/09497500/object_33_0_1.parquet
     ObjectSize                27682
    RecordCount                 2944
    
         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm/09513780/09513780/object_33_0_1.parquet
     ObjectSize                19136
    RecordCount                 2941
    
    [...]

    For readability, the output is displayed vertically.

  6. Verify the offloaded data on external object storage:
    1. Create a foreign table to access external object store:
      CREATE FOREIGN TABLE ftRiverFlowPerm,
      EXTERNAL SECURITY DEFINER TRUSTED DefAuth_Write (
          Location VARCHAR(2048) CHARACTER SET UNICODE CASESPECIFIC,
       SiteNo VARCHAR(8) CHARACTER SET UNICODE,
       DateTime TIMESTAMP(0) FORMAT 'YYYY-MM-DDBHH:MI',
       Flow FLOAT,
       GageHeight1 FLOAT,
       Precipitation FLOAT,
       Temperature FLOAT,
       Velocity FLOAT,
       BatteryVoltage FLOAT,
       GageHeight2 FLOAT,
       WaterVelocity FLOAT,
       Conductance FLOAT
      )
      USING 
      (
          LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm/')
          STOREDAS('PARQUET')
      ) 
      PARTITION BY COLUMN;
         

      Replace LOCATION with the URI to the external object store where you wrote the data.

    2. Count the number of rows in the external data:
      SELECT COUNT(*) FROM ftRiverFlowPerm;
      Result:
      Count(*)
      --------
      36301
    3. Sample one row from the external data using the foreign table called ftRiverFlowPerm:
      SELECT * FROM ftRiverFlowPerm SAMPLE 1;
      Result:
            Location /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm/09429070/09429070/object_33_1_1.parquet
              SiteNo 09429070
            DateTime 2018-07-16 22:00
                Flow  1.55000000000000E 002
         GageHeight1  4.38000000000000E 000
       Precipitation                      ?
         Temperature  8.33000000000000E 001
            Velocity  1.15000000000000E 000
      BatteryVoltage                      ?
         GageHeight2                      ?
       WaterVelocity                      ?
         Conductance                      ?

      For readability, the output is displayed vertically.

  7. Example: Write All Rows to External Object Store, Use PARTITION BY and NAMING('DISCRETE')

  8. Write all the rows from a database table to external object store:
    SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount
    FROM WRITE_NOS_FM (
     ON  ( SELECT * FROM RiverFlowPerm )
      PARTITION BY SiteNo ORDER BY SiteNo
     USING
      LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_Discrete/')
      STOREDAS('PARQUET')
      COMPRESSION('GZIP')
      NAMING('DISCRETE')
      INCLUDE_ORDERING('TRUE')
      MAXOBJECTSIZE('4MB')
    ) AS d 
    ORDER BY AmpId;

    Replace LOCATION with the URI to the external object store where you want to write the data.

    Things to note in this example:
    • Each row of the result set describes one object written to external storage that contains data from Vantage.
    • Each object contains data from one partition created according to the PARTITION BY clause. In this case there is one partition for each data collection site, that is, for each SiteNo column value.
    • NAMING('DISCRETE') causes the object names to include the partition name, in this case the object names include the values for SiteNo. In the object name, /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09497500/object_33_0_1.parquet, "09497500" is the SiteNo of the data contained by this object.
    • Because the query included INCLUDE_ORDERING('TRUE'), the objects include data from the ORDER BY column(s), in this case the object data includes the site number from which the data in that row was collected.

    Your result will be similar to the following:

         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09497500/object_33_0_1.parquet
     ObjectSize                27682
    RecordCount                 2944
    
         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09513780/object_33_0_1.parquet
     ObjectSize                19136
    RecordCount                 2941
    
         NodeId          33
          AmpId           0
       Sequence                    1
     ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09424900/object_33_0_1.parquet
     ObjectSize                23174
    RecordCount                 2946
    
    [...]

    For readability the output is displayed vertically.