Write a Portion of Data to External Storage | NOS | Teradata Vantage - Example: Offload a Portion of Data to External Object Store - Advanced SQL Engine - Teradata Database

Teradata Vantageā„¢ - Native Object Store Getting Started Guide

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
Published
January 2021
Language
English (United States)
Last Update
2021-01-22
dita:mapPath
zws1595641486108.ditamap
dita:ditavalPath
hoy1596145193032.ditaval
dita:id
B035-1214
lifecycle
previous
Product Category
Software
Teradata Vantage
The example shows how to write data from an Advanced SQL Engine table to external object store.

    Prerequisites

  1. If not already done, set up an authorization object and function mapping. See Setting Up Authorization for WRITE_NOS Examples.
  2. If not already done, set up a database table to extract data from. You can use one of your existing tables or set up a test table to use with the examples. See Setting Up a Test Table.
  3. Set up or obtain access to the external storage where you want to write data. Follow the instructions from your external storage vendor.
  4. Write Data to External Object Store

  5. Write a portion of data from a database table to external object store. For example, read JSON data from the test table called RiverFlowPerm and write it to your external storage:
    SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount
    FROM WRITE_NOS_FM (
         ON  ( SELECT * FROM RiverFlowPerm WHERE DateTime = (TIMESTAMP '2018-07-01 00:00:00') )
         USING
            LOCATION('YOUR-STORAGE-ACCOUNT/20180701/')
            STOREDAS('PARQUET')
    ) AS d 
    ORDER BY AmpId;

    Replace LOCATION with the URI to the external object store where you want to write the data.

    WRITE_NOS stores data in Parquet format.

    Your result will be similar to the following:

    NodeId AmpId Sequence ObjectName
    ------ ----- -------- -------- --------------------------------------------------------------
        33     0        1 /S3/s3.amazonaws.com/ie-writenos-bucket/20180701/object_33_0_1.parquet
        33     1        1 /S3/s3.amazonaws.com/ie-writenos-bucket/20180701/object_33_1_1.parquet
        33     2        1 /S3/s3.amazonaws.com/ie-writenos-bucket/20180701/object_33_2_1.parquet
  6. Verify Offloaded Data on External Object Storage

  7. Create a foreign table to access the external object store you just wrote to:
    CREATE FOREIGN TABLE ft20180701,
    EXTERNAL SECURITY DEFINER TRUSTED DefAuth_Write (
         Location VARCHAR(2048) CHARACTER SET UNICODE CASESPECIFIC,
         SiteNo VARCHAR(8) CHARACTER SET UNICODE,
         DateTime TIMESTAMP(0) FORMAT 'YYYY-MM-DDBHH:MI',
         Flow FLOAT,
         GageHeight1 FLOAT,
         Precipitation FLOAT,
         Temperature FLOAT,
         Velocity FLOAT,
         BatteryVoltage FLOAT,
         GageHeight2 FLOAT,
         WaterVelocity FLOAT,
         Conductance FLOAT
    )
    USING 
    (
         LOCATION('YOUR-STORAGE-ACCOUNT/20180701/')
         STOREDAS('PARQUET')
    ) 
    PARTITION BY COLUMN;

    Replace LOCATION with the URI to the external object store where you wrote the data.

  8. Count the number of rows in the external data:
    SELECT COUNT(*) FROM ft20180701;
    Result:
    Count(*)
    --------
    12
  9. Sample one row from the external data:
    SELECT * FROM ft20180701 SAMPLE 1;
    Result:
          Location /S3/s3.amazonaws.com/ie-writenos-bucket/20180701/object_33_1_1.parquet
            SiteNo 09429070
          DateTime 2018-07-01 00:00
              Flow  1.54000000000000E 002
       GageHeight1  6.06000000000000E 000
     Precipitation                      ?
       Temperature  7.90000000000000E 001
          Velocity  7.50000000000000E-001
    BatteryVoltage                      ?
       GageHeight2                      ?
     WaterVelocity                      ?
       Conductance                      ?

    For readability the output is displayed vertically.

  10. Compare some of the data in the database table to the data stored externally. First, count the number of rows in the database table that match the criteria specified in the WHERE clause:
    SELECT COUNT(*) FROM RiverFlowPerm WHERE Temperature > 7.0 AND DateTime = (TIMESTAMP '2018-07-01 00:00:00');
    Result:
    Count(*)
    --------
    4

    Then, count the number of lines in the external object store that match the same criteria.

    SELECT COUNT(*) FROM RiverFlowPerm s, ft20180701 ft
    WHERE  s.SiteNo = ft.SiteNo
    AND    s.Temperature > 7.0
    AND    s.DateTime = (TIMESTAMP '2018-07-01 00:00:00');
    Result:
    Count(*)
    --------
    4

    The result from both queries should match.

  11. Read the Parquet schema of the offloaded data from external object store:
    SELECT DISTINCT(CAST(ParquetColumnName AS CHAR(15))) (NAMED Name),
         ParquetColumnPos (NAMED Pos),
         CAST(ParquetColumnPhysicalType AS CHAR(20)) (NAMED Physical),
         CAST(ParquetColumnLogicalType AS CHAR(20)) (NAMED Logical),
         CAST(TDColumnType AS CHAR(10)) (NAMED "Type"),
         ParquetColumnPrecision (NAMED "Precision"),
         ParquetColumnScale (NAMED Scale),
         ParquetColumnMinLength (NAMED "Min"),
         ParquetColumnMaxLength (NAMED "Max")
    FROM READ_NOS_FM1 (
    USING
         LOCATION('YOUR-STORAGE-ACCOUNT/20180701/')
         STOREDAS('PARQUET')
         FULLSCAN('TRUE')
         RETURNTYPE('NOSREAD_PARQUET_SCHEMA')
    ) as d order by 2,1,3,4,5,6,7,8,9;

    Replace LOCATION with the URI to the external object store where you wrote the data.

    Result:

         Name  SiteNo
          Pos     1
     Physical  BYTE_ARRAY
      Logical  UTF8
         Type  VARCHAR
    Precision            0
        Scale            0
          Min            8
          Max            8
    
         Name  DateTime
          Pos     2
     Physical  INT64
      Logical  TIMESTAMP_MILLIS
         Type  TIMESTAMP
    Precision            0
        Scale            0
          Min            0
          Max            0
    	  
         Name  Flow
          Pos     3
     Physical  DOUBLE
      Logical  NONE
         Type  FLOAT
    Precision            0
        Scale            0
          Min            0
          Max            0
    
    [...]

    For readability, the output is displayed vertically.