The example shows how to write all the rows in a table from Advanced SQL Engine to a Teradata-supported external object store.
- If not already done, set up an authorization to access external object store. See Setting Up Authorization for WRITE_NOS Examples.
- If not already done, set up a database table to extract data from. You can use one of your existing tables or set up a test table to use with the examples. See Setting Up a Test Table.
- If not already done, set up or obtain access to the external object store where you want to write data. Follow the instructions from your external storage vendor.
-
Write all rows using HASH BY SiteNo, NAMING('RANGE'):
SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount FROM WRITE_NOS_FM ( ON ( SELECT * FROM RiverFlowPerm ) HASH BY SiteNo LOCAL ORDER BY SiteNo USING LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_HashBy/') STOREDAS('PARQUET') COMPRESSION('GZIP') NAMING('RANGE') INCLUDE_ORDERING('TRUE') INCLUDE_HASHBY('TRUE') MAXOBJECTSIZE('4MB') ) AS d ORDER BY AmpId;
Replace LOCATION with the URI to the external object store where you want to write the data.
Things to note in this example:- Each row of the result set describes one object written to external storage that contains data from Vantage.
- Each object contains data grouped according to to the HASH BY clause. In this example it determines how the database table row data is distributed and grouped to different AMPs using the hash of the SiteNo column values. In this case there is one partition for the data on each AMP, which may include a combination of data rows from different collection sites.
-
NAMING('RANGE') causes the object names to include the upper and lower values in the range of site numbers which data the object contains.
In the object name /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy/09394500/09513780/object_33_0_1.parquet contains data from multiple site numbers between sites 09394500 and 09513780, inclusive.
- Because the query included INCLUDE_ORDERING('TRUE') and INCLUDE_HASHBY('TRUE'), the objects include data from the ORDER BY and HASH BY columns. In this case both are the same column, SiteNo, so the data contained in the objects includes the values from the SiteNo column in each row of data, indicating where the data in that row was collected.
Your result will be similar to:
NodeId 33 AmpId 0 Sequence 1 ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy/09394500/09513780/object_33_0_1.parquet ObjectSize 79398 RecordCount 11778 NodeId 33 AmpId 1 Sequence 1 ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy/09380000/09429070/object_33_1_1.parquet ObjectSize 84999 RecordCount 8926 NodeId 33 AmpId 2 Sequence 1 ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy/09396100/09400815/object_33_2_1.parquet ObjectSize 47659 RecordCount 6765 [...]
For readability, the output is displayed vertically.
-
Write all rows using HASH BY and NAMING('DISCRETE'):
SELECT NodeId, AmpId, Sequence, ObjectName, ObjectSize, RecordCount FROM WRITE_NOS_FM ( ON ( SELECT * FROM RiverFlowPerm ) HASH BY SiteNo, Conductance LOCAL ORDER BY SiteNo USING LOCATION('YOUR-STORAGE-ACCOUNT/RiverFlowPerm_HashBy_Discrete/') STOREDAS('PARQUET') COMPRESSION('GZIP') NAMING('DISCRETE') INCLUDE_ORDERING('TRUE') INCLUDE_HASHBY('TRUE') MAXOBJECTSIZE('4MB') ) AS d ORDER BY AmpId;
Replace LOCATION with the URI to the external object store where you want to write the data.
Things to note in this example:- Each row of the result set describes one object written to external storage that contains data from Vantage.
- Each object contains data grouped according to to the HASH BY clause. In this example it determines how the database table row data is distributed and grouped to different AMPs using the hash of the SiteNo column values. In this case there is one partition for the data on each AMP, which may include a combination of data rows from different collection sites.
- NAMING('DISCRETE') causes the object names to include the partition name, in this case the object names include the values for SiteNo. In the object name, /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_Discrete/09424900/object_33_0_1.parquet, "09424900" is the SiteNo of the data contained by this object.
- Because the query included INCLUDE_ORDERING('TRUE') and INCLUDE_HASHBY('TRUE'), the objects include data from the ORDER BY and HASH BY columns. In this case both are the same column, SiteNo, so the data contained in the objects includes the values from the SiteNo column in each row of data, indicating where the data in that row was collected.
Your result will be similar to:
NodeId 33 AmpId 0 Sequence 3 ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy_Discrete/09424900/object_33_0_3.parquet ObjectSize 23551 RecordCount 2946 NodeId 33 AmpId 0 Sequence 5 ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy_Discrete/09513780/object_33_0_5.parquet ObjectSize 18668 RecordCount 2941 NodeId 33 AmpId 0 Sequence 4 ObjectName /S3/s3.amazonaws.com/ie-writenos-bucket/RiverFlowPerm_HashBy_Discrete/09497500/object_33_0_4.parquet ObjectSize 27504 RecordCount 2944 [...]
For readability, the output is displayed vertically.
Prerequisites
Example: Write All Rows to External Object Store, Use HASH BY and NAMING('RANGE')
Example: Write All Rows to External Object Store, Use HASH BY and NAMING('DISCRETE')