17.10 - Example: Loading Avro Object Container Files as BLOBs Using BTEQ - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - DATASET Data Type

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B035-1198-171K
Language
English (United States)

The following example shows how to load Avro object container files as BLOBs by using BTEQ. Then, you can extract the Avro values into a table with a column of type DATASET STORAGE FORMAT AVRO by using the AvroContainerSplit table operator.

In this example, three object container files are loaded.

  1. Create an import file with two fields for BTEQ called avro_containers.txt. The first field is the container ID, and the second field is the name of the file containing the Object Container file.
    The container ID is loaded into an INTEGER column, and the file is loaded as "BLOB as deferred by name" into a BLOB. The import file contents are in vartext format:
    1|avro_1.db
    2|avro_2.db
    3|avro_3.db
  2. Create a table with a BLOB column to import the BLOB data.
    CREATE TABLE avro_containers(container_id INTEGER, container BLOB);
  3. Load the table with the object container files from BTEQ.
    .import vartext file avro_containers.txt
    
    .repeat *  
    USING (c1 VARCHAR(20), c2 BLOB AS DEFERRED BY NAME) INSERT INTO avro_containers(:c1, :c2);
  4. Create a table to hold the Avro values.
    CREATE TABLE avro_table(container_id INTEGER, avro_obj_id INTEGER, avro DATASET STORAGE FORMAT AVRO);
  5. Extract the Avro values from each container by using the AvroContainerSplit table operator.
    INSERT INTO avro_table
    SELECT T.out_container_id, T.avro_object_id, T.avro_value FROM AvroContainerSplit
        (ON (SELECT container_id, container FROM avro_containers)) T;

    The Avro values from all three container files are loaded into the table, avro_table, organized by container ID and Avro Object ID within each container ID.

  6. Select the first two Avro objects, in JSON format, from the first container.
    SELECT container_id, avro_obj_id, avro.tojson() FROM avro_table WHERE container_id = 1 AND
        avro_obj_id < 2
    ORDER BY 1,2;