Example: Loading Avro Object Container Files as BLOBs Using BTEQ - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-04-03
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

The following example shows how to load Avro object container files as BLOBs by using BTEQ. Then, you can extract the Avro values into a table with a column of type DATASET STORAGE FORMAT AVRO by using the AvroContainerSplit table operator.

In this example, three object container files are loaded.

  1. Create an import file with two fields for BTEQ called avro_containers.txt. The first field is the container ID, and the second field is the name of the file containing the Object Container file.
    The container ID is loaded into an INTEGER column, and the file is loaded as "BLOB as deferred by name" into a BLOB. The import file contents are in vartext format:
    1|avro_1.db
    2|avro_2.db
    3|avro_3.db
  2. Create a table with a BLOB column to import the BLOB data.
    CREATE TABLE avro_containers(container_id INTEGER, container BLOB);
  3. Load the table with the object container files from BTEQ.
    .import vartext file avro_containers.txt
    
    .repeat *  
    USING (c1 VARCHAR(20), c2 BLOB AS DEFERRED BY NAME) INSERT INTO avro_containers(:c1, :c2);
  4. Create a table to hold the Avro values.
    CREATE TABLE avro_table(container_id INTEGER, avro_obj_id INTEGER, avro DATASET STORAGE FORMAT AVRO);
  5. Extract the Avro values from each container by using the AvroContainerSplit table operator.
    INSERT INTO avro_table
    SELECT T.out_container_id, T.avro_object_id, T.avro_value FROM AvroContainerSplit
        (ON (SELECT container_id, container FROM avro_containers)) T;

    The Avro values from all three container files are loaded into the table, avro_table, organized by container ID and Avro Object ID within each container ID.

  6. Select the first two Avro objects, in JSON format, from the first container.
    SELECT container_id, avro_obj_id, avro.tojson() FROM avro_table WHERE container_id = 1 AND
        avro_obj_id < 2
    ORDER BY 1,2;