How Listener Writes to Targets

Teradata Targets

By default, Teradata Listener uses a JDBC driver to write data to a Teradata Database.

Listener stores each streaming message in a staging table. The staging table has three metadata columns and one raw data column for each record.
In near real-time mode, Listener uses prepared SQL INSERT statements to micro-batch the data every 240 ms or every 4,000 records, whichever limit is reached first.
In batch mode, the JDBC driver persists all the batched records into the Teradata Database.
Listener receives acknowledgement that the data has been successfully persisted into the Teradata Database.
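
The micro-batching rule described above (flush at 240 ms or 4,000 records, whichever comes first) can be sketched as follows. This is only an illustration, not Listener's implementation: the MicroBatcher class, its flush callback, and the injectable clock are hypothetical names, and the sketch assumes the 240 ms window starts when the first record of a batch arrives. In Listener, the flush step would correspond to the prepared INSERT sent through the JDBC driver.

```python
import time
from typing import Callable, List, Optional


class MicroBatcher:
    """Hypothetical sketch: flush buffered records on a size or age limit."""

    def __init__(self, flush: Callable[[List[str]], None],
                 max_records: int = 4000, max_age_s: float = 0.240,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.flush = flush              # stand-in for the prepared INSERT
        self.max_records = max_records  # 4,000 records per the text above
        self.max_age_s = max_age_s      # 240 ms per the text above
        self.clock = clock
        self.buffer: List[str] = []
        # Assumption: the window starts when the first record is buffered.
        self.started: Optional[float] = None

    def add(self, record: str) -> None:
        if not self.buffer:
            self.started = self.clock()
        self.buffer.append(record)
        self.poll()

    def poll(self) -> None:
        """Flush if the batch is full or its oldest record has waited long enough."""
        if not self.buffer:
            return
        full = len(self.buffer) >= self.max_records
        expired = self.clock() - self.started >= self.max_age_s
        if full or expired:
            batch, self.buffer, self.started = self.buffer, [], None
            self.flush(batch)
```

A caller would invoke add() for each incoming message and call poll() periodically so that a partially filled batch is still flushed once the time limit passes.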

Listener can also write to a Teradata Database target using Teradata QueryGrid with passthrough or user-provided mapping. When the data ingestion rate is high, Teradata QueryGrid achieves high throughput. When the data ingestion rate is low, data might be written faster without Teradata QueryGrid.

Listener supports time series data that you can write to primary time index (PTI) tables in Teradata NewSQL Engine systems. When creating a target, you can either map to the TD_TIMECODE PTI table column or allow Listener to insert the TD_TIMECODE column value as the timestamp when the record is inserted in the Teradata NewSQL Engine system. For more information about time series tables and operations, see Teradata® Database Time Series Tables and Operations.
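
The two TD_TIMECODE options described above can be illustrated with a small sketch. It is not Listener code: the resolve_timecode function, the mapped_field parameter, and the assumption that a mapped field carries an ISO 8601 string are all hypothetical, chosen only to show the difference between a mapped timestamp and an insertion-time timestamp.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def resolve_timecode(record: Dict[str, Any],
                     mapped_field: Optional[str] = None,
                     now: Optional[datetime] = None) -> datetime:
    """Return the value to store in TD_TIMECODE for one record (illustrative)."""
    if mapped_field is not None:
        # Mapped case: the record supplies its own timestamp.
        # Assumption: the field holds an ISO 8601 string.
        return datetime.fromisoformat(record[mapped_field])
    # Unmapped case: fall back to the time of insertion.
    return now if now is not None else datetime.now(timezone.utc)
```

With a mapping, the stored time reflects when the event occurred; without one, it reflects when the record was inserted, which can differ if messages are delayed in transit.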

HDFS Target with or without Kerberos

Listener writes HDFS target data in sequence file format (.seq) to the directory provided in the data_path field. In the example below, data is written to .seq files in /user/testuser/kerberos/{source_id}/.

For HDFS targets, Listener stores ingested data, even if the password is not valid.

This is a standard method for writing to HDFS targets.

{
    "target_id": "c1bd34bf-93e7-4ce2-b782-23d1c71e06d3",
    "source_id": "e750d1fe-2608-43f3-9d7d-6c1231d681a8",
    "bundle_interval": 100,
    "bundle_type": "records",
    "data_path": {
        "extension": "seq",
        "path": "/user/testuser/kerberos"
    },
    "target_type": "hdfs",
    ....
}

When a bundle_interval is specified (100 records in this example):

Listener collects and holds data records in a temporary directory called /user/testuser/kerberos/+tmp.
When there are 100 records in the tmp directory, Listener moves the data from the tmp directory to sequence files (.seq) in /user/testuser/kerberos/{source_id}/.
If Listener collects fewer than 100 records (the bundle_interval) before the default interval of 100 seconds elapses, it moves whatever data it has collected at that point.
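
The staging-then-move behavior in the steps above can be sketched on a local file system. This is a simplified stand-in, not how Listener talks to HDFS: the BundleWriter class and its parameter names are hypothetical, the files are plain text rather than real sequence files, and a local rename takes the place of the HDFS move.

```python
import os
import time
import uuid
from pathlib import Path
from typing import Callable, Optional


class BundleWriter:
    """Hypothetical sketch: stage records in +tmp, then move the bundle."""

    def __init__(self, base: Path, source_id: str, bundle_interval: int = 100,
                 max_wait_s: float = 100.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.tmp_dir = base / "+tmp"    # temporary holding directory
        self.out_dir = base / source_id  # final {source_id} directory
        self.tmp_dir.mkdir(parents=True, exist_ok=True)
        self.out_dir.mkdir(parents=True, exist_ok=True)
        self.bundle_interval = bundle_interval
        self.max_wait_s = max_wait_s
        self.clock = clock
        self.count = 0
        self.opened_at = 0.0
        self.tmp_file: Optional[Path] = None

    def write(self, record: str) -> None:
        if self.tmp_file is None:
            # Start a new bundle file in the temporary directory.
            self.tmp_file = self.tmp_dir / f"{uuid.uuid4()}.seq"
            self.opened_at = self.clock()
        with self.tmp_file.open("a", encoding="utf-8") as fh:
            fh.write(record + "\n")
        self.count += 1
        self.maybe_commit()

    def maybe_commit(self) -> None:
        """Move the bundle once it is full or the wait limit has elapsed."""
        if self.tmp_file is None:
            return
        timed_out = self.clock() - self.opened_at >= self.max_wait_s
        if self.count >= self.bundle_interval or timed_out:
            os.replace(self.tmp_file, self.out_dir / self.tmp_file.name)
            self.tmp_file, self.count = None, 0
```

Staging in a separate directory means that anything reading the {source_id} directory only ever sees complete bundles.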

Sequence files (.seq) are in key-value format, delimited by a tab. The key is a random UUID that is not associated with the Listener UUID metadata. The value is the ingested data plus the metadata appended by Listener.
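
As a rough illustration of the key-value layout, the sketch below splits one record into its UUID key and its value. It assumes the record has already been read out of the sequence file as a single line of text (Hadoop sequence files are a binary container, so that step needs Hadoop tooling and is not shown). The split_record function is a hypothetical helper, and the internal layout of the value is left untouched because it is not described here.

```python
import uuid
from typing import Tuple


def split_record(line: str) -> Tuple[uuid.UUID, str]:
    """Split one tab-delimited record into (key, value)."""
    # Split on the first tab only, so tabs inside the value are preserved.
    key, sep, value = line.rstrip("\n").partition("\t")
    if not sep:
        raise ValueError("record has no tab delimiter")
    # The key is a random UUID; uuid.UUID() rejects anything else.
    return uuid.UUID(key), value
```

Because the key is random and unrelated to the Listener UUID metadata, it is only useful for distinguishing records within a file, not for joining back to Listener metadata.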

Broadcast Stream Targets

A broadcast stream is a type of target that allows source owners and administrators to connect source data to external apps through a WebSocket server endpoint. A broadcast stream target is limited to the number of available targets in the Manage Targets list. Teradata recommends no more than 20 connections for each broadcast stream.

Teradata® Listener™ User Guide
