16.20 - FNC_TblOpSetFormat - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL External Routine Programming

Product
Advanced SQL Engine
Teradata Database
Release Number
16.20
Release Date
April 2020
Content Type
Programming Reference
Publication ID
B035-1147-162K
Language
English (United States)

Purpose

Sets attributes of the format of the input and output streams. This allows the contract function to specify the format of the data types to the parser.

Syntax

void
FNC_TblOpSetFormat(char                 *attribute,
                   int                   streamno,
                   Stream_Direction_en   direction,
                   void                 *inValue,
                   int                   inSize);
char *attribute
IN parameter.

The format attribute to be set.

Valid attributes are as follows:
  • "RECFMT"
  • "TZTYPE"
  • "CHARSETFMT"
  • "REPUNSPTCHR"

"CHARSETFMT" and "REPUNSPTCHR" apply only to import table operators.

int streamno
IN parameter.

The stream number.

Stream_Direction_en direction
IN parameter.

The stream direction: 'R' or 'W'.

void *inValue
IN parameter.

The location of the new value of the format attribute.

int inSize
IN parameter.

The size, in bytes, of the new value pointed to by inValue.

Usage Notes

  • This routine is valid only when called within the contract function of a table operator.
  • For "RECFMT" the default value is INDICFMT1, where the format is IndicData with row separator sentinels. When the format attribute is "RECFMT", the inValue buffer should have a value of type Stream_Fmt_en. All field-level formats impact the entire record.
  • If data being imported from a foreign server contains characters unsupported by the database, you must use FNC_TblOpSetFormat to explicitly set the "CHARSETFMT" and "REPUNSPTCHR" attributes.
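
As an illustration of the "RECFMT" note above, the following is a minimal sketch of a helper that a contract function might call to request INDICBUFFMT1 on each stream. It is not taken from the Teradata headers: the helper name use_buffer_format, the stream counts, and the recording mock of FNC_TblOpSetFormat are assumptions added so the sketch compiles outside the database. In a real table operator, Stream_Fmt_en, Stream_Direction_en, and FNC_TblOpSetFormat come from sqltypes_td.h and must not be redefined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins (assumptions) so the sketch builds without sqltypes_td.h.
   The enum values match those listed under "RECFMT" in this topic. */
typedef enum { INDICFMT1 = 1, INDICBUFFMT1 = 2 } Stream_Fmt_en;
typedef int Stream_Direction_en;      /* this topic only states 'R' or 'W' */

/* Hypothetical mock state: remembers the most recent call. */
static char          last_attr[16];
static int           last_stream;
static int           last_dir;
static Stream_Fmt_en last_fmt;
static int           call_count;

/* Mock with the documented prototype; it records arguments instead of
   passing them to the parser. */
void FNC_TblOpSetFormat(char *attribute, int streamno,
                        Stream_Direction_en direction,
                        void *inValue, int inSize)
{
    assert(attribute != NULL && inValue != NULL);
    assert(inSize == (int)sizeof(Stream_Fmt_en));
    strncpy(last_attr, attribute, sizeof last_attr - 1);
    last_attr[sizeof last_attr - 1] = '\0';
    last_stream = streamno;
    last_dir = direction;
    memcpy(&last_fmt, inValue, sizeof last_fmt);
    call_count++;
}

/* Hypothetical helper: request IndicData without separator sentinels on
   in_count input streams and out_count output streams. */
void use_buffer_format(int in_count, int out_count)
{
    char attr[] = "RECFMT";           /* writable: the prototype takes char * */
    Stream_Fmt_en fmt = INDICBUFFMT1;
    int i;

    for (i = 0; i < in_count; i++)
        FNC_TblOpSetFormat(attr, i, 'R', &fmt, (int)sizeof fmt);
    for (i = 0; i < out_count; i++)
        FNC_TblOpSetFormat(attr, i, 'W', &fmt, (int)sizeof fmt);
}
```

In this sketch, inValue points to a Stream_Fmt_en variable and inSize is its size in bytes, matching the parameter descriptions above. Calling use_buffer_format(2, 1) issues three calls: two for direction 'R' and one for direction 'W'.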

Format Attribute Values

Format Attribute Description
"RECFMT"
Defines the record format. When the format attribute is "RECFMT", the inValue buffer should have a value of type Stream_Fmt_en. The Stream_Fmt_en enumeration is defined in the sqltypes_td.h header file with the following values:
  • INDICFMT1 = 1

    IndicData with row separator sentinels.

  • INDICBUFFMT1 = 2

    IndicData with NO row or partition separator sentinels.

"TZTYPE"
Indicates the format in which the database sends TIME/TIMESTAMP data to, or receives it from, the table operator:
  • RAW = 0: as stored on the database file system
  • UTC = 1: as UTC
"CHARSETFMT"
  • EVLDBC

    Signals that neither data conversion nor detection is needed.

  • EVLUTF16CHARSET

    Signals that the external data to be imported into the database are in UTF16 encoding.

  • EVLUTF8CHARSET

    Signals that the external data to be imported into the database are in UTF8 encoding.

"REPUNSPTCHR"
A Boolean value that specifies what to do when an unsupported Unicode character is detected in the external data to be imported into the database.
  • true

    Replaces the unsupported character with U+FFFD.

  • false

    Returns an error when an unsupported character is detected. This is the default behavior.
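
The two import-only attributes might be set together, as in the following sketch. Several parts are assumptions rather than documented facts: the numeric values given to EVLDBC, EVLUTF16CHARSET, and EVLUTF8CHARSET are placeholders (this topic names the constants but not their values), the use of int for "CHARSETFMT" and C99 bool for "REPUNSPTCHR" follows the member types shown in FNC_FmtConfig_t, and the helper configure_utf8_import and the recording mock are hypothetical. The stream number and direction are left as parameters because the appropriate values depend on the operator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef int Stream_Direction_en;      /* stand-in; real type is in sqltypes_td.h */

/* PLACEHOLDER values (assumption): use the definitions from the Teradata
   headers in real code. */
enum { EVLDBC = 0, EVLUTF16CHARSET = 1, EVLUTF8CHARSET = 2 };

/* Hypothetical mock: keeps a small log of calls for inspection. */
typedef struct {
    char          attr[16];
    int           streamno;
    int           direction;
    unsigned char value[8];
    int           size;
} SetFormatCall;

static SetFormatCall calls[8];
static int           call_count;

void FNC_TblOpSetFormat(char *attribute, int streamno,
                        Stream_Direction_en direction,
                        void *inValue, int inSize)
{
    SetFormatCall *c;

    assert(attribute != NULL && inValue != NULL);
    assert(call_count < 8);
    assert(inSize > 0 && inSize <= (int)sizeof calls[0].value);

    c = &calls[call_count++];
    strncpy(c->attr, attribute, sizeof c->attr - 1);
    c->attr[sizeof c->attr - 1] = '\0';
    c->streamno = streamno;
    c->direction = direction;
    memcpy(c->value, inValue, (size_t)inSize);
    c->size = inSize;
}

/* Hypothetical helper: declare the external data as UTF8 and ask for
   unsupported characters to be replaced with U+FFFD instead of failing. */
void configure_utf8_import(int streamno, Stream_Direction_en direction)
{
    char charset_attr[] = "CHARSETFMT";
    char replace_attr[] = "REPUNSPTCHR";
    int  charset = EVLUTF8CHARSET;
    bool replace = true;

    FNC_TblOpSetFormat(charset_attr, streamno, direction,
                       &charset, (int)sizeof charset);
    FNC_TblOpSetFormat(replace_attr, streamno, direction,
                       &replace, (int)sizeof replace);
}
```

Each attribute needs its own call, because FNC_TblOpSetFormat sets one attribute at a time. Passing false for "REPUNSPTCHR" would request the default behavior described above, where an unsupported character produces an error.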

Importing and Exporting TIME/TIMESTAMP Data

You can map the database TIME and TIMESTAMP data types to the Hadoop STRING or the Oracle TIMESTAMP data type when importing or exporting data to these foreign servers.

The table operator can use FNC_TblOpSetFormat to set the tztype attribute, which tells the database to send TIMESTAMP data to, or receive it from, the table operator in a native but adjusted format.

The tztype attribute is set as follows for the import and export operators:
  • For Hadoop, the attribute is set to UTC.
  • For Oracle, the attribute is set to UTC.

If the transform is off, the data is transferred in Raw form, which is the default for table operators and is consistent with standard UDFs.

tztype is a member of the structure FNC_FmtConfig_t defined in fnctypes.h as follows:
typedef struct
{
   Stream_Fmt_en recordfmt; //enum - indicdata, fastload binary, delimited
   bool inlinelob; //inline or deferred
   bool UDTTransformsOff; //true or false
   bool PDTTransformsOff; //true or false
   bool ArrayTransformsOff; //true or false
   char auxinfo[128]; //For delimited text can contain the record separator, delimiter
                         //specification and the field enclosure characters
   double inperc; //recommended percentage of buffer devoted to input rows
   bool inputnames; //send input column names to step
   bool outputnames; //send output column names to step
   TZType_en tztype; //enum - Raw or UTC
   int charsetfmt; // charset format of data to be imported into TD through QG
   bool replUnsprtedUniChar; /* true - replace unsupported unicode character
                                encountered with U+FFFD when data is imported
                                into TD through QG
                                false - error out when unsupported unicode
                                char encountered */
} FNC_FmtConfig_t;
TZType_en is defined as follows:
typedef enum
{
   Raw = 0, /* as stored on TD File system */
   UTC = 1, /* as UTC */
} TZType_en;
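
A table operator that wants UTC, as described above for Hadoop and Oracle, might set the attribute roughly as follows. This is a sketch under stated assumptions: the helper request_utc_time and the recording mock of FNC_TblOpSetFormat are hypothetical, and the typedefs are repeated only so the example compiles outside the database. Real code would take them from the Teradata headers.

```c
#include <assert.h>
#include <string.h>

typedef int Stream_Direction_en;      /* stand-in; this topic states 'R' or 'W' */

/* Repeated from the definition above so the sketch is self-contained. */
typedef enum
{
   Raw = 0, /* as stored on TD File system */
   UTC = 1  /* as UTC */
} TZType_en;

/* Hypothetical mock state. Raw is the documented default. */
static TZType_en recorded_tz = Raw;
static int       recorded_stream = -1;
static int       recorded_dir;

/* Mock with the documented prototype; this sketch handles only "TZTYPE". */
void FNC_TblOpSetFormat(char *attribute, int streamno,
                        Stream_Direction_en direction,
                        void *inValue, int inSize)
{
    assert(strcmp(attribute, "TZTYPE") == 0);
    assert(inSize == (int)sizeof(TZType_en));
    memcpy(&recorded_tz, inValue, sizeof recorded_tz);
    recorded_stream = streamno;
    recorded_dir = direction;
}

/* Hypothetical helper: ask for TIME/TIMESTAMP data as UTC on one stream. */
void request_utc_time(int streamno, Stream_Direction_en direction)
{
    char      attr[] = "TZTYPE";      /* writable: the prototype takes char * */
    TZType_en tz = UTC;

    FNC_TblOpSetFormat(attr, streamno, direction, &tz, (int)sizeof tz);
}
```

If the operator never sets "TZTYPE", the data stays in Raw form, which this sketch models by initializing recorded_tz to Raw.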

For export, FNC_TblOpSetInputColTypes is called during the contract phase in the resolver and will use the tztype attribute to add the desired cast to the input TIME or TIMESTAMP column types.

The database converts TIME and TIMESTAMP data to the session local time before casting it to a character type. Therefore, when a TIME or TIMESTAMP column is mapped to charfix/charvar, as when mapping to the Hadoop STRING type, the data is transmitted in the session local time zone and no explicit casts are needed.

For import, when getting the input buffer from the table operator, TIME or TIMESTAMP data must be converted to Raw form. No conversion is needed when importing Hadoop STRING values into the database TIME or TIMESTAMP data types, because the import follows the normal conversion path from character to TIME/TIMESTAMP in the database.

Teradata does not recommend importing or exporting TIME/TIMESTAMP data for a database system with timedatewzcontrol flag 57 = 0. For such systems, the TIME/TIMESTAMP data is stored in OS local time. The System/Session time zone is not set and the database does not apply any conversions on TIME/TIMESTAMP data when reading or writing from disk. Therefore, exporting such data reliably in the format desired by the foreign server is a problem and Teradata recommends that the Teradata-to-Hadoop connector feature not be used for such systems.