- Data Formats
The input source data can be either in text or binary format:
- Text format is a data source containing characters for display on an ASCII terminal.
- Binary format is numbers in hexadecimal
- Unformatted Records
When UNFORMATTED is specified, Teradata FastLoad assumes nothing concerning the structure of the data, end-of-record delimiters, special characters and field length indicators. The input data can be either text or binary:
- Use both an INSERT statement and a DEFINE command to define the fields
- For binary data, manually insert the indicator bytes preceding each record
Teradata FastLoad then uses the DEFINE clause as a guide to calculate the actual length of each record.
Data that is extraneous and not intended for use can be defined as CHAR.
For ASCII data, line ending characters can differ from platform to platform. For example, some systems might only use a carriage return character, while others might use both a carriage return and a line feed character to end a line. Always consider the platform-dependent characteristics when reading ASCII data from a text file.
- VARTEXT Records
When VARTEXT is specified, Teradata FastLoad assumes that the input data is variable-length text fields separated by a field delimiter character. The utility parses each input data record on a field-by-field basis, and creates a VARCHAR field for each input text field.
- Data Type Specifications
When using the VARTEXT specification, VARCHAR and VARBYTE are the only valid data type specifications which can be used in the Teradata FastLoad DEFINE command.
- Null Fields
Two consecutive delimiter characters direct Teradata FastLoad to null the field corresponding to the one right after the first delimiter character.
Also, if the last character in a record is a delimiter character, and yet there was at least one more field to be processed, Teradata FastLoad nulls the field corresponding to the next one to be processed, as defined in the DEFINE command.
- Input Record Requirements
The total number of fields in each input record must be equal to or greater than the number of fields described in the Teradata FastLoad DEFINE command. If the total number is less, Teradata FastLoad generates an error message. If the total number is more, the database ignores the extra fields.
- Error Record HandlingWhen Teradata FastLoad encounters an error condition in an input record, it normally discards the record and terminates. When loading variable-length text records, either or both of these functions can be inhibited by specifying the error-handling options:
- DISPLAY_ERRORS
- NOSTOP
By specifying both options and redirecting STDERR to a file location instead of the terminal screen, the Teradata FastLoad job will run to completion and save all the error records. They can then be manually modified using another utility such as BTEQ or MultiLoad to load them into the table.
- Data Type Specifications
- Variable-length FieldsWhen using variable-length fields in either formatted or unformatted records, either:
- Include a two-byte binary integer indicator immediately preceding each variable-length field. Teradata FastLoad uses this indicator to determine the exact length of the field.
- Pad each variable-length field with blanks to produce fixed-length fields
In either case, the maximum field length as shown in the table definition cannot be exceeded.
- DEFINE and INSERT Specifications
Use VARCHAR specifications in the DEFINE command and INSERT statements for variable-length data:
User.Table Definition Name Type Size Co1001 Integer 4 bytes Co1002 Varchar(8) up to 8 bytes Co1003 Date 4 bytes define Co1001 (integer), Co1002 (Varchar(8)), Co1003 (date) file = file_path ; insert into User.Table values ( :Co1001, :Co1002, :Co1003 ) ;
To pad a variable-length field to the maximum used in the table definition (in this case eight bytes) define column 2 as Char(8) with the table definition remaining Varchar(8).
The following table (User.Table) contains three columns of fixed-length data types. Each record has four bytes as an integer, followed by eight bytes of characters, and then four bytes of a date in integer format:
User.Table Definition Name Type Size Co1001 Integer 4 bytes Co1002 Char(8) 8 bytes Co1003 Date 4 bytes
Assuming that the fields in the record correspond exactly to the table columns, the DEFINE command and INSERT statement specifications would be as follows:
define Co1001 (integer), Co1002 (char(8)), Co1003 (date) file = file_path ; insert into User.Table values ( :Co1001, :Co1002, :Co1003 ) ; Co1001 Co1002 Co1003 |00030506|4549474854202020|000CFD1F
The DEFINE and INSERT specifications to define undesirable data (such as special control characters or carriage returns using HEX 0A as end-of-record delimiters) would be:
defineDummy(char(8)), Co1001 (integer), Co1002 (char(8)), Co1003 (date), Newline(char(1)) file = file_path ; insert into User.Table values ( :Co1001, :Co1002, :Co1003 ) ; Control Char Co1001 Co1002 Co1003 0FCA037CB86BFF8A|00030506|4549474854202020|000CFD1F|0A New line