15.10 - Step 2 - Select a Source and Select Data - Parallel Transporter

Teradata Parallel Transporter User Guide

Product: Parallel Transporter
Release: 15.10
Category: User Guide
Document number: B035-2445-035K

Step 2 - Select a Source and Select Data

Use one of the following procedures, depending on the data source for the job.

  • Teradata Table as a Data Source
  • File as a Data Source
  • Oracle Table as a Data Source
  • ODBC-Compliant Database as a Data Source
  • Logging onto a Data Source

    When you use a Teradata table, an Oracle table, or an ODBC-compliant database as a data source, a Logon dialog box prompts for a name (such as a host name), a user ID, and a password.

    The Logon dialog box appears when creating a new script or editing an existing script. Logon information can be included in the Wizard scripts.

    After you supply this information, the Teradata PT Wizard attempts to log on. If the connection cannot be made, a message appears.

    When you run an existing script that does not include logon information, you can enter the information in the JobAttributes panel of the Run dialog box, as shown under step 4.

    Teradata Table as a Data Source

    Use the Teradata Table option from the Job Source dialog box to log onto your Teradata system.

    Then select a specific table as a data source for a job.

    The Teradata Logon dialog box appears, optionally allowing the User ID and Password to be included in the Wizard job.

    1 From the Source Type list in the Job Source dialog box, select Teradata Table.

    2 In the Teradata Logon dialog box, type the host name, user ID, and password to log on to your Teradata system.

    3 (Optional) Select the check boxes to include your user ID and password in the generated scripts. The default is to enter placeholders. Select the Enable Logon Encryption check box to enable encryption of login information passed to the database. By default, logon encryption is disabled.

    4 Click OK.

    The Job Source dialog box displays the directory structure of the Teradata system you logged onto.

    5 In the left pane, select a database and a table to be the data source for the job.

    Notice:
    Do not select tables that contain character large object (CLOB), binary large object (BLOB), JSON, or XML data types.

    6 In the right pane, select up to 450 columns to include in the source schema, or click Select All or Select None. (Press Ctrl+click to select multiple columns.)

    If a column name from a source table is a Teradata PT reserved word, the Teradata PT Wizard appends the suffix “_#” (where # is a number) so that the name differs from the keyword and the submitted script does not fail with a syntax error.

    For example, if the keyword DESCRIPTION is used as a column name, the name is changed to DESCRIPTION_1. Teradata PT keeps an internal counter for generating the appended number.

    For the complete list of Teradata PT reserved words, see Teradata Parallel Transporter Reference.
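
    The renaming rule described above can be sketched as follows. This is only an illustration, not the Wizard's actual code: the RESERVED_WORDS set is a hypothetical subset (the real list is in the Reference manual), the ColumnRenamer class is invented for this example, and the use of a single counter shared by all renamed columns is an assumption based on the "internal counter" wording.

```python
# Hypothetical subset of reserved words, for illustration only.
RESERVED_WORDS = {"DESCRIPTION", "SCHEMA", "TABLE"}


class ColumnRenamer:
    """Appends "_<n>" to column names that collide with a reserved word."""

    def __init__(self, reserved=RESERVED_WORDS):
        self.reserved = {word.upper() for word in reserved}
        self.counter = 0  # assumption: one counter shared by all renamed columns

    def rename(self, column_name):
        # Names that are not reserved words pass through unchanged.
        if column_name.upper() not in self.reserved:
            return column_name
        self.counter += 1
        return f"{column_name}_{self.counter}"


renamer = ColumnRenamer()
renamed = [renamer.rename(name) for name in ["ID", "DESCRIPTION", "TABLE"]]
print(renamed)  # ['ID', 'DESCRIPTION_1', 'TABLE_2']
```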

    Note: The values under TPT Type are names of the data types associated with the Teradata PT columns. The values under DBS Type are the data types from the source database. When Teradata PT gets a column name from a source table, it looks at the definition schema of the table to determine an accurate data type. Sometimes these types can be recorded incorrectly or as a “?” when the Wizard cannot properly determine the data type. This often occurs when reading user-defined data types (UDTs).

    To change or correct a Teradata PT data type, click Edit Type (or right-click), and select the correct data type from the shortcut menu. Enter the length, precision, and scale if applicable. The precision and scale fields are available only when Decimal/Numeric is selected.

    7 Click Next to open the Job Destination dialog box.

    8 Continue with Step 3 - Select a Destination.

    File as a Data Source

    Use the File option from the Job Source dialog box to browse for a flat file to use as the data source for a job.

    1 From the Source Type list in the Job Source dialog box, select File.

    2 Do one of the following:

  • In Directory Name and File Name, type the path and name of the file to be used as the data source for the job.
  • Click Select to browse for the source file.

    3 In Format, select Binary, Delimited, Formatted, Text, or Unformatted as the format associated with the file.

    For more information, see “Input File Formats”.

    Formats include:

  • Format = 'Binary' - Each record contains a two-byte integer data length (n) followed by n bytes of data.
  • Format = 'Text' - Each record is entirely character data, an arbitrary number of bytes followed by one of the following end-of-record markers:
      • A single-byte line feed (X'0A') on UNIX platforms
      • A double-byte carriage-return/line-feed pair (X'0D0A') on Windows platforms
  • Format = 'Delimited' - Each record is in variable-length text record format, with fields (columns) separated by a delimiter defined with the TextDelimiter attribute. The delimiter has the following limitations:
      • It can only be a sequence of characters.
      • It cannot be any character that appears in the data.
      • It cannot be a control character other than a tab.
  • Format = 'Formatted' - Each record is in a format traditionally known as FastLoad or Teradata format, which is a two-byte integer (n) followed by n bytes of data, followed by an end-of-record marker (X'0A' or X'0D0A').
  • Format = 'Unformatted' - The data does not conform to any predefined format. Instead, the data is entirely described by the columns in the schema definition of the DataConnector operator.

    If specifying Delimited format, type the delimiter used in the source file into the Delimiter box. The wizard accepts delimiters up to 100 bytes in length. If no delimiter is provided, the TextDelimiter attribute defaults to the pipe character ( | ).
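
    To make the Binary and Formatted record layouts concrete, the following sketch splits a byte buffer into records. It is not part of Teradata PT. The function name read_records is hypothetical, and the length prefix is assumed to be an unsigned little-endian integer, which may not match the byte order on your platform. Indicator bytes are not handled.

```python
import struct


def read_records(buffer, fmt="Binary", eor=b"\n", byteorder="<"):
    """Split a buffer into records that carry a two-byte length prefix.

    fmt="Binary":    [length n][n bytes of data]
    fmt="Formatted": [length n][n bytes of data][end-of-record marker]
    """
    records = []
    pos = 0
    while pos < len(buffer):
        # Assumption: the two-byte length is unsigned, little-endian by default.
        (length,) = struct.unpack_from(byteorder + "H", buffer, pos)
        pos += 2
        records.append(buffer[pos:pos + length])
        pos += length
        if fmt == "Formatted":
            # Formatted records end with X'0A' or X'0D0A'.
            if buffer[pos:pos + len(eor)] != eor:
                raise ValueError(f"missing end-of-record marker at offset {pos}")
            pos += len(eor)
    return records


binary_data = struct.pack("<H", 3) + b"abc" + struct.pack("<H", 2) + b"de"
formatted_data = struct.pack("<H", 3) + b"abc\n" + struct.pack("<H", 2) + b"de\n"

print(read_records(binary_data))                  # [b'abc', b'de']
print(read_records(formatted_data, "Formatted"))  # [b'abc', b'de']
```

    For a file written on Windows, pass eor=b"\r\n" so that the X'0D0A' marker is expected instead.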

    Note: When using a delimited flat file for input, all of the data types in the DEFINE SCHEMA must be VARCHARs. Defining non-VARCHAR data types results in an error when a job script is submitted to run.
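
    The all-VARCHAR rule for delimited input can be illustrated with a short sketch that builds a schema definition and splits one record. The helper names build_delimited_schema and split_record are hypothetical, the DEFINE SCHEMA text is a simplified rendering rather than Wizard output, and measuring the 100-byte delimiter limit in UTF-8 is an assumption.

```python
def build_delimited_schema(schema_name, columns):
    """Render a simplified DEFINE SCHEMA; columns is a list of (name, type, size)."""
    lines = []
    for name, col_type, size in columns:
        # Delimited input requires every column to be VARCHAR.
        if col_type.upper() != "VARCHAR":
            raise ValueError(
                f"column {name}: delimited input requires VARCHAR, got {col_type}"
            )
        lines.append(f"    {name} VARCHAR({size})")
    return f"DEFINE SCHEMA {schema_name}\n(\n" + ",\n".join(lines) + "\n);"


def split_record(line, delimiter="|"):
    """Split one delimited text record into fields (pipe is the default)."""
    # Assumption: the 100-byte limit is measured on the UTF-8 encoding.
    if not 1 <= len(delimiter.encode("utf-8")) <= 100:
        raise ValueError("delimiter must be 1 to 100 bytes long")
    return line.rstrip("\r\n").split(delimiter)


schema = build_delimited_schema(
    "SOURCE_SCHEMA", [("ID", "VARCHAR", 10), ("NAME", "VARCHAR", 30)]
)
print(schema)
print(split_record("1|Ann\n"))  # ['1', 'Ann']
```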

    4 (Optional) Select Indicator Mode to include indicator bytes at the beginning of each record. (Unavailable for delimited data.)

    Note: If the file name contains a wildcard character (*), two additional input boxes are available. In the Vigil Elapsed Time box, type the number of minutes for the job to wait for additional data. In the Vigil Wait Time box, type the number of seconds to wait before Teradata PT checks for new data.
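
    The two vigil settings use different units (minutes for the overall window, seconds between checks). The arithmetic can be sketched as below. The function vigil_check_offsets is hypothetical, and the assumption that a check happens at the start and again at every interval up to the end of the window is ours, not documented behavior.

```python
def vigil_check_offsets(elapsed_minutes, wait_seconds):
    """Return the offsets, in seconds from job start, of each check for new data."""
    if elapsed_minutes < 0 or wait_seconds <= 0:
        raise ValueError("elapsed time must be >= 0 and wait time must be > 0")
    window_seconds = elapsed_minutes * 60  # Vigil Elapsed Time is in minutes
    # Assumption: one check at offset 0, then one every wait_seconds.
    return list(range(0, window_seconds + 1, wait_seconds))


offsets = vigil_check_offsets(elapsed_minutes=1, wait_seconds=20)
print(offsets)  # [0, 20, 40, 60]
```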

    5 Click Next to open the Define Columns dialog box.

    6 In the Define Columns dialog box, specify the following, as needed:

  • Name - Type the names of the columns in the source file.
  • Type - Type the data type of each column. (Choices change depending on the format selected in the previous dialog box.)

    Note: When working with data from a file of delimited data, all fields must be defined as type VARCHAR.

  • Size - Type the number of characters for each CHAR, VARCHAR, GRAPHIC, and VARGRAPHIC column, or the number of bytes for each BYTE and VARBYTE column. (Size is unavailable for all other types.)

    7 (Optional) In Number of Instances, type the number of producer operator instances to run at the same time.

    8 (Decimal data types only) Under Precision, type the total number of digits; under Scale, type the number of digits to the right of the decimal point. The Precision and Scale columns are unavailable for other data types; in that case, go to the next step.

    9 After defining all the columns, click Next to open the Job Destination dialog box.

    10 Continue with Step 3 - Select a Destination.

    Oracle Table as a Data Source

    Use the Oracle Table option from the Job Source dialog box to log onto an Oracle server and select a specific table as a data source. The Oracle Logon dialog appears, optionally allowing the User ID and Password to be included in the Wizard job.

    1 From the Source Type list in the Job Source dialog box, select Oracle Table.

    2 At the logon prompt, type the TSN Service Name, user ID, and password needed to build the Oracle JDBC connection. The TSN Service Name is a net service name that is defined in a TNSNAMES.ORA file or in the Oracle directory server, depending on how the Oracle net service is configured on the Oracle client and server.

    Notice:
    The value you enter into the TSN Service Name box at logon is the value that the Wizard uses for the DSNname attribute in all scripts; however, systems are often configured with different values for the TSN Service Name and DSN name. If this is the case, you must manually edit the value of the DSNname attribute in scripts to match the TSN Service Name before submitting a job script that involves an Oracle server.
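
    If many generated scripts need the same manual correction, the edit can be scripted. The following is a minimal sketch, assuming the attribute appears in the script as an assignment of the form DSNName = 'value'. The function set_dsn_name and the sample script fragment are hypothetical and are not output copied from the Wizard.

```python
import re


def set_dsn_name(script_text, new_value):
    """Replace the quoted value assigned to the DSNname attribute."""
    # Assumption: the attribute is written as DSNName = 'value' (any letter case).
    pattern = re.compile(r"(DSNName\s*=\s*)'[^']*'", re.IGNORECASE)
    updated, count = pattern.subn(
        lambda match: f"{match.group(1)}'{new_value}'", script_text
    )
    if count == 0:
        raise ValueError("no DSNname attribute found in script")
    return updated


# Hypothetical fragment of a generated script.
sample = "VARCHAR DSNName = 'ORCL_SERVICE',\nVARCHAR UserName = 'scott'"
print(set_dsn_name(sample, "ORCL_DSN"))
```

    Review each script after an automated edit like this one, because the pattern replaces every DSNname assignment it finds.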

    3 (Optional) Select the check boxes to include your user ID and password in the generated scripts. The default is to enter placeholders.

    4 Click OK.

    The Job Source dialog box displays the directory structure of the active Oracle server.

    5 From the directory tree in the left pane, select a database and table that are the source of data for the job.

    Notice:
    Do not select tables that contain character large object (CLOB), binary large object (BLOB), JSON, or XML data types.

    6 In the right pane, select up to 450 columns to be included in the source schema, or click Select All or Select None.

    Note: The values under TPT Type are names of the data types associated with the Teradata PT columns; the values under DBS Type are the data types from the source database. When Teradata PT gets a column name from a source table, it looks at the definition schema of the table to determine an accurate data type. Sometimes these types can be recorded incorrectly or as a “?” when the Wizard cannot properly determine the data type. This often occurs when reading user-defined data types (UDTs).

    To change or correct a Teradata PT data type, click Edit Type (or right-click), and select the correct data type from the shortcut menu. You can also enter the length, precision, and scale if applicable. The precision and scale fields appear only when Decimal/Numeric is selected.

    7 Click Next to open the Job Destination dialog box.

    8 Continue with Step 3 - Select a Destination.

    ODBC-Compliant Database as a Data Source

    Use the ODBC DSN option from the Job Source dialog box to log onto an ODBC-compliant database. Then select a specific table as a data source for a job.

    The ODBC Logon dialog box appears, optionally allowing the User ID and Password to be included in the Wizard job.

    1 From the Source Type list in the Job Source dialog box, select ODBC DSN.

    2 In the ODBC Logon dialog box, type the host name, user ID, and password to log on.

    3 (Optional) Select the check boxes to include your user ID and password in the generated scripts. The default is to enter placeholders.

    4 Click OK.

    The Job Source dialog box displays the database and table hierarchy of the ODBC-compliant data source you logged onto.

    5 In the left pane, select a database and a table as the data source for the job.

    Notice:
    Do not select tables that contain character large object (CLOB), binary large object (BLOB), JSON, or XML data types.

    6 In the right pane, select up to 450 columns to be included in the source schema, or click Select All or Select None. (Press Ctrl+click to select multiple columns.)

    Note: The values under TPT Type are names of the data types associated with the Teradata PT columns; the values under DBS Type are the data types from the source database. When Teradata PT gets a column name from a source table, it looks at the definition schema of the table to determine an accurate data type. Sometimes these types can be recorded incorrectly or as a “?” when the Wizard cannot properly determine the data type. This often occurs when reading user-defined data types (UDTs).

    To change or correct a Teradata PT data type, click Edit Type (or right-click), and select the correct data type from the shortcut menu. You can also enter the length, precision, and scale if applicable. The precision and scale fields are available only when Decimal/Numeric is selected.

    7 Click Next to open the Job Destination dialog box.

    8 Continue with Step 3 - Select a Destination.