About Advanced Job Settings

Teradata® Data Mover User Guide

Product
Teradata Data Mover
Release Number
16.20
Published
November 2021
Language
English (United States)
Last Update
2021-11-04
dita:id
B035-4101
Product Category
Analytical Ecosystem
You can select advanced job save options from the Job Settings tab. Click Advanced to access job performance settings. Data Mover provides default values for these settings.
Teradata Systems
For Teradata systems, you can select these advanced job save options:
Data Streams
Specifies the number of data streams to use between the source and target databases for Teradata ARC, Aster (to/from Teradata) or Teradata PT API jobs. For DSA jobs, specify the number of streams per database node. All other utilities use a single data stream.
Source Sessions
Specifies the number of sessions per data stream for the source system.
Target Sessions
Specifies the number of sessions per data stream for the target system.
Max Agents per Task
Specifies the maximum number of agents that Data Mover allocates at the same time to one task in jobs that use Teradata ARC or the Teradata PT API. If multiple agents are installed in the Data Mover environment, you can enter an integer value greater than one to improve performance for a job that copies large amounts of data. If you do not provide a value for Max Agents per Task, Data Mover dynamically calculates a value at runtime.
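These performance settings also map to parameters in a Data Mover command-line job definition. The fragment below is an illustrative sketch only; the element names mirror the portlet settings above but are assumptions, not a verified schema, so check them against the sample parameter files shipped with your installation.

```xml
<!-- Hypothetical fragment of a Data Mover job definition (parameters.xml).
     Element names are assumed from the portlet setting names above. -->
<data_streams>4</data_streams>                 <!-- streams between source and target -->
<source_sessions>2</source_sessions>           <!-- sessions per data stream, source side -->
<target_sessions>2</target_sessions>           <!-- sessions per data stream, target side -->
<max_agents_per_task>2</max_agents_per_task>   <!-- omit to let Data Mover calculate at runtime -->
```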
Force Utility
Forces Data Mover to use a specific Teradata utility or API operator for the copy job. By default, Data Mover automatically selects the best utility for the job.
Source Character Set
Specifies the session character set that is used to communicate with the source system.
Target Character Set
Specifies the session character set that is used to communicate with the target system.
Target Group Name
Specifies a shared pipe target group to run DSA jobs instead of having Data Mover automatically select one. If the specified target group does not exist, the job fails.
Parallel Builds
Specifies the number of tables with indices that can be built concurrently when using DSA. The maximum (and default) number of concurrent builds is 5.
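The utility and DSA-related settings above might appear in a job definition roughly as follows. This is a sketch under the same assumption as before: the element names are inferred from the setting names, not taken from a verified schema.

```xml
<!-- Hypothetical fragment; element names are assumptions based on the settings above. -->
<force_utility>dsa</force_utility>                      <!-- override automatic utility selection -->
<source_character_set>UTF8</source_character_set>       <!-- session character set, source -->
<target_character_set>UTF8</target_character_set>       <!-- session character set, target -->
<target_group_name>dsa_tg1</target_group_name>          <!-- must already exist, or the job fails -->
<parallel_builds>3</parallel_builds>                    <!-- concurrent index builds, max 5 -->
```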
Teradata and Aster Systems
For Teradata to Aster and Aster to Teradata, you can select these advanced job save options:
Max Sessions
Specifies the maximum number of sessions that can be established with Teradata AMPs; this value must always be greater than or equal to the number of instances (data_streams). For Aster to Teradata jobs, if you do not specify a value for number of instances (data_streams), the value for max sessions must be greater than or equal to the vWorkers count, because Aster sets the data streams value equal to the vWorkers count by default.
The Teradata system must allow the specified number of sessions. For Aster to Teradata jobs, if you do not specify a value for max sessions, the Teradata system must allow a session count greater than or equal to the vWorkers count. If the Teradata system does not allow that number of sessions, the job fails. In that case, update the Teradata system settings to allow the required number of sessions, or create a temporary DIMENSION table with distribution by replication (if the source table is not already a replicated DIMENSION table) and move that table instead.
Number of Instances
Number of instances to participate in the Teradata PT import into Aster Database or export into Teradata Database. The value of this clause cannot exceed the number of virtual workers (vWorkers) in Aster Database or the number of AMPs in the Teradata Database.
  • For Teradata to Aster jobs, if the vWorker count is less than the AMP count, the Aster connector uses the vWorker count as the default value.
  • For Aster to Teradata jobs, if the AMP count is less than the vWorker count and the Aster version is earlier than 6.10, Data Mover uses a temporary replicated DIMENSION table by default (if the source table is not a dimension table) to move the source table. If the Aster version is 6.10 or later, Aster uses automatic tuning to move the data. For more information about the Aster auto tuning feature, see version 6.10 or later of the Teradata Aster® Database User Guide.
  • For Aster to Teradata jobs, the number of instances (data_streams) must equal the vWorkers count; otherwise, Data Mover ignores the user-specified data streams value and uses the vWorkers count instead.
Query Timeout
Specifies the response time allowed in seconds. If not supplied, the default timeout is 30 minutes. Data Mover allows all non-negative values; however, values greater than 3600 cause the job to fail at runtime.
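For Teradata/Aster jobs, the numeric settings above could be expressed in a job definition roughly as shown below. Again, the element names are hypothetical illustrations of the settings, not a verified schema.

```xml
<!-- Hypothetical fragment; element names are assumptions based on the settings above. -->
<max_sessions>8</max_sessions>                  <!-- must be >= number of instances (data_streams) -->
<number_of_instances>4</number_of_instances>    <!-- cannot exceed vWorkers (Aster) or AMPs (Teradata) -->
<query_timeout>600</query_timeout>              <!-- seconds; values over 3600 fail at runtime -->
```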
Preserve column case
For Teradata to Aster, specifies case for column names when loading data to Aster Database. Valid values are:
  • No: Default. Changes all column names to lowercase.
  • Yes: Keeps the existing case for column names when transferring to Aster Database.
Skip error records
For Teradata to Aster, specifies treatment of errors encountered when loading data to Aster Database. The connector can encounter an error parsing a row containing data that is valid in Teradata but is not supported in Aster Database.
  • No: (Default) Load is aborted upon encountering an error, and no rows are loaded to Aster Database.
  • Yes: Skips records that cause errors and continues loading the remaining rows into Aster Database.
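The two Teradata-to-Aster load options above might be set together in a job definition along these lines. As with the earlier fragments, the element names are assumed from the setting names and should be verified against your installation's documentation.

```xml
<!-- Hypothetical fragment; element names are assumptions based on the settings above. -->
<preserve_column_case>yes</preserve_column_case>  <!-- keep mixed-case column names in Aster -->
<skip_error_records>yes</skip_error_records>      <!-- skip unparseable rows instead of aborting -->
```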
Teradata and Hadoop Systems
For Teradata to Hadoop and Hadoop to Teradata, you can select these advanced job save options:
Force Utility
Forces Data Mover to use a specific utility for Hadoop copy operations. By default, the Data Mover daemon uses SQL-H to move the table; if SQL-H cannot move it, Teradata Connector for Hadoop (TDCH) is used instead.
Transfer Method
Teradata Connector for Hadoop supports these options for data transfer from Teradata to Hadoop:
  • Default: Data Mover selects Amp by default if a transfer method is not specified.
  • Hash: The underlying Hadoop connector retrieves rows in a given hash value range of a specified split-by column from a source table in Teradata, and writes those records into a target file in HDFS.
  • Value: The underlying Hadoop connector retrieves rows in a given value range of a specified split-by column from a source table in Teradata, and writes those records into a target file in HDFS.
  • Partition: The underlying Hadoop connector creates a staging PPI table on the source database if the source table is not a PPI table.
  • Amp: The underlying Hadoop connector retrieves rows from one or more AMPs of a source table in Teradata, and writes those records into a target file in HDFS. The Amp option is supported only if the Teradata Database is version 14.10 or later.
Teradata Connector for Hadoop supports these options for data transfer from Hadoop to Teradata:
  • Default: Data Mover selects internal.fastload.
  • batch.insert: The underlying Hadoop connector inserts rows into a NOPI staging table through JDBC batch execution. After all mappers complete their processing, rows in the staging table are moved to the target table by an Insert-Select SQL operation. If batch.insert is selected, you can also specify the size property, whose value is the number of rows (for example, 10000 or 50000).
  • internal.fastload: The underlying Hadoop connector starts a database FastLoad session to load rows into a single NOPI staging table. All database FastLoad sessions are coordinated through an internal protocol. The FastLoad job finishes after all mappers complete their sessions, and then rows in the NOPI staging table are copied to the target table by an Insert-Select SQL operation.
Number of Mappers
Specifies the number of mappers Teradata Connector uses to pull data from Teradata Database.
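The Hadoop-related settings above could be combined in a job definition roughly as follows. The element names here are illustrative assumptions based on the setting names, not a verified schema; confirm them against your installation's sample files.

```xml
<!-- Hypothetical fragment; element names are assumptions based on the settings above. -->
<force_utility>tdch</force_utility>                   <!-- bypass SQL-H and use TDCH directly -->
<transfer_method>internal.fastload</transfer_method>  <!-- Hadoop-to-Teradata transfer method -->
<num_mappers>8</num_mappers>                          <!-- mappers used to pull data from Teradata -->
```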