About Advanced Job Settings - Teradata Data Mover

Teradata Data Mover User Guide

Product: Teradata Data Mover
Release Number: 16.10
Published: June 2017
Language: English (United States)
Last Update: 2018-03-29
dita:mapPath: kmo1482331935137.ditamap
dita:ditavalPath: ft:empty
dita:id: B035-4101
lifecycle: previous
Product Category: Analytical Ecosystem
You can select advanced job save options from the Job Settings tab. Click Advanced to access job performance settings. Data Mover provides default values for these settings.
Teradata Systems
For Teradata systems, you can select these advanced job save options:
Data Streams
Specifies the number of data streams to use between the source and target databases for jobs that use Teradata ARC, Teradata PT API, or Aster (to/from Teradata). All other utilities use a single data stream.
Source Sessions
Specifies the number of sessions per data stream for the source system.
Target Sessions
Specifies the number of sessions per data stream for the target system.
Max Agents per Task
Specifies the maximum number of agents that Data Mover allocates at the same time to one task in jobs that use Teradata ARC or the Teradata PT API. If multiple agents are installed in the Data Mover environment, you can enter an integer value greater than one to improve performance for a job that copies large amounts of data. If you do not provide a value for Max Agents per Task, the Data Mover portlet uses a default value of 1.
Force Utility
Forces Data Mover to use a specific Teradata utility or API operator for the copy job. By default, Data Mover automatically selects the best utility for the job; this setting overrides that selection.
Source Character Set
Specifies the session character set that is used to communicate with the source system.
Target Character Set
Specifies the session character set that is used to communicate with the target system.
Teradata and Aster Systems
For Teradata to Aster and Aster to Teradata jobs, you can select these advanced job save options:
Max Sessions
Specifies the maximum number of sessions that can be established with Teradata AMPs. This value must be greater than or equal to the number of instances (data_streams). For Aster to Teradata jobs, if you do not specify a value for the number of instances (data_streams), the value for Max Sessions must be greater than or equal to the vWorker count, because Aster defaults the data streams value to the vWorker count.
The Teradata system must allow the specified number of sessions. For Aster to Teradata jobs, if you do not specify a value for Max Sessions, the Teradata system must allow a session count greater than or equal to the vWorker count. If the Teradata system does not allow that number of sessions, the job fails. If the job fails, update the Teradata system settings to allow the required number of sessions, or create a temporary DIMENSION table with distribution by replication (if the source table is not a replicated DIMENSION table) and then move that table.
Number of Instances
Specifies the number of instances that participate in the parallel TPT import into Aster Database or export into Teradata Database. The value cannot exceed the number of virtual workers (vWorkers) in Aster Database or the number of AMPs in Teradata Database.
  • For Teradata to Aster jobs, if the vWorker count is less than the AMP count, the Aster connector defaults the number of instances to the vWorker count.
  • For Aster to Teradata jobs, if the AMP count is less than the vWorker count and the Aster version is earlier than 6.10, Data Mover moves the source table through a temporary replicated DIMENSION table by default (if the source table is not a DIMENSION table). If the Aster version is 6.10 or later, Aster uses automatic tuning to move the data. For more information about the Aster automatic tuning feature, see version 6.10 or later of the Aster Database User Guide.
  • For Aster to Teradata jobs, the number of instances (data_streams) must equal the vWorker count; otherwise, Data Mover ignores the input value and uses the vWorker count.
Query Timeout
Specifies the response time allowed, in seconds. If not specified, the default timeout is 30 minutes (1800 seconds). Data Mover accepts any non-negative value; however, values greater than 3600 cause the job to fail at runtime.
Preserve column case
For Teradata to Aster, specifies case for column names when loading data to Aster Database. Values:
  • No: Default. Changes all column names to lowercase.
  • Yes: Keeps the existing case for column names when transferring to Aster Database.
Skip error records
For Teradata to Aster, specifies treatment of errors encountered when loading data to Aster Database. The connector can encounter an error parsing a row containing data that is valid in Teradata but is not supported in Aster Database.
  • No: Default. Load is aborted upon encountering an error, and no rows are loaded to Aster Database.
  • Yes: Skips records that cause errors and continues loading the remaining rows into Aster Database.
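For Teradata-to-Aster and Aster-to-Teradata jobs created from the command line, the equivalent settings might appear in a job definition as sketched below. This is illustrative only; the element names (max_sessions, query_timeout, and so on) are assumptions to verify against the command-line reference for your release.

```xml
<!-- Illustrative Aster job settings fragment (element names assumed) -->
<max_sessions>16</max_sessions>              <!-- must be >= number of instances (data_streams) -->
<number_of_instances>8</number_of_instances> <!-- cannot exceed vWorker or AMP count -->
<query_timeout>1800</query_timeout>          <!-- seconds; values over 3600 fail at runtime -->
<preserve_column_case>yes</preserve_column_case> <!-- keep column-name case when loading to Aster -->
<skip_error_records>no</skip_error_records>      <!-- abort the load on the first parse error -->
```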
Teradata and Hadoop Systems
For Teradata to Hadoop and Hadoop to Teradata jobs, you can select these advanced job save options:
Force Utility
Forces Data Mover to use a specific utility for Hadoop copy operations. By default, the Data Mover daemon uses SQL-H to move the table; if SQL-H cannot move the table, Teradata Connector for Hadoop is used instead.
Transfer Method
Teradata Connector for Hadoop supports these options for data transfer from Teradata to Hadoop:
Default: Data Mover selects Hash if the source Teradata Database is 14.00 or earlier, or Amps if the source Teradata Database is 14.10 or later.
Hash: The underlying Hadoop connector retrieves rows in a given hash value range of a specified split-by column from a source table in Teradata, and writes those records into a target file in HDFS.
Value: The underlying Hadoop connector retrieves rows in a given value range of a specified split-by column from a source table in Teradata, and writes those records into a target file in HDFS.
Partition: The underlying Hadoop connector creates a staging PPI table on the source database if the source table is not a PPI table.
Amps: The underlying Hadoop connector retrieves rows from one or more AMPs of a source table in Teradata, and writes those records into a target file in HDFS. The Amps option is supported only if the Teradata Database is 14.10 or later.
Teradata Connector for Hadoop supports these options for data transfer from Hadoop to Teradata:
Default: Data Mover selects internal.fastload.
batch.insert: The underlying Hadoop connector inserts rows into a NOPI staging table via JDBC batch execution. After all mappers complete their processing, rows in the staging table are moved to the target table by an Insert-Select SQL operation. If batch.insert is selected, you can also specify the batch size property; the value is the number of rows (for example, 10000 or 50000).
internal.fastload: The underlying Hadoop connector starts a database FastLoad session to load rows into a single NOPI staging table. All database FastLoad sessions are coordinated via an internal protocol. The FastLoad job finishes after all mappers complete their execution, and then rows in the NOPI staging table are copied to the target table by an Insert-Select SQL operation.
Number of Mappers
Specifies the number of mappers Teradata Connector uses to pull data from Teradata Database.
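For Hadoop jobs created from the command line, the utility, transfer method, and mapper count above might be expressed in a job definition as sketched below. The element names (force_utility, transfer_method, num_mappers) and the value hadoop_connector are assumptions; confirm them against the Data Mover command-line reference for your release.

```xml
<!-- Illustrative Hadoop job settings fragment (element names assumed) -->
<force_utility>hadoop_connector</force_utility>     <!-- bypass SQL-H; use Teradata Connector for Hadoop -->
<transfer_method>internal.fastload</transfer_method> <!-- FastLoad into a NOPI staging table -->
<num_mappers>8</num_mappers>                         <!-- mappers pulling data from Teradata Database -->
```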