start (Teradata Open Distribution for Hadoop)

Teradata Data Mover User Guide

brand
Analytical Ecosystem
prodname
Teradata Data Mover
vrm_release
16.10
category
User Guide
featnum
B035-4101-067K

Purpose

The start command starts a job that was created with the create command. You can specify job variable values different from those originally used by entering them on the command line at run time. If the daemon does not have sufficient resources to run the job immediately, the job is queued.

All parameters in this section are specific to the CLI.
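As a sketch, a minimal start invocation might look like the following. The job name is hypothetical, and the example is guarded so it is a no-op on machines without the Data Mover CLI:

```shell
JOB_NAME="sales_backup"   # hypothetical job, created earlier with the create command
# Run only where the Data Mover CLI is installed.
if command -v datamove >/dev/null 2>&1; then
    datamove start -job_name "$JOB_NAME"
fi
```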

Parameters - Hadoop Specific

These parameters are specific to the start command for Hadoop when using the CLI, and are common to both Teradata to Hadoop and Hadoop to Teradata.

See Parameter Order.

hadoop_file_delimiter
[Optional] Specifies a character delimiter for columns. If not specified, comma (',') is the delimiter. This option applies only if hadoop_file_option is specified as Text.
hadoop_file_option
[Optional] Values supported by the Teradata Connector for Hadoop and T2H:
  • Text (Default)
  • RC
  • ORC
hadoop_number_mappers
[Optional] Specifies the number of mappers Teradata Connector uses to pull data from Teradata Database.
hadoop_transfer_batch_size
[Optional] If you specify batch_insert as the hadoop_transfer_method value, you can also specify this parameter as a value representing the number of rows (for example, 10000, 50000). This property is not applicable when you specify internal_fastload as the hadoop_transfer_method value.
hadoop_transfer_method
[Optional] The method Teradata Connector uses to transfer data from Hadoop to Teradata.
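The Hadoop-specific parameters above can be supplied on the start command line to override the values used at job creation. A hedged sketch for a Hadoop to Teradata job (the job name and values are hypothetical; the parameter names are the ones documented in this section):

```shell
JOB_NAME="h2t_daily"   # hypothetical Hadoop-to-Teradata job
# Guarded so the sketch is harmless where the Data Mover CLI is absent.
if command -v datamove >/dev/null 2>&1; then
    datamove start -job_name "$JOB_NAME" \
        -hadoop_file_option Text \
        -hadoop_file_delimiter "|" \
        -hadoop_transfer_method batch_insert \
        -hadoop_transfer_batch_size 50000
fi
```

Note that hadoop_file_delimiter takes effect here only because hadoop_file_option is Text, and hadoop_transfer_batch_size applies only because the transfer method is batch_insert, as described above.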
source_hadoop_file_system_url | target_hadoop_file_system_url
The values specified for these parameters must start with http://, followed by the system name or IP address and the port number. If the logon mechanism is kerberos, the host name must be the fully qualified domain name.
Value: http://webhdfs:50070 or http://httpfs:14000
Description: Retrieves the configuration file stored in HDFS to execute TDCH jobs, and the logs generated by Teradata Connector for Hadoop jobs. Specify either the WebHDFS REST URL or the HttpFS REST URL. The default port for WebHDFS is 50070; the default port for HttpFS is 14000.
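Before starting a job, you can confirm that the file system REST endpoint is reachable. A minimal sketch using curl, assuming a hypothetical host webhdfs.example.com (substitute your own system name):

```shell
# Hypothetical WebHDFS endpoint; replace the host with your system name.
SOURCE_HADOOP_FILE_SYSTEM_URL="http://webhdfs.example.com:50070"

# The WebHDFS REST API is served under /webhdfs/v1; a LISTSTATUS on the
# root directory is a cheap reachability check. "|| true" keeps the
# sketch harmless where the host is unreachable.
curl -s -m 5 "${SOURCE_HADOOP_FILE_SYSTEM_URL}/webhdfs/v1/?op=LISTSTATUS" || true
```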
source_hadoop_oozie_url | target_hadoop_oozie_url
The values specified for these parameters must start with http://, followed by the system name or IP address and the port number. If the logon mechanism is kerberos, the host name must be the fully qualified domain name.
Value: http://oozie:11000
Description: Runs Hive queries and Teradata Connector for Hadoop (TDCH) jobs for data movement. To construct the URL, replace oozie with the system name where the Oozie server resides. Port 11000 is the default for Oozie.
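The Oozie endpoint can be checked the same way. A sketch assuming a hypothetical host oozie.example.com; the admin status resource of Oozie's web services API reports whether the server is in NORMAL mode:

```shell
# Hypothetical Oozie endpoint; replace the host with your system name.
SOURCE_HADOOP_OOZIE_URL="http://oozie.example.com:11000"

# "|| true" keeps the sketch harmless where the host is unreachable.
curl -s -m 5 "${SOURCE_HADOOP_OOZIE_URL}/oozie/v1/admin/status" || true
```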
source_hadoop_webhcat_url | target_hadoop_webhcat_url
The values specified for these parameters must start with http://, followed by the system name or IP address and the port number. If the logon mechanism is kerberos, the host name must be the fully qualified domain name.
Value: http://webhcat:50111
Description: Retrieves metadata such as databases, tables, columns, and so on. To construct the URL, replace webhcat with the system name where the WebHCAT server resides. Port 50111 is the default for WebHCAT.
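Likewise for WebHCAT. A sketch assuming a hypothetical host webhcat.example.com; WebHCAT's REST API is served under /templeton/v1, and its status resource is a simple liveness check:

```shell
# Hypothetical WebHCAT endpoint; replace the host with your system name.
SOURCE_HADOOP_WEBHCAT_URL="http://webhcat.example.com:50111"

# "|| true" keeps the sketch harmless where the host is unreachable.
curl -s -m 5 "${SOURCE_HADOOP_WEBHCAT_URL}/templeton/v1/status" || true
```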
source_hive_logon_mechanism | target_hive_logon_mechanism
The security protocol for logging in to the source or target Hadoop File System. Available values are default and kerberos.
source_hive_password | target_hive_password
Password of the user who has access to the tables in the source or target Hadoop File System.
If the logon mechanism is default, this parameter is optional. If the logon mechanism is kerberos, this parameter is required and must be the password for the hive_user.
source_hive_password_encrypted | target_hive_password_encrypted
Encrypted password of the user who has access to the tables in the source or target Hadoop File System. Not a valid parameter if source_hive_password | target_hive_password is specified.
If the logon mechanism is default, this parameter is optional. If the logon mechanism is kerberos, this parameter is required and must be the password for the hive_user.
source_hive_user | target_hive_user
Name of the user who has access to the tables in the source or target Hadoop File System.
When the logon mechanism is kerberos, the value for the hive user must adhere to the following convention: kerberos_principal@kerberos_realm
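For example, a hive user value under kerberos might look like the following (the principal and realm are hypothetical; use the values defined by your Kerberos administrator):

```shell
# Follows the kerberos_principal@kerberos_realm convention described above.
TARGET_HIVE_USER="hive@EXAMPLE.COM"
```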