16.10 - Limitations to the TDCH-TPT Interface - Parallel Transporter

Teradata Parallel Transporter Reference

Product: Parallel Transporter
Release Number: 16.10
Published: July 2017
Content Type: Programming Reference
Publication ID: B035-2436-077K
Language: English (United States)

  • To use the TDCH-TPT interface, the Hadoop client jars must be installed on the node where TPT is running, because the DataConnector operator must be able to launch a MapReduce job through a call to the Hadoop CLI.
  • The TDCH-TPT interface is supported only on the Linux platform.
  • Because the DataConnector operator relies on TDCH to read from and write to Hadoop files and tables, many of the traditional DataConnector operator attributes are not supported alongside the TDCH-TPT interface. For example, when using the DataConnector producer, the FileName attribute is superseded by the TDCH-specific HadoopSourcePaths attribute. Similarly, the Format attribute is superseded by the TDCH-specific HadoopFileFormat attribute. If an unsupported attribute is submitted alongside TDCH-specific Hadoop attributes, the TPT job will fail.
  • Because TDCH is a batch-processing tool, many of the DataConnector operator's active data warehousing features are not supported when using the TDCH-TPT interface. For example, because the MapReduce job processes data out of order, the DataConnector operator's checkpoint/restart feature is unavailable when using the TDCH-TPT interface. Similarly, because the TDCH job requires a single file or table name as an argument, the DataConnector operator's directory scan feature is unavailable when using the TDCH-TPT interface. If an unsupported feature is used alongside the TDCH-TPT interface, the TPT job will fail.
  • When using the TDCH-TPT interface to process Hadoop files and tables, multiple instances of the DataConnector operator are not supported. If multiple instances of the DataConnector operator are defined, the TPT job will fail.
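
The limitations above can be sketched as a pre-flight check that runs before a job is submitted. The following Python is an illustration only and is not part of TPT or TDCH: the function name check_tdch_job, its parameters, and the attribute values are hypothetical. It also assumes that any attribute whose name begins with "Hadoop" marks a job as using the TDCH-TPT interface, and that the Hadoop CLI is reachable as "hadoop" on the PATH. Only the two superseded attribute pairs named in this section are checked; checkpoint/restart and directory scan are omitted because this section does not name the attributes that control them.

```python
import shutil
import sys

# Attribute pairs named in this section: traditional attribute -> TDCH-specific one.
# This is not a complete list of unsupported attributes.
SUPERSEDED = {
    "FileName": "HadoopSourcePaths",
    "Format": "HadoopFileFormat",
}


def check_tdch_job(attributes, instances=1, platform=None, hadoop_cli=None):
    """Return a list of problems found for a DataConnector operator definition.

    attributes: dict of operator attribute names to values (hypothetical input).
    instances:  number of DataConnector operator instances the job defines.
    platform / hadoop_cli: overrides for testing; detected when left as None.
    """
    # Assumption: TDCH-specific attributes are recognised by the "Hadoop" prefix.
    uses_tdch = any(name.startswith("Hadoop") for name in attributes)
    if not uses_tdch:
        return []  # The limitations apply only to the TDCH-TPT interface.

    problems = []

    platform = sys.platform if platform is None else platform
    if not platform.startswith("linux"):
        problems.append("TDCH-TPT interface is supported only on Linux")

    # Assumption: the Hadoop CLI executable is named "hadoop".
    if hadoop_cli is None:
        hadoop_cli = shutil.which("hadoop")
    if not hadoop_cli:
        problems.append("Hadoop CLI not found; install the Hadoop client")

    for old, new in SUPERSEDED.items():
        if old in attributes:
            problems.append(f"{old} is superseded by {new}")

    if instances > 1:
        problems.append("multiple DataConnector instances are not supported")

    return problems


if __name__ == "__main__":
    # Placeholder values, not taken from the TPT documentation.
    job = {"HadoopSourcePaths": "/data/in", "FileName": "in.txt"}
    for problem in check_tdch_job(job, instances=2):
        print(problem)
```

A check of this kind only reports conflicts earlier than the job would; TPT itself remains the authority on which attributes and features it rejects.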