16.10 - Limitations of the TDCH-TPT Interface
Teradata Parallel Transporter Reference
- Release: July 2017
- Content Type: Programming Reference
- To use the TDCH-TPT interface, the node on which TPT runs must have the Hadoop client jars installed, because the DataConnector operator launches the underlying MapReduce job through a call to the Hadoop command-line interface (CLI).
- The TDCH-TPT interface is only supported on the Linux platform.
- Because the DataConnector operator relies on TDCH to read from and write to Hadoop files and tables, many of the traditional DataConnector operator attributes are not supported with the TDCH-TPT interface. For example, when using the DataConnector producer, the FileName attribute is superseded by the TDCH-specific HadoopSourcePaths attribute, and the Format attribute is superseded by the TDCH-specific HadoopFileFormat attribute. If an unsupported attribute is submitted alongside TDCH-specific Hadoop attributes, the TPT job will fail.
- Because TDCH is a batch processing tool, many of the DataConnector operator's active data warehousing features are not supported with the TDCH-TPT interface. For example, because the MapReduce job processes data out of order, the DataConnector operator's checkpoint/restart feature is unavailable. Similarly, because the TDCH job requires a single file or table name as an argument, the DataConnector operator's directory scan feature is unavailable. If an unsupported feature is used with the TDCH-TPT interface, the TPT job will fail.
- When using the TDCH-TPT interface to process Hadoop files and tables, multiple instances of the DataConnector operator are not supported. If multiple instances of the DataConnector operator are defined, the TPT job will fail.
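As an illustration of the attribute substitution described above, a DataConnector producer definition for the TDCH-TPT interface might look like the following sketch. The operator name, schema name, and attribute values are hypothetical; HadoopSourcePaths and HadoopFileFormat take the place of the traditional FileName and Format attributes.

```
DEFINE OPERATOR HDFS_READER
DESCRIPTION 'DataConnector producer reading a Hadoop file via TDCH'
TYPE DATACONNECTOR PRODUCER
SCHEMA SALES_SCHEMA
ATTRIBUTES
(
    /* TDCH-specific attributes; FileName and Format must not appear here */
    VARCHAR HadoopSourcePaths = '/user/hive/warehouse/sales',
    VARCHAR HadoopFileFormat  = 'textfile'
);
```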
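Because only a single DataConnector operator instance is supported with the TDCH-TPT interface, the APPLY statement should not request additional instances. In TPT script syntax the instance count appears in brackets after the operator name; the operator names and target table below are hypothetical.

```
APPLY
    ('INSERT INTO Sales_Table (:col1, :col2);')
TO OPERATOR (LOAD_OPERATOR[1])
SELECT * FROM OPERATOR (HDFS_READER[1]);  /* [1]: a single instance; a higher count would fail */
```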