Setting Up a Teradata Connector for Hadoop - Teradata Studio

Teradata Studio and Studio Express Installation Guide

Product: Teradata Studio
Release Number: 16.10
Published: June 2017
Language: English (United States)
Last Update: 2018-05-03
dita:id: B035-2037
Product Category: Teradata Tools and Utilities
Teradata Studio provides an option to transfer data to and from Hadoop systems. The Smart Loader for Hadoop feature uses the Teradata Connector for Hadoop (TDCH) installed on the Hadoop node. Oozie runs the data transfer workflow.
  1. Download the TDCH from Teradata Downloads (under Connectivity) and install it on your Hadoop system.
  2. Download the Configure Oozie script (configureOozie.sh) from Teradata Downloads onto your Hadoop system.
  3. Change the mode so the script is executable: chmod +x configureOozie.sh
  4. Run dos2unix on configureOozie.sh to remove any hidden Windows characters (such as carriage returns) from the file: dos2unix configureOozie.sh
  5. Execute configureOozie.sh as the root user, providing the locations of your Hadoop services:
    Usage: ./configureOozie.sh nn=nameNodeHost [jt=jobTrackerHost] [oozie=oozieHost] [nnPort=nameNodePortNum] [jtPort=jobTrackerPortNum] [ooziePort=ooziePortNum] [webhcatPort=webhcatPortNum] [webhdfsPort=webhdfsPortNum]
    where
    Parameter            Definition                     Value
    nameNodeHost         The Name Node host name        Required
    jobTrackerHost       The Job Tracker host name      Uses the nn parameter value if omitted
    oozieHost            The Oozie host name            Uses the nn parameter value if omitted
    nameNodePortNum      The Name Node port number      8020 if omitted
    jobTrackerPortNum    The Job Tracker port number    50300 if omitted
    ooziePortNum         The Oozie port number          11000 if omitted
    webhcatPortNum       The WebHCatalog port number    50111 if omitted
    webhdfsPortNum       The WebHDFS port number        50070 if omitted
    The port numbers are the Hortonworks Data Platform (HDP) defaults. If all the services are hosted on a single system and use the default ports, only the nn parameter is needed.
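To make the defaulting rules in the parameter table concrete, the following is a minimal shell sketch that resolves the effective host and port values for a given argument list. It only mirrors the documented defaults and does not call or reproduce configureOozie.sh itself; the function name resolve_oozie_args and the host names hadoop1 and hadoop2 are hypothetical and chosen for illustration.

```shell
#!/bin/sh
# Illustrative only: mirrors the documented defaults; it does NOT run
# configureOozie.sh. resolve_oozie_args is a hypothetical helper name.
resolve_oozie_args() {
  # Host values start empty; ports start at the documented defaults.
  nn="" jt="" oozie=""
  nnPort=8020 jtPort=50300 ooziePort=11000 webhcatPort=50111 webhdfsPort=50070
  for arg in "$@"; do
    case "$arg" in
      nn=*)          nn=${arg#nn=} ;;
      jt=*)          jt=${arg#jt=} ;;
      oozie=*)       oozie=${arg#oozie=} ;;
      nnPort=*)      nnPort=${arg#nnPort=} ;;
      jtPort=*)      jtPort=${arg#jtPort=} ;;
      ooziePort=*)   ooziePort=${arg#ooziePort=} ;;
      webhcatPort=*) webhcatPort=${arg#webhcatPort=} ;;
      webhdfsPort=*) webhdfsPort=${arg#webhdfsPort=} ;;
      *) echo "unknown argument: $arg" >&2; return 2 ;;
    esac
  done
  # nn is the only required parameter.
  if [ -z "$nn" ]; then
    echo "nn is required" >&2
    return 1
  fi
  # jt and oozie fall back to the nn value when omitted.
  jt=${jt:-$nn}
  oozie=${oozie:-$nn}
  echo "nn=$nn:$nnPort jt=$jt:$jtPort oozie=$oozie:$ooziePort webhcatPort=$webhcatPort webhdfsPort=$webhdfsPort"
}

# All services on one host with default ports: only nn is needed.
resolve_oozie_args nn=hadoop1
# Oozie on a separate (hypothetical) host and a non-default port.
resolve_oozie_args nn=hadoop1 oozie=hadoop2 ooziePort=11001
```

The first call prints the name node host for every service with the default ports; the second overrides only the Oozie host and port and leaves the remaining values at their defaults.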
    The script exits with an error message if the TDCH is not in its expected location. Otherwise, the script displays a message listing the parameter values. For example:
    The following is the specification of the Hadoop services used
    by the Oozie workflows: 
    {
    "Distribution":"HDP",
    "DistributionVersion":"3.2.1",
    "WebHCatalog":"hostname",
    "WebHCatalogPort":50111,
    "WebHDFS":"hostname",
    "WebHDFSPort":50070,
    "JobTracker":"hostname",
    "JobTrackerPort":50300,
    "NameNode":"hostname",
    "NameNodePort":8020,
    "Oozie":"hostname",
    "OoziePort":11000
    }
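If you save the displayed specification, individual values can be read back for a quick check of what the script recorded. The following is a rough sketch, assuming the output has the flat "key":value layout shown in the example above; spec_value is a hypothetical helper, not part of configureOozie.sh, and it is not a general JSON parser.

```shell
#!/bin/sh
# spec_value KEY: print the value of KEY from a specification read on stdin.
# Hypothetical helper; handles only one flat "key":value pair per line,
# with or without quotes around the value and with an optional trailing comma.
spec_value() {
  sed -n 's/^[[:space:]]*"'"$1"'":"\{0,1\}\([^",]*\)"\{0,1\},\{0,1\}[[:space:]]*$/\1/p'
}

# Abridged copy of the example specification shown above.
spec='{
"NameNode":"hostname",
"NameNodePort":8020,
"Oozie":"hostname",
"OoziePort":11000
}'

nn_host=$(printf '%s\n' "$spec" | spec_value NameNode)
nn_port=$(printf '%s\n' "$spec" | spec_value NameNodePort)
echo "NameNode endpoint: $nn_host:$nn_port"   # prints "NameNode endpoint: hostname:8020"
```

A key that is absent from the specification produces empty output, which a calling script can treat as a sign that the value was not recorded.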