Hortonworks Data Platform for Teradata® is composed of master, data, and edge nodes.
- Master Node for Hadoop
  - Controls the cluster by storing metadata and running master services, including:
    - Hive: Queries structured data in HDFS
    - JournalNode: Records HDFS edit log changes from the namenode
    - Namenode: Manages HDFS storage; high availability requires an active and a standby namenode
    - YARN: Schedules application jobs, and manages and allocates resources
    - ZooKeeper: Synchronizes distributed components and monitors the namenode
- Data Node for Hadoop
  - Stores HDFS blocks
  - Answers queries from the namenode for filesystem operations
  - Allows client applications to communicate directly with the data node once the namenode has determined the data location
- Edge Node for Hadoop
  - The edge node allows client applications to run independently of the master node, which reduces the risk of testing new applications. Located between the Hadoop cluster and the customer network, the edge node runs client services for the cluster:
    - Gives external applications and users access to the Hadoop environment
    - Permits access control
    - Enforces policy oversight
    - Provides fast connections by communicating with the Hadoop cluster over the internal InfiniBand network
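
The division of labor between the namenode and the data nodes described above can be illustrated with a small in-memory model. This is a minimal sketch, not Hadoop code: the `NameNode` and `DataNode` classes, the `write_file` and `read_file` helpers, the 4-byte block size, and the round-robin placement are all hypothetical simplifications chosen for illustration. Real HDFS uses much larger blocks, replicates each block, and places blocks using its own policy.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy data node: stores blocks and serves them directly to clients."""
    name: str
    blocks: dict[str, bytes] = field(default_factory=dict)

    def read_block(self, block_id: str) -> bytes:
        return self.blocks[block_id]


@dataclass
class NameNode:
    """Toy namenode: holds metadata only, never the file contents."""
    block_map: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def locate(self, path: str) -> list[tuple[str, str]]:
        # Returns (data node name, block id) pairs in file order.
        return self.block_map[path]


def write_file(namenode: NameNode, datanodes: dict[str, DataNode],
               path: str, data: bytes, block_size: int = 4) -> None:
    """Split data into blocks, place them round-robin (an assumption made
    for this sketch), and record the locations in the namenode."""
    names = sorted(datanodes)
    locations = []
    for offset in range(0, len(data), block_size):
        index = offset // block_size
        node = datanodes[names[index % len(names)]]
        block_id = f"{path}#{index}"
        node.blocks[block_id] = data[offset:offset + block_size]
        locations.append((node.name, block_id))
    namenode.block_map[path] = locations


def read_file(namenode: NameNode, datanodes: dict[str, DataNode],
              path: str) -> bytes:
    """Ask the namenode where the blocks are, then fetch each block
    directly from the data node that stores it."""
    return b"".join(
        datanodes[node_name].read_block(block_id)
        for node_name, block_id in namenode.locate(path)
    )
```

In this model, a client would call `write_file(namenode, datanodes, "/logs/a.txt", b"hello hadoop")` and later `read_file(namenode, datanodes, "/logs/a.txt")`. The namenode is consulted only for block locations, and the bytes themselves come from the data nodes, which mirrors the direct client-to-data-node communication listed for the data node.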