If you have previously installed and then uninstalled an Aster instance and the Hadoop cluster configuration has not changed, you do not need to repeat these YARN configuration changes, but you may want to confirm that the required parameters are still present and in effect.
- Access the ambari-server.
- Determine whether the YARN configuration settings meet the memory requirements for running an Aster instance on the Hadoop cluster:
- Locate the memory allocated for all YARN containers on a node. In the Ambari web interface, in the Memory section under the Node heading, note the Memory Allocated for All YARN Containers on a Node value as value A, which is used later in this procedure for computation purposes. As an example, the value of A is 12GB.
- Determine the maximum memory required for MapReduce jobs. In the Ambari web interface, in the MapReduce section under the MapReduce Framework heading, note the max(MapMemory, ReduceMemory) value as value B, which is used later in this procedure for computation purposes. As an example, the value of B is 5GB.
- Determine the number of vWorkers you want to configure per Worker node. Note the number of vWorkers value as value C, which is used later in this procedure for computation purposes. As an example, the value of C is 2. This number is also used later to calculate the NUM_PARTITIONS parameter when you configure the /home/beehive/.cluster_config file.
- Determine the memory to be configured per vWorker. Note this value as value D, which is used later in this procedure for computation purposes. As an example, the value of D is 2GB.
Teradata Aster recommends 32768MB (32GB) per vWorker. If you cannot allocate 32GB per vWorker due to memory limitations, try a smaller value.
This value is also used later as the containerMemoryInMB parameter when you define values in the /home/beehive/config/asteryarn.cfg file.
- The Aster YARN Application Master runs in a container on one of the data nodes and needs memory allocated. Aster requests 512MB of memory for the Application Master, but what gets allocated is max(512MB, X), where X is the value of Minimum Container Size (Memory), found in the Memory section under the Container heading. Note the max(512MB, X) value as value E, which is used later in this procedure for computation purposes. As an example, if X is 512MB, the value of E (max(512MB, X)) is 512MB.
- Using the values for A, B, C, D, and E in the memory requirements equation below, determine whether the values satisfy the equation, so that memory requirements are met for running an Aster instance on the Hadoop cluster:
A >= (B + C*D + E)
For example, with all values expressed in MB: 12000 >= (5000 + 2*2000 + 512)
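The memory check above can be sketched as a small helper; the numbers used here are the example values from this procedure (in MB), not measurements from a real cluster:

```python
def memory_ok(a, b, c, d, e):
    """Check the memory requirement A >= B + C*D + E.
    a: memory allocated for all YARN containers on a node
    b: max(MapMemory, ReduceMemory) for MapReduce jobs
    c: number of vWorkers per Worker node
    d: memory configured per vWorker
    e: max(512MB, Minimum Container Size) for the Application Master
    All memory values in MB."""
    return a >= b + c * d + e

# Example values from this procedure: A=12000, B=5000, C=2, D=2000, E=512.
print(memory_ok(12000, 5000, 2, 2000, 512))  # True: 12000 >= 9512
```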
- If memory requirements are not met for running an Aster instance on the Hadoop cluster, perform one or more of the following actions to satisfy the equation:
- Reduce the value for Minimum Container Size (Memory) in the Memory section under the Container heading. When determining the minimum container size, consider whether you will run other jobs in addition to Aster YARN and Hive (MapReduce).
- Increase the value for Memory Allocated for All YARN Containers on a Node, which is the A value, in the Memory section under the Node heading.
- Reduce the memory to be configured per vWorker, which is the D value.
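When reducing the D value, the largest per-vWorker memory that still satisfies the equation can be derived by rearranging it to D <= (A - B - E) / C. A minimal sketch, using the example values from this procedure (in MB):

```python
def max_vworker_memory(a, b, c, e):
    """Largest memory per vWorker (D) that still satisfies
    A >= B + C*D + E, i.e. D <= (A - B - E) / C.
    All memory values in MB."""
    return (a - b - e) // c

# Example values from this procedure: A=12000, B=5000, C=2, E=512.
print(max_vworker_memory(12000, 5000, 2, 512))  # 3244
```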
- Confirm that the YARN configuration settings meet the CPU requirements for running an Aster instance on the Hadoop cluster:
- Locate the number of virtual cores. In the Ambari web interface, in the CPU section under the Node heading, note the Number of Virtual Cores value as value A, which is used later in this procedure for computation purposes.
- The Minimum Container Size (Virtual Cores) is 1.
- Locate the maximum number of virtual cores per container. In the Ambari web interface, in the CPU section under the Container heading, note the Maximum Container Size (VCores) value as value B, which is used later in this procedure for computation purposes.
- Determine the number of vWorkers you want to configure per Worker node. Note the number of vWorkers value as value C, which is used later in this procedure for computation purposes. This number is also used later to calculate the NUM_PARTITIONS parameter when you configure the /home/beehive/.cluster_config file.
- Determine the number of virtual cores you want to configure per vWorker. Note this value as value D, which is used later in this procedure for computation purposes.
Teradata recommends four virtual cores per vWorker; one vWorker configured with four virtual cores can manage 500GB of data and ten concurrent users on a supported Hadoop cluster. If you cannot allocate four virtual cores per vWorker due to limitations, allocate fewer virtual cores per vWorker.
The number of virtual cores is also used later as the containerVCore parameter when you configure the /home/beehive/config/asteryarn.cfg file.
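The two per-vWorker values noted in this procedure end up in /home/beehive/config/asteryarn.cfg. A hypothetical fragment, assuming the file uses simple key=value lines (the exact syntax and surrounding entries may differ on your installation):

```
# /home/beehive/config/asteryarn.cfg (illustrative fragment only)
containerMemoryInMB=32768   # value D from the memory section: memory per vWorker
containerVCore=4            # value D from the CPU section: virtual cores per vWorker
```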
- Using the values for A, B, C, and D in the CPU requirements equations below, verify that the values satisfy both equations:
A >= C*D
B >= C*D
Note that (B - C*D) cores are available for the remaining applications. If Hive (MapReduce) hangs, try decreasing C and D.
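The CPU check can be sketched the same way as the memory check; the values below are hypothetical (this procedure does not give example CPU numbers), chosen so some cores remain for other applications:

```python
def cpu_ok(a, b, c, d):
    """Check the CPU requirements A >= C*D and B >= C*D.
    a: number of virtual cores on a node
    b: Maximum Container Size (VCores)
    c: number of vWorkers per Worker node
    d: virtual cores per vWorker"""
    return a >= c * d and b >= c * d

# Hypothetical values: 16 cores per node, container limit of 12 vcores,
# 2 vWorkers with 4 vcores each; B - C*D = 4 cores left for other apps.
print(cpu_ok(16, 12, 2, 4))  # True
```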
- Restart all YARN services.