- Create a .cluster_config file in the /home/beehive directory on the master (queen) node in order to specify the makeup of the Aster instance.
Include this content in the file:
PRIMARY_IP=ip addr
Required. The IP address of the coordinator (queen) node for the Aster instance on the network. Installation fails if any other address is specified.

NUM_WORKERS=integer
Required. The number of worker nodes in the Aster instance.
NOTE: By default, if the number of data nodes listed in the output of the yarn node --list --all command equals the NUM_WORKERS argument, asteryarn attempts to bring up workers on all of the listed nodes. Worker placement also depends on the NUM_PARTITIONS argument: Aster attempts to allocate the minimum number of partitions on each node, so if the number of partitions is less than the number of workers, some nodes are not used.

NUM_PARTITIONS=integer
Required. The total number of active partitions (vWorkers) in the Aster instance. This should be an integral multiple of the number of worker nodes; if it is not, it is rounded down to the nearest multiple.

DB_CHECKSUMS=on
Optional. Indicates whether the database is checksummed. Valid values are off and on. The default value is on.

TARGET_RF=1
Optional. Replication factor. The value is 1.

DB_STORAGE=hdfs
Required. Persistent storage is in HDFS. The value is hdfs.

HDFS_NAMENODE=value
Required when running on HDFS.
If the High Availability (HA) feature is NOT enabled on HDFS, use the IP address of the HDFS namenode. If the HA feature is enabled on HDFS, use the contents of the dfs.nameservices <value> tag as the HDFS_NAMENODE value instead of the HDFS namenode IP address. For example:
<property>
  <name>dfs.nameservices</name>
  <value><your HDFS service name></value>
</property>
dfs.nameservices is located in the hdfs-site.xml file on any HDFS node. The hdfs-site.xml file is typically located in the /etc/hadoop/conf directory.
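
As an illustration of the lookup described above, the following Python sketch extracts the dfs.nameservices value from the contents of an hdfs-site.xml file. The function name `read_nameservices`, the sample XML, and the service name `mycluster` are hypothetical and used only for demonstration; on a real cluster you would pass in the contents of your own hdfs-site.xml.

```python
import xml.etree.ElementTree as ET


def read_nameservices(xml_text):
    """Return the dfs.nameservices value from hdfs-site.xml content, or None."""
    root = ET.fromstring(xml_text)
    # hdfs-site.xml holds a list of <property> elements under <configuration>.
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "dfs.nameservices":
            value = prop.findtext("value", default="").strip()
            return value or None
    return None


# Hypothetical sample content; "mycluster" is a placeholder service name.
SAMPLE = """
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    # Prints the line you would place in .cluster_config when HA is enabled.
    print("HDFS_NAMENODE=" + str(read_nameservices(SAMPLE)))
```

If the function returns None, the property is not defined in the file, which would suggest that HA is not configured and the namenode IP address applies instead.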

WORKER_FQDN_LIST=comma-separated list of strings
Optional. If you do not want to deploy Aster workers on some of the nodes listed in the output of the yarn node --list --all command, use this parameter to specify the nodes on which the Aster workers are deployed. Valid values include any combination of FQDNs, hostnames, or an '*'. For example:
WORKER_FQDN_LIST=hrh012d1,*
WORKER_FQDN_LIST=hrh012d1, hrh012d2, hrh012d3
WORKER_FQDN_LIST=hrh012d1.labs.teradata.com, hrh012d2.labs.teradata.com, hrh012d3.labs.teradata.com,*
WORKER_FQDN_LIST configuration parameter validation considerations:
- The number of entries in the comma-separated list must be less than or equal to the NUM_WORKERS argument.
- If the number of entries is less than the NUM_WORKERS argument, the comma-separated list must contain at least one '*'.
- If invalid hostnames are used, the application master fails at cluster startup.
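
The first two considerations above can be checked before installation with a small script. This is a minimal sketch, not the validation that asteryarn itself performs: the function name `check_worker_fqdn_list` is hypothetical, and because the text does not say whether '*' counts toward the number of entries, the sketch assumes it does not. Hostname validity (the third consideration) is not checked here.

```python
def check_worker_fqdn_list(worker_fqdn_list, num_workers):
    """Return a list of problems; an empty list means both count rules pass."""
    entries = [e.strip() for e in worker_fqdn_list.split(",") if e.strip()]
    has_wildcard = "*" in entries
    # Assumption: '*' is a wildcard and is not counted as a host entry.
    hosts = [e for e in entries if e != "*"]

    problems = []
    if len(hosts) > num_workers:
        problems.append("more host entries than NUM_WORKERS")
    if len(hosts) < num_workers and not has_wildcard:
        problems.append("fewer host entries than NUM_WORKERS but no '*'")
    return problems


if __name__ == "__main__":
    for value in ("hrh012d1,*", "hrh012d1, hrh012d2"):
        print(value, "->", check_worker_fqdn_list(value, 3) or "ok")
```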

NTP_SERVER=value
Optional. When specified, identifies the NTP server for the Aster instance. For example:
NTP_SERVER=18.104.22.168

SQLMR_JAVA_MAX_MEMORY_IN_GB=value
Optional. The amount of memory, in GB, available to the JVM that runs on every worker to execute Java-based Teradata Aster SQL-MapReduce® or Teradata Aster SQL-GR™ functions.
When an Aster instance is activating, it distributes this value to all Aster workers.
A lower value for this parameter increases concurrency in the YARN environment, allowing more queries to run at the same time.
An Aster instance uses this formula to calculate concurrency:
asterYarnConcurrency = (YarnContainerMemoryInMB)/(sqlmrJavaMaxMemoryInMB)
asterConcurrency = min(asterYarnConcurrency, asterInitialConcurrency)
The asterConcurrency result becomes the Aster concurrency after the ncli system softstartup or ncli system softrestart command is run.
In the AX 7.00.02 version of the Aster Execution Engine software, any attempt to set concurrency higher than the automatically set value may result in the shutdown of the Aster instance when the Aster YARN container uses more resources than it is allocated.
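
To make the concurrency formula concrete, here is a minimal Python sketch of the calculation. It rests on two assumptions that the text does not state: that the GB value converts to MB at 1024 MB per GB, and that the division is truncated to a whole number. The function name and all input values are hypothetical.

```python
def aster_concurrency(yarn_container_memory_mb,
                      sqlmr_java_max_memory_gb,
                      aster_initial_concurrency):
    """Apply the two-step concurrency formula from the text."""
    # Assumption: 1 GB = 1024 MB when converting the configured value.
    sqlmr_java_max_memory_mb = sqlmr_java_max_memory_gb * 1024
    # Assumption: the quotient is truncated to an integer.
    aster_yarn_concurrency = yarn_container_memory_mb // sqlmr_java_max_memory_mb
    return min(aster_yarn_concurrency, aster_initial_concurrency)


if __name__ == "__main__":
    # Hypothetical 64 GB container with an initial concurrency of 10.
    for gb in (16, 8, 4):
        print(gb, "GB ->", aster_concurrency(65536, gb, 10))
```

With these hypothetical inputs, halving SQLMR_JAVA_MAX_MEMORY_IN_GB from 16 to 8 doubles the YARN-derived concurrency from 4 to 8, while at 4 GB the result is capped by the initial concurrency of 10.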
Example .cluster_config file contents:
PRIMARY_IP=184.108.40.206
TARGET_RF=1
DB_CHECKSUMS=on
DB_STORAGE=hdfs
HDFS_NAMENODE=220.127.116.11
NUM_WORKERS=3
NUM_PARTITIONS=6
WORKER_FQDN_LIST=hrh012d1,*
NTP_SERVER=18.104.22.168
SQLMR_JAVA_MAX_MEMORY_IN_GB=16
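
Before starting the installation, a file like this can be sanity-checked with a short script. The sketch below assumes the file holds one KEY=value pair per line (the exact syntax accepted by the installer, such as comments or quoting, is not described here). It reports any missing required parameters and shows the NUM_PARTITIONS value after rounding down to a multiple of NUM_WORKERS. All function names are hypothetical.

```python
REQUIRED_KEYS = ("PRIMARY_IP", "NUM_WORKERS", "NUM_PARTITIONS",
                 "DB_STORAGE", "HDFS_NAMENODE")

# The example file contents, one KEY=value pair per line.
EXAMPLE = """\
PRIMARY_IP=184.108.40.206
TARGET_RF=1
DB_CHECKSUMS=on
DB_STORAGE=hdfs
HDFS_NAMENODE=220.127.116.11
NUM_WORKERS=3
NUM_PARTITIONS=6
WORKER_FQDN_LIST=hrh012d1,*
NTP_SERVER=18.104.22.168
SQLMR_JAVA_MAX_MEMORY_IN_GB=16
"""


def parse_cluster_config(text):
    """Parse KEY=value lines into a dict, skipping blank lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a KEY=value line: " + line)
        config[key.strip()] = value.strip()
    return config


def missing_required(config):
    """Return the required parameters that are absent from the config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def effective_partitions(config):
    """Round NUM_PARTITIONS down to a multiple of NUM_WORKERS."""
    workers = int(config["NUM_WORKERS"])
    partitions = int(config["NUM_PARTITIONS"])
    return (partitions // workers) * workers


if __name__ == "__main__":
    cfg = parse_cluster_config(EXAMPLE)
    print("missing:", missing_required(cfg))
    print("effective NUM_PARTITIONS:", effective_partitions(cfg))
```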