Executing the Teradata Aster Spark Queen Installation/Configuration Script - Aster Analytics

Teradata Aster® Spark Connector User Guide

Product
Aster Analytics
Release Number
7.00.00.01
Published
May 2017
Language
English (United States)
Last Update
2018-04-13
dita:mapPath
dbt1482959363906.ditamap
dita:ditavalPath
Generic_no_ie_no_tempfilter.ditaval
dita:id
dbt1482959363906
lifecycle
previous
Product Category
Software

The configureAsterSpark script generated the installation/configuration script AsterSpark_queen_host_name_queen.sh, which installs and configures the files that allow Aster Database nodes to use the RunOnSpark function to query the Hadoop/Spark cluster.

The AsterSpark_queen_host_name_queen.sh script is idempotent: rerunning it does not repeat completed work. If an item to be created or copied already exists, the script reports that fact and continues.
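The idempotent behavior described above follows a check-then-act pattern. This is a minimal sketch of that pattern, not the generated script itself; the function name and paths are illustrative:

```shell
#!/bin/sh
# Sketch of an idempotent install step: if the target already exists,
# report it and continue instead of redoing the work.
install_file() {
    src="$1"; dest="$2"
    if [ -e "$dest" ]; then
        echo "exists: $dest (skipping)"
    else
        cp "$src" "$dest" && echo "installed: $dest"
    fi
}

# Running the same step twice performs the work only once.
tmpdir=$(mktemp -d)
echo "demo" > "$tmpdir/source.conf"
install_file "$tmpdir/source.conf" "$tmpdir/installed.conf"
install_file "$tmpdir/source.conf" "$tmpdir/installed.conf"
rm -r "$tmpdir"
```

Because every step checks before acting, a partially failed run can simply be rerun after the underlying problem is fixed.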
  1. On the queen node, run the script:
    ./AsterSpark_queen_host_name_queen.sh
    The script:
    1. Gives Aster Database vworkers OpenSSH access to the Hadoop Spark name node.

The vworkers run RunOnSpark queries as the user extensibility. To secure the IDENTITYFILE, the script transfers its ownership to the user extensibility and sets permissions that limit access to the owner.

    2. Copies Hadoop configuration files from the Hadoop Spark name node to the vworkers.
    3. If necessary, copies the Hadoop jar files from the Hadoop Spark name node to the vworkers.
      Hadoop Distribution   Hadoop jar files in SQL-H-supplied directory
      HDP2.3                /home/beehive/partner/hadoop/HDP2.3
      HDP2.1                /home/beehive/partner/hadoop/HDP2.1
      HDP1.3.2              /home/beehive/partner/hadoop/HDP1.3.2

      This step is unnecessary for HDP Hadoop distributions supported by Aster SQL-H, which provides the required Hadoop jar files.

    4. Copies Spark client files from the Hadoop Spark name node to the vworkers.
    5. Copies Spark client configuration from the Hadoop Spark name node to the vworkers.
    6. Creates four spark.config identifiers.

      Hortonworks and Cloudera Spark can use all four spark.config identifiers in RunOnSpark queries.

    7. Installs RunOnSpark.zip.
    8. Tests the four spark.config identifiers on a schema that it creates and populates, then drops that schema after the test.
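Step 1 above secures the SSH identity file by restricting it to its owner. The permission model can be sketched as follows; the function name is hypothetical and the file path is a temporary stand-in, not the actual IDENTITYFILE location:

```shell
#!/bin/sh
# Sketch of the permission model applied to the IDENTITYFILE:
# mode 600 allows owner read/write and denies group and other access,
# which is also what OpenSSH requires before it will use a private key.
secure_identity_file() {
    chmod 600 "$1"
}

keyfile=$(mktemp)               # stand-in for the real IDENTITYFILE
secure_identity_file "$keyfile"
stat -c '%a' "$keyfile"         # prints the octal mode on GNU stat
rm -f "$keyfile"
```

In the generated script the ownership transfer (to the extensibility user) happens as well; that part requires root privileges and is omitted here.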
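Steps 2 through 5 above all follow the same pattern: pull files from the Hadoop Spark name node and push them to every vworker. This dry-run sketch echoes the transfer commands rather than executing them; the host names, user, and directory are assumptions for illustration, not values from the generated script:

```shell
#!/bin/sh
# Dry-run sketch of the fan-out copy from the name node to the vworkers.
NAMENODE=namenode.example.com    # hypothetical Hadoop Spark name node
CONF_DIR=/etc/hadoop/conf        # hypothetical client-configuration directory

# Echo instead of executing, so the command flow is visible without a cluster.
copy_conf() {
    echo scp -r "extensibility@$NAMENODE:$CONF_DIR" "extensibility@$1:$CONF_DIR"
}

for vworker in vworker1 vworker2; do
    copy_conf "$vworker"
done
```

The actual script derives the vworker list from the Aster cluster configuration and uses the IDENTITYFILE set up in step 1 for authentication.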