Executing the Teradata Aster Spark Queen Installation/Configuration Script - Aster Analytics

Teradata Aster® Spark Connector User Guide

Product
Aster Analytics
Release Number
7.00.00.01
Published
May 2017
Language
English (United States)
Last Update
2018-04-13
dita:mapPath
dbt1482959363906.ditamap
dita:ditavalPath
Generic_no_ie_no_tempfilter.ditaval
dita:id
dbt1482959363906
lifecycle
previous
Product Category
Software

The configureAsterSpark script generated the installation/configuration script AsterSpark_queen_host_name_queen.sh, which installs and configures the files that allow Aster Database nodes to use the RunOnSpark function to query the Hadoop/Spark cluster.

The AsterSpark_queen_host_name_queen.sh script is idempotent: rerunning it does not repeat completed work. If an item to be created or copied already exists, the script reports that fact and continues.
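The idempotent behavior described above follows a check-then-act pattern. This is a minimal sketch of that pattern, not the generated script itself; the function name and paths are illustrative:

```shell
#!/bin/sh
# Sketch of an idempotent install step: if the target already exists,
# report it and continue instead of redoing the work.
install_file() {
    src="$1"; dest="$2"
    if [ -e "$dest" ]; then
        echo "exists: $dest (skipping)"
    else
        cp "$src" "$dest" && echo "installed: $dest"
    fi
}

# Running the same step twice performs the work only once.
tmpdir=$(mktemp -d)
echo "demo" > "$tmpdir/source.conf"
install_file "$tmpdir/source.conf" "$tmpdir/installed.conf"
install_file "$tmpdir/source.conf" "$tmpdir/installed.conf"
rm -r "$tmpdir"
```

Because every step checks before acting, a partially failed run can simply be rerun after the underlying problem is fixed.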
  1. On the queen node, run the script:
    ./AsterSpark_queen_host_name_queen.sh
    The script:
    1. Gives Aster Database vworkers OpenSSH access to the Hadoop Spark name node.

The vworkers run RunOnSpark queries as the user extensibility. To secure the IDENTITYFILE, the script transfers its ownership to the user extensibility and sets permissions that limit access to the owner.

    2. Copies Hadoop configuration files from the Hadoop Spark name node to the vworkers.
    3. If necessary, copies the Hadoop jar files from the Hadoop Spark name node to the vworkers.
      Hadoop Distribution   Hadoop jar files in SQL-H-supplied directory
      HDP2.3                /home/beehive/partner/hadoop/HDP2.3
      HDP2.1                /home/beehive/partner/hadoop/HDP2.1
      HDP1.3.2              /home/beehive/partner/hadoop/HDP1.3.2

      This step is unnecessary for HDP Hadoop distributions supported by Aster SQL-H, which provides the required Hadoop jar files.

    4. Copies Spark client files from the Hadoop Spark name node to the vworkers.
    5. Copies Spark client configuration from the Hadoop Spark name node to the vworkers.
    6. Creates four spark.config identifiers.

      Hortonworks and Cloudera Spark can use all four spark.config identifiers in RunOnSpark queries.

    7. Installs RunOnSpark.zip.
    8. Tests the four spark.config identifiers on a schema that it creates and populates, then drops that schema after the test.
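Step 1 above secures the SSH identity file by restricting it to its owner. The permission model can be sketched as follows; the function name is hypothetical and the file path is a temporary stand-in, not the actual IDENTITYFILE location:

```shell
#!/bin/sh
# Sketch of the permission model applied to the IDENTITYFILE:
# mode 600 allows owner read/write and denies group and other access,
# which is also what OpenSSH requires before it will use a private key.
secure_identity_file() {
    chmod 600 "$1"
}

keyfile=$(mktemp)               # stand-in for the real IDENTITYFILE
secure_identity_file "$keyfile"
stat -c '%a' "$keyfile"         # prints the octal mode on GNU stat
rm -f "$keyfile"
```

In the generated script the ownership transfer (to the extensibility user) happens as well; that part requires root privileges and is omitted here.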
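Steps 2 through 5 above all follow the same pattern: pull files from the Hadoop Spark name node and push them to every vworker. This dry-run sketch echoes the transfer commands rather than executing them; the host names, user, and directory are assumptions for illustration, not values from the generated script:

```shell
#!/bin/sh
# Dry-run sketch of the fan-out copy from the name node to the vworkers.
NAMENODE=namenode.example.com    # hypothetical Hadoop Spark name node
CONF_DIR=/etc/hadoop/conf        # hypothetical client-configuration directory

# Echo instead of executing, so the command flow is visible without a cluster.
copy_conf() {
    echo scp -r "extensibility@$NAMENODE:$CONF_DIR" "extensibility@$1:$CONF_DIR"
}

for vworker in vworker1 vworker2; do
    copy_conf "$vworker"
done
```

The actual script derives the vworker list from the Aster cluster configuration and uses the IDENTITYFILE set up in step 1 for authentication.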