The configureAsterSpark script generated the installation/configuration script AsterSpark_queen_host_name_queen.sh, which installs and configures the files that allow Aster Database nodes to use the RunOnSpark function to query the Hadoop/Spark cluster.
The AsterSpark_queen_host_name_queen.sh script is idempotent: rerunning it does not duplicate previous work. If an item to create or copy already exists, the script reports this fact and continues.
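The idempotent behavior described above can be illustrated with a minimal shell sketch. This is not the generated script's actual code. The function name copy_if_missing, the file names, and the messages are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the generated script):
# copy a file only when the destination does not exist yet.
copy_if_missing() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        # Report the existing item and continue without recopying.
        echo "exists, skipping: $dest"
        return 0
    fi
    cp "$src" "$dest" && echo "copied: $src -> $dest"
}

# Demonstration in a throwaway directory: the second call changes nothing.
workdir=$(mktemp -d)
echo "sample config" > "$workdir/source.conf"
first_run=$(copy_if_missing "$workdir/source.conf" "$workdir/target.conf")
second_run=$(copy_if_missing "$workdir/source.conf" "$workdir/target.conf")
echo "$first_run"
echo "$second_run"
```

Checking for the destination before acting is what makes a rerun safe: each step either does its work once or reports that the work is already done.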
On the queen node, run the script:
./AsterSpark_queen_host_name_queen.sh
The script:
- Gives Aster Database vworkers OpenSSH access to the Hadoop Spark name node.
The vworkers run RunOnSpark queries as the user extensibility. To secure the IDENTITYFILE, the script transfers its ownership to the user extensibility. The permission settings of the IDENTITYFILE limit access to its owner.
- Copies Hadoop configuration files from the Hadoop Spark name node to the vworkers.
- If necessary, copies the Hadoop jar files from the Hadoop Spark name node to the vworkers.
This step is unnecessary for HDP Hadoop distros supported by Aster SQL-H, which provides the required Hadoop jar files.

Hadoop Distribution    Hadoop jar files in sqlh-supplied directory
HDP2.3                 /home/beehive/partner/hadoop/HDP2.3
HDP2.1                 /home/beehive/partner/hadoop/HDP2.1
HDP1.3.2               /home/beehive/partner/hadoop/HDP1.3.2
- Copies Spark client files from the Hadoop Spark name node to the vworkers.
- Copies Spark client configuration from the Hadoop Spark name node to the vworkers.
- Creates four spark.config identifiers.
Hortonworks and Cloudera Spark can use all four spark.config identifiers in RunOnSpark queries.
- Installs RunOnSpark.zip.
- Tests the four spark.config identifiers on a schema that it creates and populates, then drops that schema after the test.
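The first step in the list secures the IDENTITYFILE by transferring its ownership and limiting access to its owner. A minimal sketch of that kind of operation follows. The function name secure_identity_file is hypothetical, mode 600 is an assumption (the source says only that access is limited to the owner), and the demonstration uses the current user and a temporary file in place of the user extensibility and the real IDENTITYFILE.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the generated script):
# give a key file to one owner and close it to everyone else.
secure_identity_file() {
    key_file=$1
    owner=$2
    chown "$owner" "$key_file" || return 1
    # Assumption: mode 600, read/write for the owner and no access for others.
    chmod 600 "$key_file"
}

# Demonstration on a throwaway file. The current user stands in for the
# user extensibility, because changing to another owner requires root.
demo_key=$(mktemp)
chmod 644 "$demo_key"    # start from a deliberately loose permission
secure_identity_file "$demo_key" "$(id -u)"
demo_mode=$(ls -l "$demo_key" | cut -c1-10)
echo "$demo_mode"
```

OpenSSH clients generally refuse a private key file that other users can read, which is why ownership and permissions are set together.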