Amazon EMR provides bootstrap actions, which you can use to install custom software on cluster instances. To create a bootstrap action, you supply a bootstrap script and the input that the script requires. The script must be stored at a URI that is accessible from the EMR cluster. For compatibility information on QueryGrid components, see the Teradata QueryGrid Compatibility Matrix.
The required script, TDQG_DEPLOYMENT.sh, is packaged in the node package tdqg-node-version.tar.gz.
- You have the privileges required to provision the EMR cluster, and the scripts are stored at a location accessible from the EMR cluster.
- The cURL tool is installed on all nodes where you intend to install QueryGrid.
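As a quick way to confirm the cURL prerequisite on a node, a check along the following lines could be run before installing. This is only a sketch: `require_tool` is a hypothetical helper name introduced here for illustration and is not part of the QueryGrid package.

```shell
#!/bin/sh
# require_tool is a hypothetical helper (not part of QueryGrid).
# It reports whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the cURL prerequisite; print a reminder instead of aborting.
require_tool curl || echo "Install cURL on this node before installing QueryGrid."
```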
Bootstrap actions:
- can only be provided during cluster provisioning
- cannot be modified after the cluster is provisioned
- are always persisted once created, so all future EMR nodes run the bootstrap actions
- Add a system to the Teradata QueryGrid service and download the tdqg-node.json token file that was generated by the QueryGrid Manager. For information about downloading tdqg-node.json, see Adding Nodes Manually.
- Do one of the following:
Option: Install Teradata QueryGrid on AWS EMR
- Download the node package.
For more information, see Downloading Required Packages.
- Unzip the package:
tar -xvzf tdqg-node-version.tar.gz
The TDQG_DEPLOYMENT.sh script is available in the qgdeployment/emr directory of the extracted package.
- Upload the QueryGrid deployment script to AWS S3 (or to any path accessible from the EMR cluster).
- [Optional] To add the deployment script as a custom bootstrap action while creating the EMR cluster, select Advanced Options.
- At General Cluster Settings, enter the cluster name and enable logging.
- Go to Configure. and select
- Provide the following parameters in the Add Bootstrap Action window:
- Name: Enter a name for the bootstrap action, such as QueryGrid Deployment
- Script location: Path to the TDQG_DEPLOYMENT.sh file that you uploaded in the earlier step. For example:
s3://path_to_s3_folder/TDQG_DEPLOYMENT.sh
- Optional arguments: Can be provided as in the following example:
--tdqg_node_json_file s3://<path_to_s3_folder>/tdqg-node.json
or as plain text, as in the following example:
--tdqg_node_json_file <tdqg_node_json as text>
EMR does not allow double quotes in text input. If you provide the file content as text, escape the double quotes. Because bootstrap actions cannot be updated after the cluster is provisioned, Teradata recommends providing the file as a path instead of text.
Example using JSON in plain text:
--tdqg_node_json_file "{\"systemId\":\"c2f3d9e2-0bb1-4707-aa82-847a5ca94735\",\"manager" ...igId\":\"d6613e25-3c9d-479a-b8a7-57aae994c826\"}"
- Select Add and finish the wizard.
For more information, see https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-bootstrap.html
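If the token file content must be passed as text, the double quotes can be escaped with a small helper before the value is pasted into the Optional arguments field. The following is a minimal sketch, not a Teradata-provided tool: `escape_double_quotes` is a hypothetical helper, the JSON value is a made-up placeholder, and the sketch assumes the content contains no backslashes or line breaks that would need separate handling.

```shell
#!/bin/sh
# Hypothetical helper: prefix every double quote with a backslash.
escape_double_quotes() {
    printf '%s' "$1" | sed 's/"/\\"/g'
}

# Placeholder content for illustration only; a real tdqg-node.json has more fields.
sample_json='{"systemId":"example-id"}'

escaped=$(escape_double_quotes "$sample_json")

# Print the argument in the same form as the plain-text example above.
printf '%s "%s"\n' '--tdqg_node_json_file' "$escaped"
```

Providing the file as a path avoids this escaping step entirely, which is one reason the path form is the recommended one.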
Option: Install QueryGrid on an existing node
Running the bootstrap actions script requires a user with sudo permissions.
- Run TDQG_DEPLOYMENT.sh on each node in the cluster.
Make sure the cluster is not currently active before running the script.
- Type ./TDQG_DEPLOYMENT.sh -tdqg_node_json_file input, where input can be one of the following:
- Path to tdqg_node_json_file (Teradata recommended method)
- Complete tdqg_node_json file content
EMR does not allow double quotes in text input. If you provide the file content as text, escape the double quotes. Because bootstrap actions cannot be updated after the cluster is provisioned, Teradata recommends providing the file as a path instead of text.
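A guarded invocation on an existing node might look like the sketch below, which checks that the script and the token file exist before running anything. The `run_deployment` function and the `DRY_RUN` variable are hypothetical names added for illustration. Running the script through `sudo` is an assumption based on the sudo requirement stated above, and the `-tdqg_node_json_file` option is written as it appears in the preceding step.

```shell
#!/bin/sh
# Hypothetical wrapper around TDQG_DEPLOYMENT.sh (not part of the QueryGrid package).
# Usage: run_deployment <path-to-script> <path-to-tdqg-node.json>
run_deployment() {
    script=$1
    token_file=$2

    [ -f "$script" ] || { echo "script not found: $script" >&2; return 1; }
    [ -f "$token_file" ] || { echo "token file not found: $token_file" >&2; return 1; }

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Show the command without executing it.
        echo "sudo $script -tdqg_node_json_file $token_file"
    else
        # Assumption: invoke through sudo, since the script needs sudo permissions.
        sudo "$script" -tdqg_node_json_file "$token_file"
    fi
}
```

For example, `DRY_RUN=1 run_deployment ./TDQG_DEPLOYMENT.sh ./tdqg-node.json` prints the command that would be run on the node without executing it.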
Option: Install QueryGrid on a new node
The bootstrap action on a new node depends on how you ran the TDQG_DEPLOYMENT.sh script when provisioning the EMR cluster.
- If you ran the script as a custom bootstrap action, the script automatically runs on the new node.
- If you did not run the script as a custom bootstrap action, run the script on the new node as you would when installing QueryGrid on an existing node.