Automatically Deploy QueryGrid on Google Cloud Dataproc

QueryGrid™ Installation and User Guide - 3.06

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Teradata QueryGrid
Release Number: 3.06
Published: December 2024
Product Category: Analytical Ecosystem

Dataproc provides initialization actions that can be used to install custom software on cluster instances. To create an initialization action, you need to provide a bootstrap script. The script must be stored in Google Cloud Storage at a URI that is accessible from the Dataproc cluster. For compatibility information on QueryGrid components, see the QueryGrid Compatibility Matrix.

The required script, TDQG_DEPLOYMENT.sh, is packaged in the node package tdqg-node-version.tar.gz.

This procedure assumes the following prerequisites:
  • You have the required privileges to provision the Dataproc cluster and access scripts stored on Google Cloud Storage.
  • The cURL tool is installed on all nodes where you intend to install QueryGrid.
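
The cURL prerequisite can be checked on a node before you start. The following is a minimal sketch, not part of the QueryGrid package; the helper name require_tool is hypothetical.

```shell
#!/bin/sh
# Report whether a required command-line tool is on the PATH.
# Returns 0 when the tool is found and 1 when it is missing.
# (require_tool is a hypothetical helper, not a QueryGrid command.)
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing" >&2
        return 1
    fi
}

# QueryGrid deployment expects cURL on every node.
require_tool curl || echo "Install cURL before running TDQG_DEPLOYMENT.sh" >&2
```

Run the check on each node where you intend to install QueryGrid.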
Note the following about initialization actions:
  • They can be provided only during cluster provisioning.
  • They cannot be modified after the cluster is provisioned.
  • They persist once created: every node added to the Dataproc cluster later also runs them.
  1. Add a system and download the tdqg-node.json token file that was generated by the QueryGrid Manager.
    For information about downloading tdqg-node.json, see Adding Nodes Manually.
  2. Do one of the following:
    Option: Install QueryGrid™ on Google Cloud Dataproc
    1. Download the node package.

      For more information, see Downloading Required Packages.

    2. Extract the package:

      tar -xvzf tdqg-node-version.tar.gz

      The TDQG_DEPLOYMENT.sh script is located in the qgdeployment/dataproc directory.

    3. Upload the QueryGrid deployment script to Google Cloud Storage.
    4. In the Dataproc Create a Cluster screen, do the following:
      • At Initialization Actions, provide the path to the deployment script.
      • At Metadata, use tdqg_node_json as the key and the contents of the tdqg-node.json file as the value.
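
The upload and cluster-creation steps can also be approximated from the command line. The sketch below assumes the gcloud CLI is installed and authenticated. The bucket, cluster name, and region are hypothetical placeholders, and the script path should be adjusted to wherever the package was extracted. The --initialization-actions and --metadata flags belong to gcloud dataproc clusters create. The ^#^ prefix is gcloud's alternate-delimiter syntax, used here on the assumption that the token contains no "#" character. By default the script only prints the commands.

```shell
#!/bin/sh
# Hypothetical values: replace with your own bucket, cluster, and region.
BUCKET="gs://example-bucket/querygrid"
CLUSTER="example-cluster"
REGION="us-central1"
TOKEN_FILE="tdqg-node.json"
SCRIPT="qgdeployment/dataproc/TDQG_DEPLOYMENT.sh"

# Build the --metadata value. The leading ^#^ changes the gcloud list
# delimiter from "," to "#" so commas inside the JSON are not split.
# Assumption: the token file contains no "#" character.
metadata_arg() {
    printf '^#^tdqg_node_json=%s' "$(cat "$1")"
}

# With DRY_RUN=1 (the default in this sketch) each command is printed
# instead of executed. Set DRY_RUN=0 to run the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

if [ -f "$TOKEN_FILE" ]; then
    # Upload the deployment script to Google Cloud Storage.
    run gcloud storage cp "$SCRIPT" "$BUCKET/TDQG_DEPLOYMENT.sh"
    # Create the cluster with the script as an initialization action
    # and the token contents under the tdqg_node_json metadata key.
    run gcloud dataproc clusters create "$CLUSTER" \
        --region="$REGION" \
        --initialization-actions="$BUCKET/TDQG_DEPLOYMENT.sh" \
        --metadata="$(metadata_arg "$TOKEN_FILE")"
else
    echo "Token file $TOKEN_FILE not found" >&2
fi
```

Review the printed commands before setting DRY_RUN=0, and add any other cluster options your environment requires.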
    Option: Install QueryGrid on an existing node
    Running the initialization actions script requires a user with sudo permissions.
    1. On each node in the cluster, run the following command:

      ./TDQG_DEPLOYMENT.sh --tdqg_node_json_file 'input'

      Where input can be one of the following:
      • (Recommended) The path to the tdqg-node.json file.
      • The contents of the tdqg-node.json file.
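
A small wrapper can confirm that the token file is readable before the deployment script runs on a node. This is a minimal sketch: the helper name install_cmd and the token path are hypothetical, and prefixing the command with sudo is an assumption based on the sudo requirement stated above. The helper prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Print the command that installs QueryGrid on the current node.
# $1 = path to the token file downloaded from QueryGrid Manager.
# (install_cmd is a hypothetical helper, not a QueryGrid command.)
install_cmd() {
    if [ ! -r "$1" ]; then
        echo "Cannot read token file: $1" >&2
        return 1
    fi
    # Passing the file path is the recommended form of the argument.
    # Assumption: the script is invoked through sudo.
    echo "sudo ./TDQG_DEPLOYMENT.sh --tdqg_node_json_file '$1'"
}

# Example with a hypothetical path: review the output, then run it.
install_cmd ./tdqg-node.json || echo "Download tdqg-node.json first" >&2
```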
    Option: Install QueryGrid on a new node
    The initialization action on a new node depends on how you ran the TDQG_DEPLOYMENT.sh script when provisioning the Dataproc cluster.
    • If you ran the script as a Dataproc initialization action, the script runs automatically on the new node.
    • If you did not run the script as an initialization action, run the script on the new node as described for installing QueryGrid on an existing node.