The supported client for interacting with the Spark SQL initiator is the Scala Read-Eval-Print Loop (REPL), referred to as the spark-shell. To use the Spark SQL initiator, spark-shell must be started with the following JAR file:
- spark-loaderfactory
To start spark-shell with the JAR file:
- Log on to the node where you want to start spark-shell.
- Locate the connector path: /opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/
- Add the JAR file with the --jars option.
- Start the spark-shell.
The following is an example command for starting spark-shell:
spark-shell --jars /opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/spark-loaderfactory-<version>.jar --master yarn
When using CDH clusters, use the spark2-shell command name instead of spark-shell.
When using a cluster that has Scala 2.12, such as Dataproc 1.5 or later and EMR 6.12, use spark-loaderfactory-scala212 as shown in the following example:
spark-shell --jars /opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/spark-loaderfactory-scala212-<version>.jar --master yarn
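The three variants above (default JAR, Scala 2.12 JAR, and the spark2-shell command name on CDH) can be assembled by a small helper. The following is a minimal sketch, not part of the product: the function name build_spark_shell_cmd and its arguments are hypothetical, and it assumes the version string in the directory name is the same one used in the JAR file name, as the examples suggest. It only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Illustrative helper (hypothetical name and arguments).
# Prints a spark-shell command line for the connector JAR.
#   $1 = connector version, as it appears in the directory and JAR file name
#   $2 = "scala212" for Scala 2.12 clusters; empty or omitted for the default JAR
#   $3 = command name, for example spark2-shell on CDH (defaults to spark-shell)
build_spark_shell_cmd() {
    version="$1"
    flavor="${2:-}"
    shell_cmd="${3:-spark-shell}"

    # Connector library directory, as given in the steps above.
    lib_dir="/opt/teradata/tdqg/connector/tdqg-spark-connector/${version}/lib"

    # Assumption: both JAR names end in the same version string as the directory.
    if [ "$flavor" = "scala212" ]; then
        jar="spark-loaderfactory-scala212-${version}.jar"
    else
        jar="spark-loaderfactory-${version}.jar"
    fi

    printf '%s --jars %s/%s --master yarn\n' "$shell_cmd" "$lib_dir" "$jar"
}
```

For example, build_spark_shell_cmd "$TDQG_VERSION" scala212 prints the Scala 2.12 form of the command, where TDQG_VERSION is a hypothetical variable holding the installed connector version. Passing an empty second argument and spark2-shell as the third gives the CDH form.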