The supported client for interacting with the Spark SQL initiator is the Scala Read-Eval-Print Loop (REPL), referred to as spark-shell. To use the Spark SQL initiator, spark-shell must be started with the following JAR files:
- spark-loaderfactory
- log4j-api
- log4j-core
- Log on to the node where you want to start spark-shell.
- Locate the connector JAR files at /opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/.
- Start spark-shell, adding the JAR files with the --jars option.
The following is an example command for starting spark-shell:
spark-shell --jars /opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/spark-loaderfactory-<version>.jar,/opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/log4j-api-2.7.jar,/opt/teradata/tdqg/connector/tdqg-spark-connector/<version>/lib/log4j-core-2.7.jar --master yarn
On CDH clusters, only Spark 2.1 and later is supported; use the spark2-shell command in place of spark-shell.
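After spark-shell starts, you can confirm that the JAR files were registered with the session. The following is a minimal check, run at the scala> prompt, using the SparkContext (sc) that spark-shell creates; the "loaderfactory" filter string is simply taken from the JAR name above:

// List the JARs added to this Spark session and print those whose
// path contains "loaderfactory" (the connector loader JAR).
sc.listJars().filter(_.contains("loaderfactory")).foreach(println)

If the --jars option was applied correctly, the spark-loaderfactory JAR path is printed; an empty result means the JAR files were not added.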