Spark SQL Connector Limitations

Teradata® QueryGrid™ Installation and User Guide

Product
Teradata QueryGrid
Release Number
2.06
Published
September 2018
Language
English (United States)
Last Update
2018-11-26
The following limitations affect use of Spark connectors with Teradata QueryGrid:
  • The Spark connector does not support ACID tables or transactional tables.
  • Transaction semantics between systems are not supported.

    After data has been exported and committed to a remote system, any subsequent errors or aborts on the local system do not roll back the remote request.

  • When using the Explain command with a Spark initiator connector, the remote server does not return query and execution plan data.
  • The default Timestamp precision is nine (9). On Spark-to-Teradata links, Teradata QueryGrid truncates timestamp values that have more than six digits of fractional seconds.
  • Only limited predicate pushdown is available.
  • The Foreign Function Execution (FFE) feature is currently not supported for the Spark SQL target connector.
  • The Spark SQL connector does not support roles because the Spark Thrift Server does not support them.
  • By default, the Spark SQL target connector reports an activity count of 1 after a successful export query, regardless of how many rows were actually exported. Setting the Collect Approximate Activity Count connector property to true returns the number of rows exported, with a slight performance overhead. If there are concurrent inserts on the Spark SQL table, the returned count can be inaccurate, so treat it as an approximation rather than a precise number (see the export example after this list).
  • When starting either the Spark Thrift Server or the Spark Shell for use with the Spark connector, Teradata recommends setting the spark.task.maxFailures property to 1 (see the launch example after this list).
  • The following issues are likely the result of Apache Spark limitations:
    • Spark 2.1 and later: When using the Spark initiator, if the schema of a target table changes after a non-native table representing that target table has been created, the non-native table must be recreated to reflect the schema change.
    • Spark 2.2 and later: When importing data of the DATE type using the Spark target connector, or exporting data of the DATE type using the Spark initiator, the data value from Spark can be incorrect.
    • Spark 2.2 and later: When using the Spark target connector, if a table is created through Hive and contains certain combinations of columns (a complete list of the affected combinations is not available; a typical example is several STRING columns), data inserted into that table from QueryGrid can be incorrect. The issue does not occur when the table is created directly with Spark instead of Hive.
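
To illustrate the activity count behavior described for the Collect Approximate Activity Count property, the following is a minimal sketch of a Teradata-initiated export to a Spark SQL table. The object names (sales_spark, spark_fs, sales_local) and the column list are hypothetical placeholders, and the exact remote-table reference syntax depends on your Teradata Database version and link configuration.

```sql
-- Hypothetical export from a local Teradata table to a remote Spark SQL table
-- over a QueryGrid foreign server named spark_fs (placeholder names throughout).
INSERT INTO sales_spark@spark_fs
SELECT store_id, sale_date, amount
FROM sales_local;

-- With Collect Approximate Activity Count left at its default (false), the
-- activity count reported for this INSERT is 1, regardless of the rows exported.
-- With the property set to true, the reported count reflects the exported rows,
-- but concurrent inserts on the Spark SQL table can make it approximate.
```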
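
For the spark.task.maxFailures recommendation above, a minimal sketch of the launch commands is shown below. It assumes a standard Apache Spark installation referenced by SPARK_HOME; any additional options your site requires are omitted.

```shell
# Start the Spark Thrift Server with task retries disabled, so a failing task
# fails the job immediately instead of being retried.
$SPARK_HOME/sbin/start-thriftserver.sh --conf spark.task.maxFailures=1

# Or start the Spark Shell with the same setting.
$SPARK_HOME/bin/spark-shell --conf spark.task.maxFailures=1
```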