2.06 - Spark SQL Connector and Link Properties - Teradata QueryGrid

Teradata® QueryGrid™ Installation and User Guide

Product
Teradata QueryGrid
Release Number
2.06
Published
September 2018
Language
English (United States)
Last Update
2018-11-26
When you create links and associated properties in the QueryGrid portlet, you are creating Configuration Name Value Pairs (NVPs). NVPs do the following:
  • Specifies the behavior of the target connector component
  • Configures how data is transformed
  • Configures the underlying link data transportation layer
  • Affects how the initiator connector performs

Links are named configurations that include an initiating connector and a target connector. If the same property is set for a link and a connector, the link setting overrides the connector setting.
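
The precedence rule above can be illustrated with a minimal sketch. It assumes properties can be modeled as plain dictionaries; the function name `effective_properties` and the sample values are hypothetical and are not part of QueryGrid.

```python
def effective_properties(connector_props, link_props):
    """Return the settings in effect for a link.

    Illustrative only: when the same property is set on both the
    connector and the link, the link value wins.
    """
    merged = dict(connector_props)  # start from the connector-level settings
    merged.update(link_props)       # link-level settings override them
    return merged


# Hypothetical values: numExecutors is set in both places.
connector = {"readTimeout": 300000, "numExecutors": 2}
link = {"numExecutors": 8}
print(effective_properties(connector, link))
# {'readTimeout': 300000, 'numExecutors': 8}
```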

Properties may be available for initiating connectors only, target connectors only, or both.
Columns: Name, Default, Description, Overridable? (Property Name), Connector Type
16.20+ LOB Support True On Teradata Database versions 16.20 and later, STRING and BINARY columns on Spark SQL are mapped to CLOB and BLOB, respectively, by default. Clear this option to map STRING and BINARY columns to VARCHAR and VARBYTE, respectively.

lobSupport

Target
Auth Password None Password of the user or service account.   Target
Auth User Name None Name of the user or service account.   Target
Collect Approximate Activity Count False Displays the approximate number of rows exported to the target data source.

When set to false, the activity count displays 1. When set to true, an approximate activity count is returned.

collectActivityCount

Target
Compression Codec System Default Compression type to use when exporting to a Spark target table. Valid values are System Default, Deflate, BZip2, GZip, LZ4, and Snappy.

compressionCodec

Target
Conf File Paths /etc/hadoop/conf/, /etc/spark2/conf Paths to core-site.xml, hdfs-site.xml, yarn-site.xml, and hive-site.xml in a comma-separated list.

This is a required setting.

  Target
Connection Evict Frequency 30 minutes Frequency of eviction checks. At each check, connection objects in the pool are closed and removed if their idle time (current time minus time of last use) is greater than the Connection Max Idle Time setting.

If multiple concurrent users are running queries, reduce the time between checks so that idle connections are cleared more frequently.

Valid values are 1–1440 minutes.

  Target
Connection Max Idle Time 86400 seconds The maximum idle time for the connection cache object, after which the object is closed and removed from the cache. Use this property when there are multiple concurrent users and queries running on the system that might lead to starvation of connection objects.

Valid values are 1–86400 seconds.

  Target
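
The eviction check described for Connection Evict Frequency and Connection Max Idle Time can be sketched as follows. This is a simplified model, not the connector's implementation: the pool is assumed to be a dictionary mapping a hypothetical connection ID to its last-use timestamp in seconds, and `evict_idle` is an illustrative name.

```python
import time


def evict_idle(pool, max_idle_seconds, now=None):
    """Return the pool entries that survive one eviction check.

    pool maps a connection ID to the time it was last used (seconds).
    An entry is dropped when idle time (now - last use) exceeds
    max_idle_seconds.
    """
    now = time.time() if now is None else now
    kept = {}
    for conn_id, last_used in pool.items():
        idle = now - last_used
        if idle > max_idle_seconds:
            continue  # a real pool would also close the connection here
        kept[conn_id] = last_used
    return kept


# Hypothetical check at t=100 with a 50-second maximum idle time.
print(evict_idle({"conn-a": 0, "conn-b": 90}, 50, now=100))
# {'conn-b': 90}
```

In this model the check would run once per Connection Evict Frequency interval, so a shorter interval removes idle connections sooner.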
Connection Pool Size 100 Maximum number of connection objects that can be stored in a connection pool. When acquiring a new connection, the connector checks for available space in the pool. If no space is available, the connection fails after 5 minutes. Only one connection pool and one username are allowed per connector configuration.

Valid values are 1–10000.

  Target
Database Name Default Name of the database for the connector, if not provided in the user query.

Maximum name length is 255 characters.

databaseName

Target
Default Binary Size 64000 bytes The default truncation size for the VARBINARY types.

Valid values are 1–2097088000 bytes.

This property applies to Teradata-to-Spark links. The target Spark connector uses it when the initiating Teradata Database does not support BLOB data types with QueryGrid. With BLOB support, the default binary size is not used.

defaultBinarySize

Target
Default String Size 32000 characters The VARCHAR truncation size, expressed as a maximum number of Unicode characters. Teradata QueryGrid truncates data imported from or exported to string columns at the value set in defaultStringSize.

Valid values are 1–1048544000 characters.

This property applies to Teradata-to-Spark links. The target Spark connector uses it when the initiating Teradata Database does not support CLOB data types with QueryGrid. With CLOB support, the default string size is not used.

defaultStringSize

Target
Disable Pushdown False When set to true, disables the pushdown of all query conditions to the target system.

Certain system-level, session-level, and column-level query attributes, such as CASESPECIFIC, can affect character string comparison results. These attributes can cause some queries to return incorrect results due to incorrect row filtering on the target system.

To avoid incorrect results caused by condition pushdown in situations where the settings on the initiating system do not match the settings on the target system, you can disable the pushdown of all conditions to the target system.

If designated as Overridable, this property can be overridden at the session level only from false to true (that is, to disable pushdown). It cannot be changed from true to false.

disablePushdown

Initiator
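
The one-way override rule for Disable Pushdown can be modeled with a short sketch. The function name and arguments are hypothetical. How QueryGrid actually responds to an attempted true-to-false override (for example, whether it reports an error) is not stated here, so this sketch simply ignores such a request.

```python
def resolve_disable_pushdown(configured, session_override=None):
    """Return the effective disablePushdown value for a session.

    Assumption for illustration: a session may switch the setting from
    False to True, but a request to switch True back to False has no
    effect.
    """
    if session_override is None:
        return configured  # no session-level override requested
    # Logical OR allows only the False -> True transition.
    return configured or session_override


print(resolve_disable_pushdown(False, True))   # True
print(resolve_disable_pushdown(True, False))   # True
print(resolve_disable_pushdown(False))         # False
```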
Enable Logging None Runs queries with debugging mode enabled.

Valid values are NONE, WARN, INFO, DEBUG, and VERBOSE.

  Initiator, Target
Hadoop Library Path Default Hadoop library path Required if Hadoop uses a custom installation path instead of the default Hadoop path or if any Hadoop .jar files are saved outside of the default Hadoop library. Enter paths in a comma-separated list. See Configuring the Hive Connector for Use with a Custom Hadoop Library Path or Custom JAR Path.

If no custom information is available, the default Hadoop library path is used.

  Target
Hadoop Properties None Specifies Hadoop environment properties for a user session. Properties are provided in a list. Use = between each property and its value (name=value, name=value, name=value), and a comma as a separator between properties, with or without a space after the comma.

For example:

mapred.job.queue.name=abcdef,mapreduce.task.timeout=3600000,mapreduce.map.speculative=false

If Hadoop Properties is not selected, the default Hadoop environment properties are used.

hadoopProperties

Target
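
The name=value list format described for Hadoop Properties can be checked with a small parser before the value is entered. This is a minimal sketch: `parse_hadoop_properties` is a hypothetical helper, not a QueryGrid API, and it assumes that property values themselves contain no commas.

```python
def parse_hadoop_properties(text):
    """Parse 'name=value,name=value' into a dict.

    Whitespace after a comma is tolerated, as the format allows.
    Raises ValueError for an entry that is not of the form name=value.
    """
    props = {}
    for pair in text.split(","):
        pair = pair.strip()
        if not pair:
            continue  # skip empty entries such as a trailing comma
        name, sep, value = pair.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected name=value, got {pair!r}")
        props[name.strip()] = value.strip()
    return props


example = ("mapred.job.queue.name=abcdef,"
           "mapreduce.task.timeout=3600000, "
           "mapreduce.map.speculative=false")
for name, value in parse_hadoop_properties(example).items():
    print(name, "->", value)
```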
Keytab None Absolute path to the Kerberos keytab file. QueryGrid uses the keytab file for authentication only if a username and password are not provided.   Target
Link Buffer Count 4 Maximum number of write buffers available on a single channel at one time.
Link Buffer Count overrides the default internal fabric property shmDefaultNumMemoryBuffers.

Valid values are 2–16.

linkBufferCount

Initiator, Target
Link Buffer Size 1048576 Maximum size of the write buffers to allocate for row handling and message exchange.

Valid values are 73728–10240000 bytes.

linkBufferSize

Initiator, Target
Link Handshake Timeout 30000 Handshake and ACK timeout in milliseconds for the shared memory channel setup.

Valid values are 5000–300000.

  Initiator, Target
Link Heartbeat Interval 60000 Maximum interval in milliseconds for the heartbeat signal on a channel between the connector and the fabric instance, used for health check status. Tunable for diagnostic purposes only.
This interval should be greater than Link Handshake Timeout.

Valid values are 5000–300000.

  Initiator, Target
Number Executors 2 Unit of parallelism when data is exported or imported into Spark SQL.

numExecutors

Initiator, Target
Port 10016 Valid values for the Spark Connector are 1026–65535.   Target
Read Timeout 300000 Number of milliseconds to wait to read between data packets when importing data messages.

Valid values are 1000 milliseconds – 604,799,000 milliseconds. The maximum value is 1000 milliseconds less than 7 days, where 7 days * 24 hours * 60 minutes * 60 seconds * 1000 milliseconds = 604,800,000 milliseconds.

readTimeout

Initiator, Target
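
The arithmetic behind the timeout limits can be verified in a few lines. The constant and function names below are illustrative only. The 1000 and 604,799,000 millisecond bounds come from the valid values listed for Read Timeout.

```python
MS_PER_SECOND = 1000

# 7 days * 24 hours * 60 minutes * 60 seconds * 1000 milliseconds
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * MS_PER_SECOND

MIN_TIMEOUT_MS = 1000
# The documented maximum is one second short of seven days.
MAX_TIMEOUT_MS = SEVEN_DAYS_MS - MS_PER_SECOND


def is_valid_timeout(ms):
    """Return True if ms lies within the documented timeout range."""
    return MIN_TIMEOUT_MS <= ms <= MAX_TIMEOUT_MS


print(SEVEN_DAYS_MS)             # 604800000
print(MAX_TIMEOUT_MS)            # 604799000
print(is_valid_timeout(300000))  # True
```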
Response Timeout 1800000 Number of milliseconds to wait for the final data exec response when all the data has been transferred.

Valid values are 1000 milliseconds – 604,799,000 milliseconds. The maximum value is 1000 milliseconds less than 7 days, where 7 days * 24 hours * 60 minutes * 60 seconds * 1000 milliseconds = 604,800,000 milliseconds.

responseTimeout

Initiator, Target
Security None Overall security mechanism for the cluster.   Target
Server None Used to connect to the target database as part of the JDBC connection string. This is the IP address or DNS name of the target host.   Target
Temporary Database Name Default Temporary database name for storing temporary tables and views.

tempDbName

Target
Username Hive Name of the user. A username added for a connector or target connector link must be included in Allowed OS users.

Maximum length is 255 characters.

This NVP is saved in the Teradata QueryGrid Manager configuration and is required when the initiator does not support a mechanism to provide user credentials. The username is also used for connectivity diagnostic checks.

  Target
Write Timeout 300000 Number of milliseconds to wait to write between data packets when exporting data messages.

Valid values are 1000 milliseconds – 604,799,000 milliseconds. The maximum value is 1000 milliseconds less than 7 days, where 7 days * 24 hours * 60 minutes * 60 seconds * 1000 milliseconds = 604,800,000 milliseconds.

writeTimeout

Initiator, Target
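
As a convenience, the numeric ranges listed above can be collected into a lookup and used to check a proposed set of values before entering them. This is a sketch only: the ranges are copied from the "Valid values" lines for properties that have a property name, and `out_of_range` is a hypothetical helper rather than part of QueryGrid.

```python
# (low, high) bounds copied from the "Valid values" lines above.
VALID_RANGES = {
    "linkBufferCount": (2, 16),
    "linkBufferSize": (73728, 10240000),
    "defaultBinarySize": (1, 2097088000),
    "defaultStringSize": (1, 1048544000),
    "readTimeout": (1000, 604799000),
    "responseTimeout": (1000, 604799000),
    "writeTimeout": (1000, 604799000),
}


def out_of_range(settings):
    """Return {name: (value, low, high)} for each value outside its range."""
    problems = {}
    for name, value in settings.items():
        if name not in VALID_RANGES:
            continue  # not a range-checked property in this sketch
        low, high = VALID_RANGES[name]
        if not low <= value <= high:
            problems[name] = (value, low, high)
    return problems


# Hypothetical proposal: linkBufferCount exceeds its maximum of 16.
print(out_of_range({"linkBufferCount": 32, "readTimeout": 300000}))
# {'linkBufferCount': (32, 2, 16)}
```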