Hive Target Connector Security Guidelines - Teradata QueryGrid

QueryGrid™ Installation and User Guide - 3.06

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Teradata QueryGrid
Release Number: 3.06
Published: December 2024
Last Edition: 2024-12-07
Product Category: Analytical Ecosystem

General

When setting parameters for Hive target connectors in link pairings, confirm that the Conf File Paths property contains the correct pathname. QueryGrid depends heavily on this setting when processing data transfers. See Hive Connector and Link Properties.

Kerberos

QueryGrid can use Kerberos authentication with a Hive target connector. There are two forms of authentication with Kerberos:

Username/Password
The Hive target connector authenticates the username and password against Kerberos before sending the query to the data source.
Username/Keytab
Hive is configured to enable Kerberos Keytab authentication.

When using a Hive target connector in an NVP link pairing to access a Kerberized Hadoop cluster:
  • Select Kerberos for the Authentication Mechanism property.
  • Set to HS2Only if only HiveServer2 is secured (for example, LDAP/CUSTOM/PAM). This is not a common setup.
  • Verify the QueryGrid tdqg user has permission to run kinit. See Verifying Permission to Run kinit (Hive).

When using Kerberos on CDH, you must set the Hive Kerberos Principal NVP to the Kerberos principal for HiveServer2, not the Kerberos principal of the user connecting to HiveServer2. The setting must be in the primaryname/instancename@realmname format.
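
The primaryname/instancename@realmname requirement can be checked before the value is entered in the link properties. The following is a minimal sketch, not part of QueryGrid: the function name, the simplified pattern, and the example principals are assumptions made for illustration.

```python
import re

# Assumed, simplified pattern for primaryname/instancename@realmname:
# three non-empty parts, none containing '/', '@', or whitespace.
_PRINCIPAL_RE = re.compile(
    r"(?P<primary>[^/@\s]+)/(?P<instance>[^/@\s]+)@(?P<realm>[^/@\s]+)"
)


def parse_service_principal(principal: str) -> dict:
    """Split a principal into its three parts, or raise ValueError."""
    match = _PRINCIPAL_RE.fullmatch(principal)
    if match is None:
        raise ValueError(
            f"expected primaryname/instancename@realmname, got {principal!r}"
        )
    return match.groupdict()


if __name__ == "__main__":
    # Hypothetical HiveServer2 service principal.
    print(parse_service_principal("hive/hs2.example.com@EXAMPLE.COM"))
    # A user principal without an instance part is rejected.
    try:
        parse_service_principal("alice@EXAMPLE.COM")
    except ValueError as err:
        print(err)
```

A check like this only confirms the shape of the value. It cannot tell whether the principal is actually the one HiveServer2 runs under.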

Trusted Kerberos

The Hive connector supports the authentication mechanism value Trusted Kerberos, which allows impersonation of the QueryGrid end-users with a single Kerberos service account. If the Kerberos service account has Hadoop-level privileges to impersonate a requested user, then QueryGrid queries run in the session as that requested user.

This feature is supported only with Kerberized Hadoop clusters and QueryGrid 2.13 and later. The Kerberos setup is the same as the Kerberos authentication setup, but with additional settings.

The impersonated user is passed to the target system using the JDBC property hive.server2.proxy.user. The impersonated user must be in the Allowed OS users list to run queries on the target system. There are two ways to define the impersonated user:
Option | Description
User on the initiating system (Default) | The Teradata system user is passed to the target system as the impersonated user, with the username in all lowercase.
User Mapping | User mapping can define the impersonated user with a case-sensitive username on the target system for any user on the initiating system. A typical use case is when the impersonated username is not all lowercase.
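
The two options can be pictured as a lookup with a lowercase fallback. This is an illustrative sketch only: the function name, the mapping structure, and the usernames are hypothetical, and it does not show how QueryGrid stores or applies user mappings internally.

```python
from typing import Mapping, Optional

# JDBC property named in the text above.
PROXY_USER_PROPERTY = "hive.server2.proxy.user"


def resolve_impersonated_user(
    initiator_user: str,
    user_mapping: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the username to impersonate on the target system.

    If a mapping entry exists for the initiating user, its value is used
    exactly as written (case preserved). Otherwise the initiating username
    is passed in all lowercase, which mirrors the default option.
    """
    if user_mapping and initiator_user in user_mapping:
        return user_mapping[initiator_user]
    return initiator_user.lower()


if __name__ == "__main__":
    # Hypothetical mapping from initiating-system users to target users.
    mapping = {"ETL_USER": "EtlSvc"}
    for user in ("ETL_USER", "ANALYST1"):
        proxy_user = resolve_impersonated_user(user, mapping)
        print({PROXY_USER_PROPERTY: proxy_user})
```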

When configuring user impersonation on HiveServer2, the service user must be able to impersonate the requested user. This requires configuring the properties hadoop.proxyuser.kerberos_service_user.groups and hadoop.proxyuser.kerberos_service_user.hosts in the core-site.xml file, where kerberos_service_user is the name of the Kerberos service account. QueryGrid queries are triggered by JDBC from the driver node; therefore, enter the IP addresses of the driver nodes in the hadoop.proxyuser.kerberos_service_user.hosts property.

Avoid using wildcards (*) as the value for the hadoop.proxyuser.kerberos_service_user.hosts and hadoop.proxyuser.kerberos_service_user.groups properties. Instead, restrict which user groups can be impersonated and from which hosts.
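
As a rough illustration of the two proxy-user properties, the sketch below builds the corresponding core-site.xml property elements and refuses wildcard values. The helper name, the service account name qgservice, the IP addresses, and the group name are all hypothetical; the comma-separated value format follows the usual Hadoop convention for these properties.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence


def proxyuser_properties(
    service_user: str,
    hosts: Sequence[str],
    groups: Sequence[str],
) -> List[ET.Element]:
    """Build <property> elements for the hosts and groups proxy-user settings."""
    settings = (("hosts", hosts), ("groups", groups))
    for suffix, values in settings:
        if not values:
            raise ValueError(f"{suffix} must list at least one value")
        if any(value.strip() == "*" for value in values):
            # Reflects the guideline above: do not allow wildcards.
            raise ValueError(f"wildcard not allowed in {suffix}")

    elements = []
    for suffix, values in settings:
        prop = ET.Element("property")
        name = f"hadoop.proxyuser.{service_user}.{suffix}"
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = ",".join(values)
        elements.append(prop)
    return elements


if __name__ == "__main__":
    # Hypothetical service account, driver node IPs, and user group.
    props = proxyuser_properties(
        "qgservice", ["10.0.0.11", "10.0.0.12"], ["analysts"]
    )
    for prop in props:
        print(ET.tostring(prop, encoding="unicode"))
```

The printed elements would still need to be merged into the cluster's core-site.xml through whatever management tooling the Hadoop distribution provides.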

Kerberos SSO

QueryGrid supports Kerberos single sign-on (SSO) through the authentication mechanism Kerberos SSO in a Teradata-to-Hive link. You must complete the Kerberos setup on the initiator and target systems before configuring QueryGrid. The Kerberos token is carried from the initiator to the target database when establishing a connection. Any Kerberos-related properties such as username, password/keytab, and realm are ignored if provided in the connector properties or authorization object.

The service user logon for the initiating Teradata system is used for running QueryGrid queries on Hive and therefore must be added to the Allowed OS users list.

See Kerberos Single Sign-On for more information.

Knox (HDP, CDP, and Dataproc Only)

When enabled in the Hive connector properties, Knox is a security option that serves as a gateway service between the Hive connector and HiveServer2. The Hive connector connects to the Knox service instead of directly to HiveServer2. Requests from the Hive connector are sent to the Knox service, and Knox redirects each request to HiveServer2. Refer to the Knox, Hortonworks Data Platform (HDP), CDP, or Dataproc documentation for how to configure Knox.

Knox has a limitation when SSL is enabled and Knox connects to HiveServer2 using SPNEGO authentication. In this scenario, Knox does not work with Hive.

When a Hortonworks Hadoop database is protected by Knox and you want the QueryGrid Hive connector to connect through Knox, make sure the following NVP link properties contain the correct values:

Setting | Description
Knox Connection Password | Password for the Knox connection. Only required when using Knox.
Knox Connection Username | Username for the Knox connection. Only required when using Knox.
Knox Context Path | Knox context path for HS2, for example, gateway/mycluster/hive. Only required when using Knox.
Knox Gateway Host | Knox gateway host. The use of this property indicates that Knox is being used.
Knox Gateway Port | Knox gateway port number. Valid port number values are 1024–65535.
Knox Trust Store Path | Knox trust store path. Only required when using Knox.
Knox Trust Store Password | Knox trust store password. Only required when using Knox.

For more information, see Hive Connector and Link Properties.
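
The rules in the Knox settings table can be expressed as a small pre-check on a set of property values. This is a sketch under assumptions: the function name and the dictionary representation are hypothetical and are not a QueryGrid API, and the port is validated only when a value is supplied.

```python
from typing import Dict, List

# Settings the table marks as "Only required when using Knox".
REQUIRED_WITH_KNOX = (
    "Knox Connection Username",
    "Knox Connection Password",
    "Knox Context Path",
    "Knox Trust Store Path",
    "Knox Trust Store Password",
)


def check_knox_properties(props: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    # Per the table, setting Knox Gateway Host indicates Knox is in use.
    if not props.get("Knox Gateway Host"):
        return []

    problems = [
        f"{name} is required when using Knox"
        for name in REQUIRED_WITH_KNOX
        if not props.get(name)
    ]

    port = str(props.get("Knox Gateway Port", ""))
    if port and not (port.isdigit() and 1024 <= int(port) <= 65535):
        problems.append("Knox Gateway Port must be in the range 1024-65535")
    return problems


if __name__ == "__main__":
    # Hypothetical values; the context path example comes from the table.
    example = {
        "Knox Gateway Host": "knox.example.com",
        "Knox Gateway Port": "443",
        "Knox Context Path": "gateway/mycluster/hive",
        "Knox Connection Username": "qguser",
    }
    for problem in check_knox_properties(example):
        print(problem)
```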

Ranger

QueryGrid is compatible with Ranger on any Hadoop distribution where Ranger is supported by the vendor.