17.00 - Privileges - Parallel Data Pump

Teradata® Parallel Data Pump Reference

prodname
Parallel Data Pump
vrm_release
17.00
created_date
June 2020
category
Programming Reference
featnum
B035-3021-220K

Teradata TPump users must have privileges on the database containing the log restart table because Teradata TPump orchestrates the creation of macros to use during the task.

Dropping the log table makes it impossible to restart a Teradata TPump job. Dropping the macros or the error table makes it very difficult to restart a Teradata TPump job.

Teradata TPump does not have any special protections on database objects it creates. Therefore, it is the responsibility of Teradata TPump administrators and users to ensure that privileges on databases used by Teradata TPump have been established.

Most of the privileges for Teradata TPump are intuitive. For example:
  • CREATE TABLE is required on the database where the log table is placed.
  • CREATE TABLE is required on the database where the error table is placed.
  • CREATE/DROP MACRO is required on the database where macros are placed.
  • EXECUTE MACRO is required on the database where the macros are placed.

Macros

The use of macros slightly complicates the privileges for Teradata TPump. The remaining privileges necessary to run a Teradata TPump job have two different scenarios:
  • When a Teradata TPump macro is placed in the same database as the table which it affects, required rights are INSERT/UPDATE/DELETE on the table affected, corresponding to the DML executed.
  • When a Teradata TPump macro is placed in a different database from the table it affects, required rights specify that the database where the macro is placed must have INSERT/UPDATE/DELETE, WITH GRANT in the table affected, corresponding to the DML executed. The EXECUTE MACRO must also be located on the database where the macro is placed.

Note that when the Teradata TPump job uses EXEC to directly specify a macro, the privileges scenarios are the same, except that the CREATE/DROP MACRO privilege is not required since the macro exists both before and after the job.

Tables

The corresponding INSERT, UPDATE, or DELETE privilege must exist for each table to be changed by the Teradata TPump task. Multiple tables can be targeted by a single Teradata TPump job.

The BEGIN LOAD command invokes Teradata TPump to execute task processing. Any statement of this task applies each matching imported data record to each of its target table rows having the specified index value. Teradata TPump supports all table types. Unlike MultiLoad, there are no forbidden index types. Thus, the tables may be either empty or populated, as well as being either with or without secondary indices.

All required data is imported; none is obtained from tables already existing in the database. No statement of an IMPORT task may make any reference to a table or row other than the one affected by the statement.

All INSERT statements, when considered in conjunction with each applicable imported record, must explicitly specify values for all columns except those for which a default value (including null) is defined. All UPDATE and DELETE statements, when considered in conjunction with each applicable imported record, must explicitly specify values for all columns of the primary index. In order to fulfill this requirement for UPDATE and DELETE statements, a series of ANDed terms of either form must be supplied:

column_reference = colon_variable_reference

or

column_reference = constant

Teradata TPump does not process UPDATE and DELETE statements that contain ORed terms because Teradata TPump must hash the imported records with a value from the import file (or with a NULL). Any attempt to use an OR with these statements causes Teradata TPump to fail. To work around this, create two separate DML statements and apply them conditionally.

Teradata TPump imposes some restrictions on the updates of an IMPORT task. It rejects updates that try to change the value of the primary index of a row, but accepts even reflexive updates of other columns. A reflexive update of a column computes the new value as the result of an expression that involves the current value of one or more columns.

Teradata TPump processes and validates all statements from the BEGIN LOAD through the END LOAD statements. Teradata TPump control and processing sessions are established and Teradata SQL requests are transmitted to the database. Teradata TPump creates a single error table and a set of macros, one for each DML statement. Nothing protects target tables from concurrent access.

Teradata TPump imports data, evaluating each record according to specified apply conditions. For each satisfied apply condition, a record is sent to the database. If the record causes an error, this sequence number is available in the error table so that the record can be identified.

When the task completes, all locks are released, all macros dropped and, if empty, the error table is dropped. Statistics concerning the outcome of the IMPORT task are reported.

Access logging can cause a severe performance penalty. If all successful table updates are logged, a log entry is made for each operation. The primary index of the access logging table may then create the possibility of row hash conflicts.