PARTITION - Parallel Data Pump

Teradata Parallel Data Pump Reference

Product
Parallel Data Pump
Release Number
15.10
Language
English (United States)
Last Update
2018-10-07
dita:id
B035-3021
lifecycle
previous
Product Category
Teradata Tools and Utilities

PARTITION

Purpose  

The PARTITION command defines a collection of sessions used to transfer SQL requests to Teradata Database. A DML command may name the partition to be used for its requests to the database.

A default session partition may still be created using the SESSIONS and PACK parameters of the BEGIN LOAD command.

This command works in conjunction with a DML parameter, PARTITION, which names the session partition that a DML’s SQL will use. If the DML command does not have a PARTITION parameter, then the default partition created using the SESSIONS and PACK parameters of the BEGIN LOAD command will be used.

Syntax  

where

 

Syntax Element

Description

number

Number of sessions to be logged on for the partition

Teradata TPump logs on and uses the number of sessions specified to communicate requests to Teradata Database.

There is no default value for number; it must be specified. Neither is there a maximum value, except for systemwide session limitations, which vary among machines.

Limiting the number of sessions conserves resources on both the external system and Teradata Database. This conservation is at the expense of a potential decrease in throughput and increase in elapsed time.

DATAENCRYPTION ON/OFF

Keyword to encrypt import data and the request text during the communication between Teradata TPump and Teradata Database for the sessions defined in the PARTITION command

If ON, the encryption will be performed. If OFF, the encryption will not be performed. If DATAENCRYPTION is not specified, the default is OFF when "y" runtime parameter is not specified and DATAENCRYPTION is OFF in the BEGIN LOAD command. If "y" runtime parameter is specified or DATAENCRYPTION is ON in the BEGIN LOAD command, the default is ON.

This option applies to the sessions defined by the PARTITION command. When the session is specified explicitly, the setting overrides the encryption setting by the "y" runtime parameter and by the DATAENCRYPTION option in the BEGIN LOAD command for the sessions defined in the PARTITION command.

PACK

Keyword for the number of statements to pack into a multiplestatement request

Maximum value is 1500.

Packing improves network/channel efficiency by reducing the number of sends and receives between the application and Teradata Database.

PACKMAXIMUM

Keyword requesting Teradata TPump to dynamically determine the maximum possible PACK factor for the current partition

Maximum value is 1500.

Displayed in message UTY6652, the value thus determined should be specifically used on subsequent runs, as the use of PACKMAXIMUM requires iterative interactions with the database during initialization to heuristically determine the maximum possible PACK factor.

partition_name

Name assigned to the partition for reference by one or more subsequent DML commands

A partition name must obey the same rules for its construction as Teradata SQL column names. The name specified may be used in the PARTITION clause of a DML command.

SESSIONS

Keyword for designating the number of sessions for the partition

statements

Number of statements, as a positive integer of up to 1500, to pack into a multiplestatement request

Default value is 20 statements per request.

Note: Under certain conditions, Teradata TPump may determine that the pack factor has been set too high. Teradata TPump then automatically lowers the pack setting to an appropriate value and issues warning message UTY6625, for example:

“UTY6625 WARNING: Packing has been changed to 12 statements per request”, and continues.

Packing improves network/channel efficiency by reducing the number of sends/receives between the application and the database.

The packing factor is validated by sending a fully packed request to Teradata Database using a prepare. This test checks for syntax problems and requests that are excessively large and overwhelm the parser.

To simplify the script development process, Teradata TPump ignores certain errors returned by an overloaded parser, shrinks the request, retries the prepare until it executes successfully and finally, issues a warning noting the revised packing factor size.

When this happens, the Teradata TPump script should be modified to eliminate the warning, which avoids the timeconsuming process of shrinking the request.

Note: A packing failure may occur if the source parcel length does not match the data defined. If this happens, Teradata TPump issues the message:

“UTY2819 WARNING: Packing may fail because input data does not match with the data defined.”

To resolve this problem, increase the packing factor and resubmit the job.

threshold

Minimum number of sessions to be logged on for the partition

When logging on sessions, if system limits are reached above the threshold value, Teradata TPump stops trying to log on, and uses whatever sessions are already logged on.

If the sessions run out before the threshold is reached, Teradata TPump logs off all sessions, waits for the time determined by the SLEEP value (specified in the BEGIN LOAD command), and tries to log on again.

Example  

A sample script that uses partitioning follows:

.LOGTABLE TPLOG01;
.LOGON <TDPID>/<USERID>,<PASSWORD>;
 DROP TABLE TPTBL01;
 DROP TABLE TPTBL02;
 DROP TABLE TPERR01;
 
 CREATE TABLE TPTBL01, FALLBACK(
  C1  CHAR(12) not null,
  C2  CHAR(8) not null)
  PRIMARY INDEX (C1);
 
 CREATE TABLE TPTBL02, FALLBACK(
  C1  CHAR(12),
  C2  CHAR(8),
  C3  CHAR(6))
  UNIQUE PRIMARY INDEX (C1);
 
.BEGIN LOAD
   ERRLIMIT 100 50
   CHECKPOINT 15
   TENACITY 2
   ERRORTABLE TPERR01
   ROBUST off
   serialize on
   ;
 
 .LAYOUT LAY02;
 .FIELD cc1 * CHAR(12) key;
 .FIELD cc2 * CHAR(8);
 .FIELD cc3 * CHAR(6);
 .filler space1 * char(1);
 
.partition part1  pack 10 sessions 10;
.partition part2 sessions 5 1 packmaximum;
 
.DML LABEL LABEL01 partition part1
 DO INSERT FOR MISSING ROWS
 ignore extra update rows
   use(cc1, cc2);
 
 UPDATE TPTBL01
 SET C2 = :CC2
 WHERE C1 = :CC1;
 INSERT TPTBL01 (C1, C2)
 VALUES (:CC1,:CC2);
 
.DML LABEL LABEL02 partition part2
 serializeon( cc1 )
 ignore extra update rows
 DO INSERT FOR MISSING UPDATE ROWS;
 
 UPDATE TPTBL02 SET C2 = :CC2 WHERE C1 = :CC1;
 INSERT TPTBL02 (C1, C2, C3)
          VALUES (:CC1,:CC2,:CC3);
 
.IMPORT INFILE TpumpData001.txt FORMAT TEXT
               LAYOUT LAY02
               APPLY LABEL01
               APPLY LABEL02 where CC2 = '00000001';
 
.END LOAD;
.LOGOFF;