15.10 - System Resource Errors - Parallel Transporter

Teradata Parallel Transporter User Guide

prodname: Parallel Transporter
vrm_release: 15.10
category: User Guide
featnum: B035-2445-035K

System Resource Errors

This section presents common system resource errors.

Error Case 7: Insufficient Semaphores

Teradata PT console message:

Teradata Parallel Transporter Version <version>
Execution Plan generation started
Execution Plan generation successfully completed
Job log: /opt/Teradata/Client/<version>/logs/udd014-18.out
OS_SemInit: semget() failed, System erno: 28 (No space left on device)
1008: Failed to Initialize necessary IPC resources to run this job
1155: Infrastructure for the Parallel Task Manager failed
1006: Failed to set up Parallel Task Manager infrastructure to run job

Cause:

Error 28 (ENOSPC) on semget() indicates that honoring the semget() request would exceed the system limit on the maximum number of semaphores.
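When triaging console output like the message above, it can help to pull out the failing system call and the errno value automatically. The following is a minimal sketch, not part of Teradata PT: the function name `describe_failure` and the regular expression are assumptions based only on the sample messages in this section, and the symbolic names come from the Python `errno` module on the machine running the script, which is assumed to number errors the same way as the machine that ran the job.

```python
import errno
import re

# Matches "<call>(...) failed ... System errno: <n>". The pattern also accepts
# the "erno" spelling that appears in the Error Case 7 sample message.
_FAILURE_RE = re.compile(
    r"(?P<call>\w+)\(.*?\) failed.*?System err?no:\s*(?P<code>\d+)"
)


def describe_failure(line):
    """Return (call, errno number, symbolic name), or None if the line does not match."""
    match = _FAILURE_RE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    # errno.errorcode maps numbers to names for the local platform only.
    name = errno.errorcode.get(code, "UNKNOWN")
    return match.group("call"), code, name


if __name__ == "__main__":
    sample = "OS_SemInit: semget() failed, System erno: 28 (No space left on device)"
    print(describe_failure(sample))
```

Lines without a failed system call, such as "Execution Plan generation started", return None and can be skipped.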

Corrective Actions:

1 Use the ipcs command to check the computer from which the Teradata PT job was launched to see if there are semaphores that have been orphaned.

2 Use the ipcrm command to free any semaphores that are no longer in use.

3 Use the sysdef command to find out the number of semaphores (SEMMNS) defined on the system. Increase the number and then reboot the system.

4 Re-launch the job.
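The first two steps above can be partly scripted. The sketch below is illustrative only: it assumes the Linux-style `ipcs -s` layout (columns key, semid, owner, perms, nsems), the sample text and the `tptuser` account are invented, and the helper names are hypothetical. It only builds `ipcrm -s <semid>` command strings and does not run them, because deciding whether a semaphore set is really orphaned still requires checking that no running job uses it.

```python
def parse_ipcs_semaphores(ipcs_output):
    """Parse `ipcs -s` text into a list of dicts with semid, owner and nsems."""
    entries = []
    for line in ipcs_output.splitlines():
        fields = line.split()
        # Data rows are assumed to look like: key semid owner perms nsems
        if len(fields) == 5 and fields[0].startswith("0x") and fields[1].isdigit():
            entries.append(
                {"semid": int(fields[1]), "owner": fields[2], "nsems": int(fields[4])}
            )
    return entries


def ipcrm_commands(entries, owner):
    """Build (but do not run) one `ipcrm -s <semid>` command per set owned by `owner`."""
    return ["ipcrm -s %d" % e["semid"] for e in entries if e["owner"] == owner]


# Invented sample in the assumed Linux layout; other systems print other columns.
SAMPLE = """\
------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00000000 32768      root       600        1
0x0052e2c1 65537      tptuser    600        17
0x0052e2c2 98306      tptuser    600        17
"""

if __name__ == "__main__":
    sets = parse_ipcs_semaphores(SAMPLE)
    print("semaphores in use:", sum(e["nsems"] for e in sets))
    for command in ipcrm_commands(sets, "tptuser"):
        print(command)
```

Comparing the total number of semaphores in use against SEMMNS shows how close the system is to the limit before and after cleanup.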

Error Case 8: Insufficient Semaphore Undo Structures

Teradata PT console message:

Teradata Parallel Transporter Version <version>
Execution Plan generation started.
Execution Plan generation successfully completed.
Job log: /opt/Teradata/Client/<version>/logs/root-2.out
OS_SemOp: semop() failed. System errno: 28 (No space left on device)
OS_AllocSem: OS_SemOp failed

Cause:

Error 28, ENOSPC, on a semop() means that the system has run out of undo structures for semaphores.

Corrective Actions:

1 Use the sysdef command, or an equivalent, to find the number of semaphore undo structures (SEMMNU) defined on the system.

2 Increase the value of SEMMNU.

3 Reboot the system.

4 Re-launch the job.
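To read SEMMNU and related values out of captured sysdef output, a small parser such as the following may be useful. It is a sketch under an assumption: each tunable is printed as `<value> <description> (<NAME>)`, as in the invented sample below. The real layout varies by operating system and release, so the pattern should be checked against actual output first. The function name `parse_sysdef` is hypothetical.

```python
import re

# Assumed line shape: "<value>  <description> (<NAME>)"
_PARAM_RE = re.compile(r"^\s*(\d+)\s+.*\((\w+)\)\s*$")


def parse_sysdef(text):
    """Return a dict mapping tunable names such as SEMMNU to integer values."""
    params = {}
    for line in text.splitlines():
        match = _PARAM_RE.match(line)
        if match:
            params[match.group(2)] = int(match.group(1))
    return params


# Invented sample values, for illustration only.
SAMPLE = """\
*
* IPC Semaphores
*
    10  semaphore identifiers (SEMMNI)
    60  semaphores in system (SEMMNS)
    30  undo structures in system (SEMMNU)
"""

if __name__ == "__main__":
    limits = parse_sysdef(SAMPLE)
    print("SEMMNU =", limits.get("SEMMNU"))
```

The same parser can be pointed at the shared memory parameters (SHMMIN, SHMMAX, SHMSEG) if they are printed in the same shape.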

Error Case 9: Socket Handle Error

Teradata PT console message:

Teradata Parallel Transporter Version <version>
Execution Plan generation started.
Execution Plan generation successfully completed.
Job id is load_dpforecast-1, running on WUSSL185013-V02
Job log: C:\Program Files\Teradata\Client\<version>\Teradata Parallel Transporter\logs/load_dpforecast-1.out
1405: Error occured while polling for any ready socket, System errno: 10038 
(An operation was attempted on something that is not a socket.) 
PX_Node::Bind() [Node WUSSL185013-V02] - Failed with status 15
1113: Failed to read 8 bytes from socket 3872, System errno: 10054 
(An existing connection was forcibly closed by the remote host.)
1141: Failed to receive config response from the Job Logger
WUSSL185013-V02 - PTM status 15: the Job Logger facility could not be set up

Cause:

On some Windows XP machines, Teradata PT does not inherit the socket handle correctly, which prevents job logging from being set up. This problem has not been found on UNIX or z/OS platforms.

Corrective Action:

Apply the Teradata PT Efix that is available for this problem.

Error Case 10: Insufficient Allocation of Shared Memory

Teradata PT console message:

Teradata Parallel Transporter Version <version>
Execution Plan generation started.
Execution Plan generation successfully completed.
Job log: /opt/Teradata/Client/<version>/tbuild/logs/root-2.out
OS_ShmInit: shmget(1048576) failed, System errno: 22 (Invalid argument)
1008:Failed to Initialize necessary IPC resources to run this job
1155: Infrastructure setup for the Parallel Task Manager failed
1006:Failed to setup Parallel Task Manager Infrastructure to run this job

Cause:

Teradata PT requested 1 MB (1024*1024 bytes) of shared memory, which is the minimum. The operating system returned EINVAL, meaning that the requested size is less than SHMMIN, greater than SHMMAX, or greater than the size of any available segment.
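The three possible causes can be narrowed down once SHMMIN and SHMMAX are known. The following sketch encodes only the comparison described above. The function name and the example limit values are hypothetical, and the third cause (no available segment large enough) cannot be determined from the two limits alone, so it is reported as the remaining possibility.

```python
REQUESTED_BYTES = 1024 * 1024  # the 1 MB request shown in the console message


def explain_shmget_einval(requested, shmmin, shmmax):
    """Return the list of reasons why shmget(requested) could fail with EINVAL."""
    reasons = []
    if requested < shmmin:
        reasons.append("requested size is less than SHMMIN")
    if requested > shmmax:
        reasons.append("requested size is greater than SHMMAX")
    if not reasons:
        # Both limits are satisfied, so only the third cause remains.
        reasons.append("requested size may exceed the size of any available segment")
    return reasons


if __name__ == "__main__":
    # Hypothetical limits: SHMMAX of 512 KB is too small for a 1 MB request.
    print(explain_shmget_einval(REQUESTED_BYTES, shmmin=1, shmmax=512 * 1024))
```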

Corrective Actions:

1 Use the sysdef command, or an equivalent, to find the values of the shared memory parameters SHMMIN, SHMMAX, and SHMSEG defined on the system.

2 Decrease the value of SHMMIN, or increase the values of SHMMAX and SHMSEG, as required to provide adequate shared memory.

3 Reboot the system.

4 Re-launch the job.

Error Case 11: Shared Memory Overflow Due to Excessive Operator Instances

Teradata PT console message:

Teradata Parallel Transporter Version <version>
Execution Plan generation started.
Execution Plan generation successfully completed.
Job log: /opt/Teradata/Client/<version>/tbuild/logs/infomatc-66241.out
Job id is load_files-66241, running on system02-ib
Teradata Parallel Transporter DataConnector Version 08.02.00.01
Teradata Parallel Transporter Stream Operator Version 08.02.00.00
READ_DATA: Operator instance 1 processing file 'File00006'.
READ_DATA: Operator instance 1 processing file 'File00001'.
READ_DATA: Operator instance 1 processing file 'File00003'.
READ_DATA: Operator instance 1 processing file 'File00005'.
READ_DATA: Operator instance 1 processing file 'File00016'.
READ_DATA: Operator instance 1 processing file 'File00011'.
READ_DATA: Operator instance 1 processing file 'File00015'.
READ_DATA: Operator instance 1 processing file 'File00007'.
READ_DATA: Operator instance 1 processing file 'File00008'.
READ_DATA: Operator instance 1 processing file 'File00013'.
READ_DATA: Operator instance 1 processing file 'File00009'.
READ_DATA: Operator instance 1 processing file 'File00014'.
READ_DATA: Operator instance 1 processing file 'File00012'.
READ_DATA: Operator instance 1 processing file 'File00002'.
READ_DATA: Operator instance 1 processing file 'File00010'.
READ_DATA: Operator instance 1 processing file 'File00004'.
STREAM_OPERATOR: connecting sessions
PXTB_AllocateMessage: Cannot create data buffer, Data Stream status = 3
1104: Insufficient main storage for attempted allocation

Cause:

Data moves from the producer operator instances to the consumer operator instances in data streams. Teradata PT allows allocation of up to 10MB of shared memory for use in servicing data streams, which imposes a limit of approximately 75 data streams for a job. When this limit is exceeded, the job can no longer allocate more buffers in the Data Stream, which causes the job to terminate.

For more detailed information about the relationship between instance usage and shared memory, see “Determining the Use of Shared Memory” on page 263.
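As a back-of-the-envelope aid, the figures above (10 MB supporting approximately 75 data streams) can be turned into a rough first guess for a larger shared memory size. The sketch below assumes that memory use scales linearly with the number of data streams, which this section does not state. The function name is hypothetical, and the result is only a starting point to check against “Determining the Use of Shared Memory”. The 128 MB cap is the maximum that this section documents for tbuild -h.

```python
DEFAULT_SHARED_MEMORY_MB = 10   # default shared memory stated in this section
APPROX_STREAMS_AT_DEFAULT = 75  # approximate data stream limit at the default size
MAX_SHARED_MEMORY_MB = 128      # largest value accepted by tbuild -h


def estimate_shared_memory_mb(data_streams):
    """Rough shared memory size in MB for `data_streams`, assuming linear scaling."""
    # Integer ceiling division avoids floating-point rounding surprises.
    needed = (
        data_streams * DEFAULT_SHARED_MEMORY_MB + APPROX_STREAMS_AT_DEFAULT - 1
    ) // APPROX_STREAMS_AT_DEFAULT
    # Never go below the default, and never above the documented maximum.
    return min(max(needed, DEFAULT_SHARED_MEMORY_MB), MAX_SHARED_MEMORY_MB)


if __name__ == "__main__":
    for streams in (75, 100, 150):
        print(streams, "streams ->", estimate_shared_memory_mb(streams), "MB")
```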

Corrective Action:

1 Do one of the following:

  • Decrease the number of consumer or producer instances.
  • Use the tbuild -h option to increase the shared memory size for the job. For details, see “Allocating Shared Memory” on page 181.

2 Re-launch the job.

For required syntax and a description of tbuild -h, see “tbuild,” in the Teradata Parallel Transporter Reference.

Allocating Shared Memory

By default, Teradata PT provides 10 MB of shared memory for the execution of a job script. The tbuild -h option allows you to adjust the shared memory to more accurately reflect the needs of the job, as follows:

  • Use -h value to specify a value in bytes ranging from 1,048,576 (that is, 1 MB) to 134,217,728 (that is, 128 MB).
  • Use -h valueK to specify a value in kilobytes ranging from 1024 K (that is, 1,048,576 bytes) to 131,072 K (that is, 134,217,728 bytes).
  • Use -h valueM to specify a value in megabytes ranging from 1 MB (that is, 1,048,576 bytes) to 128 MB (that is, 134,217,728 bytes).

For information on how to calculate shared memory usage, see “Determining the Use of Shared Memory” on page 263.
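The three forms of the -h value describe the same range in different units. The following sketch converts a value to bytes and checks it against the documented 1 MB to 128 MB range before a job is launched. It is not the parser used by tbuild. The function name is hypothetical, and the choice to accept only the uppercase K and M suffixes shown above is an assumption.

```python
MIN_BYTES = 1_048_576     # 1 MB, the documented minimum
MAX_BYTES = 134_217_728   # 128 MB, the documented maximum

_MULTIPLIERS = {"K": 1024, "M": 1024 * 1024}


def h_value_to_bytes(value):
    """Convert a -h argument such as '2097152', '2048K' or '32M' to bytes."""
    text = value.strip()
    if not text:
        raise ValueError("empty value")
    # A trailing K or M selects the unit; no suffix means bytes.
    if text[-1] in _MULTIPLIERS:
        digits, multiplier = text[:-1], _MULTIPLIERS[text[-1]]
    else:
        digits, multiplier = text, 1
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a valid size: %r" % value)
    size = int(digits) * multiplier
    if not MIN_BYTES <= size <= MAX_BYTES:
        raise ValueError("%d bytes is outside the 1 MB to 128 MB range" % size)
    return size


if __name__ == "__main__":
    for arg in ("1048576", "1024K", "32M"):
        print(arg, "->", h_value_to_bytes(arg), "bytes")
```

Values such as 512K or 129M fall outside the documented range and raise ValueError in this sketch.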

Error Case 12: Log File is Full

Teradata PT console message:

Teradata Parallel Transporter Version <version>
Execution Plan generation started.
Execution Plan generation successfully completed.
Job log: /opt/Teradata/Client/<version>/tbuild/logs/root-2.out
1403: Unable to Write data to the file, System errno: 113 
(EDC5113I Bad file descriptor CEE5213S The signal SIGPIPE was received.)

Cause:

Error 113 occurs because the log file is full: the job directory has run out of disk space.

Corrective Action:

Delete unused log files from the directory.
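Before deleting anything, it can be useful to list the candidate files and check how much space is free. The following is a minimal sketch for platforms where Python is available. The helper names and the age threshold are hypothetical, and the sketch assumes that an old modification time is a reasonable proxy for "unused", which should be confirmed for each site. It only reports files and never deletes them.

```python
import os
import shutil
import time


def find_old_logs(log_dir, max_age_days, now=None):
    """Return sorted paths of regular files in log_dir not modified for max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    old = []
    for entry in os.scandir(log_dir):
        if entry.is_file(follow_symlinks=False) and entry.stat().st_mtime < cutoff:
            old.append(entry.path)
    return sorted(old)


def free_megabytes(path):
    """Return the free space, in whole megabytes, on the file system holding path."""
    return shutil.disk_usage(path).free // (1024 * 1024)


if __name__ == "__main__":
    directory = "."  # replace with the Teradata PT logs directory for the job
    print("free MB:", free_megabytes(directory))
    for path in find_old_logs(directory, max_age_days=30):
        print("candidate for deletion:", path)
```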