Teradata Database disk space is divided into cylinders, with each cylinder being made up of a number of data blocks, each of which contains one or more data rows. The database allocates cylinders as follows:
|Contiguous sectors on a cylinder||Data blocks are stored on adjacent sectors in a cylinder. If a cylinder has 20 available sectors, but only 10 are contiguous, a 15-sector block must be stored on another cylinder.|
|Free cylinders||Teradata Database performs better if permanent data is distributed across multiple cylinders. However, permanent data and spool data cannot share the same cylinder. Therefore, a system must always have empty cylinders that can be used for spool space.|
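The contiguity rule can be illustrated with a short sketch. It models a cylinder as a list of booleans (True for a free sector), which is an assumption made purely for illustration; the names `largest_free_run` and `block_fits` are hypothetical and are not part of any Teradata interface.

```python
from typing import List


def largest_free_run(free_map: List[bool]) -> int:
    """Return the length of the longest run of adjacent free sectors."""
    best = current = 0
    for is_free in free_map:
        current = current + 1 if is_free else 0
        best = max(best, current)
    return best


def block_fits(free_map: List[bool], block_sectors: int) -> bool:
    """A block fits only if enough contiguous free sectors exist."""
    return largest_free_run(free_map) >= block_sectors


# Hypothetical cylinder: 20 free sectors in total, split into two runs of 10.
cylinder = [True] * 10 + [False] * 5 + [True] * 10

print(sum(cylinder))             # total number of free sectors
print(block_fits(cylinder, 15))  # does a 15-sector block fit?
print(block_fits(cylinder, 10))  # does a 10-sector block fit?
```

The total free count alone is not enough to decide placement; only the longest contiguous run matters, which is why the 15-sector block in the example must go to another cylinder.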
Teradata Database does not run out of disk space until it allocates and fully utilizes all cylinders.
Low Cylinder Utilization
Performance degradations can occur, however, as soon as the system gets close to exhausting the free cylinder pool. This happens because the system performs MiniCylPacks on cylinders with low utilization in order to reclaim the unused disk space. Therefore, you should be aware if you are running out of space due to a preponderance of under-utilized cylinders.
Low utilization of cylinders can occur when:
- You FastLoad a table using a small FreeSpacePercent (FSP) and then insert more additional data into the table than the FSP can accommodate.
- You delete a significant percentage of a table but have not yet run Ferret PACKDISK to reclaim the space.
Frequently Updated Tables
With frequently updated tables, the free space on the cylinder can become so fragmented that it cannot be used.
When this occurs, the system could allocate additional cylinders to the table. To avoid this problem, the system sometimes performs a cylinder defragmentation to make the free space on the cylinder usable again.
AutoCylPack (automatic background cylinder packing) manages space on cylinders by finding cylinders with low or high utilization and returning them to their user-defined FSP (Free Space Percent). The FSP is the percentage of storage space within a cylinder that AutoCylPack leaves free of data to allow for future table growth.
There is a system-wide default FSP. It can be overridden on a table-by-table basis by specifying the FSP clause in a CREATE TABLE or ALTER TABLE statement.
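As a rough illustration of how a system-wide default and a per-table override combine, the sketch below resolves the effective FSP and converts it into a target fill level. The function names and the 1000-sector cylinder size are hypothetical values chosen for the example, not Teradata parameters.

```python
from typing import Optional


def effective_fsp(system_default: float, table_override: Optional[float] = None) -> float:
    """A table-level FSP, when given, takes precedence over the system default."""
    return system_default if table_override is None else table_override


def target_used_sectors(cylinder_sectors: int, fsp_percent: float) -> int:
    """Sectors that may hold data when fsp_percent of the cylinder is left free."""
    if not 0 <= fsp_percent <= 100:
        raise ValueError("FSP must be between 0 and 100")
    return int(cylinder_sectors * (100 - fsp_percent) / 100)


CYLINDER_SECTORS = 1000  # hypothetical cylinder size, for illustration only

# Table without an override inherits the (assumed) system default of 10%.
print(target_used_sectors(CYLINDER_SECTORS, effective_fsp(10)))
# Read-only table overridden to 0% can be packed completely full.
print(target_used_sectors(CYLINDER_SECTORS, effective_fsp(10, 0)))
```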
It is important for the Teradata Database file system to maintain enough free cylinders and to manage the space on the cylinders.
Although AutoCylPack runs as a background task issuing I/Os, you can adjust its level of impact. For the DBS Control fields that support AutoCylPack, see Utilities.
AutoCylPack helps reduce:
- The chances that MiniCylPacks run. MiniCylPacks are strictly concerned with reclaiming whole cylinders.
- The need for you to perform regular Ferret PACKDISK operations.
PACKDISK can be performed on a table, a range of tables or an entire system, but it affects performance of the foreground tasks and thus system performance.
If cylinders have higher utilization:
- System performance improves because there is a higher data block to cylinder index ratio and more effective use of Cylinder Read. Fewer unoccupied sectors are read in.
- A table occupies fewer cylinders. This leaves more cylinders available for other uses.
A MiniCylPack moves data blocks in logical sequence from cylinder to cylinder, stopping when the required number of free cylinders is available. A single MiniCylPack may affect 2 to 20 cylinders on an AMP.
The process continues until one cylinder is completely emptied. The master index begins the next required MiniCylPack at the location that the last MiniCylPack completed.
The Teradata Database file system starts a MiniCylPack when the number of free cylinders drops to the value set by MiniCylPackLowCylProd. The default is 10.
The File Information Block (FIB) keeps a history of the last five cylinders allocated to avoid MiniCylPacks on them.
Use the DBS Control (see “MiniCylPackLowCylProd” in Utilities) to specify the free cylinder threshold that causes a MiniCylPack. If the system needs a free cylinder and none are available, a MiniCylPack occurs spontaneously.
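The trigger condition and the FIB history can be sketched in a few lines. This is a simplified model, not the file system's actual logic: the function names and the cylinder IDs are hypothetical, and only the threshold default of 10 and the history length of five come from the text.

```python
from collections import deque

MINI_CYLPACK_LOW_CYL_PROD = 10  # default threshold stated in the text


def needs_minicylpack(free_cylinders: int,
                      threshold: int = MINI_CYLPACK_LOW_CYL_PROD) -> bool:
    """True once the free cylinder count has dropped to the threshold."""
    return free_cylinders <= threshold


# The FIB remembers the last five allocated cylinders; a bounded deque
# models that history by discarding the oldest entry automatically.
recently_allocated = deque(maxlen=5)
for cylinder_id in range(1, 8):  # hypothetical cylinder IDs 1..7
    recently_allocated.append(cylinder_id)


def is_pack_candidate(cylinder_id: int) -> bool:
    """Recently allocated cylinders are skipped as MiniCylPack candidates."""
    return cylinder_id not in recently_allocated


print(needs_minicylpack(12))     # above the threshold
print(needs_minicylpack(10))     # at the threshold
print(list(recently_allocated))  # only the five most recent IDs remain
```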
Migrating Data Blocks
1. If space can be made available either by migrating blocks forward to the next cylinder or backward to the previous cylinder, choose the direction that requires moving the fewest blocks. If the number of blocks is the same, choose the direction of the cylinder with the most free sectors.
2. If Step 1 fails to free the desired sectors, try migrating blocks in the other direction.
3. If space can be made available only by allocating a new cylinder, allocate a new cylinder. The preference is to add a new cylinder:
   - Before the current cylinder for permanent tables.
   - After the current cylinder for spool tables and while performing FastLoads.
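The selection rules above can be sketched as follows. This is a minimal model under stated assumptions: the `Option` class, its fields, and the function names are hypothetical, and the sketch covers only the ordering of attempts, not the special-case rules for sort and restore.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Option:
    """One migration direction (hypothetical model)."""
    direction: str                 # "forward" or "backward"
    blocks_to_move: Optional[int]  # None if this direction cannot free the space
    free_sectors: int              # free sectors on the neighboring cylinder


def choose_direction(forward: Option, backward: Option) -> Optional[str]:
    """Prefer the fewest blocks to move; break ties on the most free sectors."""
    feasible = [o for o in (forward, backward) if o.blocks_to_move is not None]
    if not feasible:
        return None
    best = min(feasible, key=lambda o: (o.blocks_to_move, -o.free_sectors))
    return best.direction


def new_cylinder_position(is_spool_or_fastload: bool) -> str:
    """Permanent tables prefer 'before'; spool and FastLoad prefer 'after'."""
    return "after" if is_spool_or_fastload else "before"


def migration_plan(forward: Option, backward: Option,
                   is_spool_or_fastload: bool) -> List[str]:
    """Ordered attempts: preferred direction, other direction, new cylinder."""
    plan = []
    first = choose_direction(forward, backward)
    if first is not None:
        other = "backward" if first == "forward" else "forward"
        plan += [f"migrate {first}", f"migrate {other}"]
    position = new_cylinder_position(is_spool_or_fastload)
    plan.append(f"allocate new cylinder {position} current")
    return plan


print(migration_plan(Option("forward", 3, 40), Option("backward", 5, 100), False))
```

Sorting on the pair (blocks to move, negated free sectors) expresses both the main rule and its tie-breaker in a single comparison.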
When migrating either forward or backward, the number of blocks may vary because the system considers different blocks for migration.
Because of the restriction on key ranges within a cylinder, the system, when migrating backward, must move tables and rows with the lowest keys. When migrating forward, the system must move tables and rows with the largest keys.
The system follows special rules for migrating blocks between cylinders to cover special uses, such as sort and restore. There are minor variations of these special rules, such as migrating more data blocks than required in anticipation of additional needs, and looking for subtable breaks on a cylinder to decide how many data blocks to attempt to migrate.
Although cylinder packing itself has a small impact on performance, it often coincides with other performance-impacting conditions or events. When the Teradata Database file system performs a MiniCylPack, the operation frees exactly one cylinder.
The cylinder packing operation itself runs at the priority of the user whose job needed the free cylinder. The cylinder packing operation is the last step the system can take to recover space in order to perform a write operation, and it is a signal that the system is out of space.
Needing to pack cylinders may be a temporary condition, for example when a query, or group of queries, with very high spool usage consumes all available free space. This is not a desirable condition.
If space is a problem:
- Enable AutoCylPack if you have not done so, and specify an FSP of 0% for read-only tables using the CREATE TABLE or ALTER TABLE statement.
- Run the Ferret PACKDISK command.
MiniCylPacks are a natural occurrence and serve as a warning that the system may be running short on space. Tightly packed data can encourage future cylinder allocation, which in turn triggers more MiniCylPacks.
The system logs MiniCylPacks in Software_Event_LogV with the following error codes:
|340514100||Summary of MiniCylPacks done at threshold set via the DBS Control.|
|340514200||A MiniCylPack occurred during processing and a task was waiting for it to complete.|
|340514300||The system could not free cylinders using MiniCylPack. The MiniCylPack failed. This means that the system is either getting too full or that the free cylinder threshold is set unreasonably high. Investigate this error code immediately.|
Frequent 340514200 or 340514300 messages indicate that the configuration is under stress, often from large spool file requirements on all AMPs. MiniCylPacks tend to occur across all AMPs until spool requirements subside. This impacts all running requests.
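One way to keep an eye on these codes is to tally them once they have been retrieved from the event log. The sketch below assumes the codes are already available as a plain list of integers; how they are queried from Software_Event_LogV is not shown, and the `summarize` helper is hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

# Codes and meanings taken from the table above.
MINICYLPACK_EVENTS = {
    340514100: "summary of MiniCylPacks done at the DBS Control threshold",
    340514200: "MiniCylPack occurred while a task waited for it to complete",
    340514300: "MiniCylPack failed to free cylinders",
}
# Codes whose frequent appearance indicates a configuration under stress.
STRESS_CODES = {340514200, 340514300}


def summarize(codes: Iterable[int]) -> Tuple[Counter, int]:
    """Count MiniCylPack events and total the stress-related ones."""
    counts = Counter(c for c in codes if c in MINICYLPACK_EVENTS)
    stress_total = sum(counts[c] for c in STRESS_CODES)
    return counts, stress_total


# Hypothetical sample; 12345 stands in for an unrelated event code.
sample = [340514100, 340514200, 340514200, 340514300, 12345]
counts, stress_total = summarize(sample)
for code, n in sorted(counts.items()):
    print(code, n, MINICYLPACK_EVENTS[code])
print("stress events:", stress_total)
```

What counts as "frequent" depends on the site and workload, so the sketch reports totals and leaves the threshold to the reader.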
If table data is skewed, you might see MiniCylPacks even if Teradata Database has not used up most of the disk space.
As random updates occur over time, empty gaps become scattered between data blocks on the cylinder. This is known as fragmentation. When a cylinder is fragmented, total free space may be sufficient for future updates, but the cylinder may not contain enough contiguous sectors to store a particular data block. This can cause cylinder migrates and even new cylinder allocations when new cylinders may be in short supply. To alleviate this problem, the file system defragments a fragmented cylinder, which collects all free space into contiguous sectors.
Use the DBS Control (see “DefragLowCylProd” in Utilities) to specify the free cylinder threshold that causes defragmentation. When the system reaches this free cylinder threshold, it defragments cylinders as a background task.
To defragment a cylinder, the file system allocates a new cylinder and copies data from the fragmented cylinder to the new one. The old cylinder eventually becomes free, resulting in a defragmented cylinder with no change in the number of available free cylinders.
Since the copy is done in order, this results in the new cylinder having a single, free-sector entry that describes all the free sectors on the cylinder. New sector requests on this cylinder are completed successfully, whereas before they may have failed.
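The effect of the in-order copy can be sketched with a toy model in which each data block is a (start sector, length) pair. The 100-sector cylinder, the block layout, and the function names are assumptions made for illustration only.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start_sector, length)


def free_extents(blocks: List[Extent], total_sectors: int) -> List[Extent]:
    """List the gaps around the blocks as (start, length) free-sector entries."""
    gaps, cursor = [], 0
    for start, length in sorted(blocks):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = start + length
    if cursor < total_sectors:
        gaps.append((cursor, total_sectors - cursor))
    return gaps


def defragment(blocks: List[Extent]) -> List[Extent]:
    """Copy blocks in order onto a fresh cylinder, packing them back to back."""
    packed, cursor = [], 0
    for _, length in sorted(blocks):
        packed.append((cursor, length))
        cursor += length
    return packed


TOTAL = 100  # hypothetical cylinder size in sectors
fragmented = [(0, 10), (15, 20), (40, 5), (60, 30)]

print(free_extents(fragmented, TOTAL))              # several scattered gaps
print(free_extents(defragment(fragmented), TOTAL))  # one entry with all free sectors
```

The amount of free space is unchanged by the copy; only its layout changes, from scattered gaps to a single contiguous run at the end of the new cylinder.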
The Ferret Defrag command is a defragmentation command. It can be used to scan the entire set of user cylinders and then to defragment the qualified cylinders.
However, running the Ferret Defrag command on a busy customer system affects the performance of foreground tasks and thus overall system performance. Moreover, Ferret Defrag has no way of knowing whether enough free cylinders are already available in the system, in which case no further defragmentation is required.
While AutoCylPack, MiniCylPack, and defragmentation operations help the system reclaim free disk space for further use, they incur a performance cost. Properly size and tune the system to avoid this overhead.