Global settings for data block size are controlled using DBS Control utility settings.
You can also adjust the data block size for individual tables using the DATABLOCKSIZE and MERGEBLOCKRATIO options in CREATE TABLE or ALTER TABLE statements.
PermDBAllocUnit and System Performance
As tables are modified, rows are added, deleted, and changed. Data blocks grow and shrink dynamically to accommodate their current contents. However, data block sizes can change only in units of PermDBAllocUnit. This means there will nearly always be some unused space left at the end of the data block. If table modifications are relatively even, such incremental changes in data block size result in an average of approximately half an allocation unit of space wasted for every data block. (This is a rough approximation, and will depend on many factors that differ from database to database.)
In environments where new rows are added to tables frequently, or where tables with variable-length rows grow frequently, system performance might be improved slightly by increasing the allocation unit size. With a larger allocation unit, data blocks do not need to be enlarged as often, because they already have room for additional changes. However, in environments where new rows are added infrequently, the extra unused space in each block can degrade performance by increasing the average I/O size.
Make only small changes to this setting at a time, and carefully evaluate the results before committing the change on a production system. Set the allocation unit to a multiple of the average row size of tables that change frequently, rounded up to the nearest sector.
Because the benefit of larger allocation units is often offset by the consequent increase in average wasted space, Teradata recommends that PermDBAllocUnit be left at the default setting.
PermDBSize and System Performance
Database performance can be affected by the relationship of data block size to the type of work typically performed by the database:
- When database queries are tactical in nature, involving one or a few table rows, it is advantageous to have fewer rows stored per data block to speed data access. Online transaction processing (OLTP) is an example of this type of work.
- Alternatively, when database queries are strategic in nature, involving complex queries that touch many rows per table, it is advantageous to store many rows in each data block, to minimize costly data I/O operations. Decision support systems (DSS) and complex report generation are examples of this type of work.
PermDBSize sets the default maximum size used by the system for multirow data blocks in permanent tables. Use a larger value if the database is used primarily for strategic work, and a smaller value if the database is used primarily for tactical work.
In a mixed-work environment, determine a value for PermDBSize based on the kind of work typically performed by the database. For tables involved in other types of work, PermDBSize can be overridden on a table-by-table basis using the DATABLOCKSIZE option of the CREATE TABLE and ALTER TABLE SQL statements.
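For example, a table that serves mostly tactical, single-row lookups might override the global default with a smaller per-table maximum. The database, table, column, and size values below are hypothetical illustrations, not recommendations:

```sql
-- Hypothetical OLTP-style lookup table: limit multirow data blocks to a
-- smaller maximum than the system-wide PermDBSize default.
CREATE TABLE Sales.Order_Lookup,
    DATABLOCKSIZE = 16384 BYTES
    (Order_Id  INTEGER NOT NULL,
     Cust_Id   INTEGER,
     Order_Amt DECIMAL(10,2))
UNIQUE PRIMARY INDEX (Order_Id);
```

Because DATABLOCKSIZE is a table option, it appears in the comma-separated options list between the table name and the column definitions.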
Adjusting DATABLOCKSIZE in Tables
You can control the default size for multirow data blocks for individual tables using the:
- DATABLOCKSIZE option
- MERGEBLOCKRATIO option
in the CREATE TABLE and ALTER TABLE statements, as follows.
|IF you specify…||THEN…|
|DATABLOCKSIZE in CREATE TABLE||the data block can grow to the size specified in DATABLOCKSIZE instead of being limited to the global PermDBSize.
For any row, the system uses only the data block size required to contain the row.|
|DATABLOCKSIZE in ALTER TABLE||the data blocks can grow to the size specified in DATABLOCKSIZE where the row size requires the growth.
Whether they are adjusted to the new size immediately or gradually over time depends on the use of the IMMEDIATE clause.|
|MERGEBLOCKRATIO in CREATE TABLE||the system does not attempt to combine blocks if the resulting block would be larger than the specified percentage of the maximum multirow data block size.|
|MERGEBLOCKRATIO in ALTER TABLE||the size of the resulting block when multiple existing blocks are merged has an upper limit.
Whether logically adjacent blocks are deemed mergeable with the single block being modified depends on this limit.
Blocks can still be initially loaded at the PermDBSize specification or at the block size specified with the DATABLOCKSIZE option. Merges occur only during full-table modifications.
MERGEBLOCKRATIO and the default size for multirow data blocks are unrelated.|
|the IMMEDIATE clause||the rows in all existing data blocks of the table are repacked into blocks of the newly specified size. For large tables, this can be a time-consuming operation, requiring enough spool space to accommodate two copies of the table while it is being rebuilt.
If you do not specify the IMMEDIATE clause, existing data blocks are not modified. As individual data blocks of the table are modified by user transactions, the new value of DATABLOCKSIZE is applied. Thus, the table changes over time to reflect the new block size.|
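The options above can be combined in a single statement. For example (the table name and size are illustrative), an existing table used mainly for full-table scans could be altered to a larger maximum block size:

```sql
-- Hypothetical example: raise the maximum multirow data block size.
-- With IMMEDIATE, all existing blocks are repacked now (spool-intensive);
-- without it, blocks adopt the new size only as they are modified.
ALTER TABLE Sales.Order_History,
    DATABLOCKSIZE = 127 KBYTES IMMEDIATE;
```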
To evaluate actual block sizes of tables, you can run the SHOWBLOCKS command of the Ferret utility. For more information on running this command, see Utilities. To specify the global data block size, use PermDBSize (see “PermDBSize” in Utilities).
You can also use SQL macros to generate SHOWBLOCKS-like information. The PopulateFsysInfoTable macro generates file system data to a table created by the CreateFsysInfoTable macro. The PopulateFsysInfoTable_ANSI macro generates file system data to a table created by the CreateFsysInfoTable_ANSI macro. For more information, see the chapter on file system information macros in SQL Functions, Operators, Expressions, and Predicates.
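As a sketch of how the macros fit together (the database and table names are placeholders, and the parameter forms vary by release, so treat the argument lists below as assumptions and consult the file system information macros chapter for the exact signatures):

```sql
-- Create a target table for file system information, then populate it
-- with SHOWBLOCKS-style data for a source table. All names and
-- parameter values here are illustrative.
EXEC CreateFsysInfoTable('StatsDB', 'BlockInfo', 'PERM', 'Y',
                         'SHOWBLOCKS', 'M');
EXEC PopulateFsysInfoTable('Sales', 'Order_History',
                           'SHOWBLOCKS', 'M',
                           'StatsDB', 'BlockInfo');
```

Once populated, the target table can be queried with ordinary SELECT statements, which makes it easier to track block sizes over time than rerunning Ferret interactively.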
Specifying Maximum Data Block Size
You can set the maximum block size in two ways.
|Set PermDBSize via the DBS Control utility||When you set the maximum block size at the system level, tables use that value until a new value is set system-wide.|
|Use the CREATE or ALTER TABLE command||When you set maximum block size at the table level, this value remains the same until you execute an ALTER TABLE command to change it.|
Larger block sizes enhance full-table scan operations by selecting more rows in a single I/O. The goal for DSS is to minimize the number of I/O operations, thus reducing the overall time spent on transactions.
Smaller block sizes are best used on transaction-oriented systems to minimize overhead by only retrieving what is needed.
For more information on data block size, see “PermDBSize” in Utilities.

Rows cannot cross block boundaries. If an INSERT or UPDATE causes a block to expand beyond the defined maximum block size, the system splits the block into two or three blocks, depending on the following.
|IF…||AND…||THEN…|
|the new or changed row belongs at the beginning or end of a block||a block containing only that row would be larger than the defined maximum block size for the table||the row is placed in a block by itself.|
|the new or changed row belongs in the middle of a block||a block containing only that row would be larger than the defined maximum block size for the table||a three-way block split is performed.
The existing block is split into two blocks at the point where the modified row belongs, and a new block containing only that row is created between them.|
|the existing block can grow to accommodate the modification without exceeding the defined maximum size for the table||||a single new block containing the existing rows and the new or changed row is created.|
|the existing block cannot grow to accommodate the modification without exceeding the defined maximum size for the table||||the modification is applied and then the block is split into as many parts as required (and as few as possible) so that no part is larger than the defined maximum for the table.|
Additional special rules exist that take precedence over the preceding rules. For example:
- Rows of different subtables never coexist in data blocks.
- Spool tables are almost always built as whole blocks written at the maximum size defined for spool, each containing many rows (provided there are enough rows).
Using Data Block Merge
The Datablock Merge operation reduces the number of small data blocks in a table by combining them with logically adjacent data blocks. It enables the Teradata file system to automatically merge smaller data blocks into a single larger data block. Having fewer blocks reduces the number of I/O operations required to read and modify a large set of rows from disk. The merge:
- Runs automatically.
- Does not require manual investigation of block size histograms.
- Does not require any specific AMP level locking.
- Searches continuously for small blocks to merge, even if some of those blocks do not require updates.
- Only affects the performance of workloads that modify data. Read-only workloads, insert operations into new subtables, and prime key modification requests are not affected.
Moreover, Datablock Merge is only applicable to permanent and permanent journal tables, not to global temporary or volatile tables.
For the DBS Control fields that support datablock merge, see Utilities.
For the option, MergeBlockRatio, which defines the size of the resulting block when multiple existing blocks are being merged, see “CREATE TABLE” and “ALTER TABLE” in SQL Data Definition Language Syntax and Examples.
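For example (the table names and percentage are illustrative), the merge limit can be adjusted or disabled per table:

```sql
-- Hypothetical example: merged blocks for this table may not exceed
-- 60 percent of its maximum multirow data block size.
ALTER TABLE Sales.Order_History,
    MERGEBLOCKRATIO = 60 PERCENT;

-- Disable data block merges for a table entirely.
ALTER TABLE Sales.Order_Staging,
    NO MERGEBLOCKRATIO;
```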
Effects of Journal Data Block Size
Journal data block sizes may affect I/O usage. A larger journal data block size may reduce I/O, but it can also waste data block space. You can set the journal data block size using the JournalDBSize field of the DBS Control utility. See Utilities.