Performance Aspects of Merging Data Blocks | CREATE TABLE | Teradata Vantage - Performance Aspects of Merging Data Blocks - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Detailed Topics

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
imq1591724555718.ditamap
dita:ditavalPath
imq1591724555718.ditaval
dita:id
B035-1184
lifecycle
previous
Product Category
Teradata Vantage™

Merging small data blocks into larger data blocks only affects the performance of data modification workloads that scan an entire table, so there is no direct effect on SELECT statements. However, because there are fewer data blocks to scan after data block merges complete, there is an indirect positive effect on the performance of all future full-table scan and full-table modification requests. This occurs because full-table scan operations are mostly I/O bound, and reducing the number of data blocks in the table has a direct effect on the I/O load.

Merging data blocks can have both positive and negative performance effects for table modification operations, though the negative effects should be negligible once the majority of the small data blocks in a table have been merged into larger data blocks.

For example, consider a table that has overly small data blocks. This generally occurs either because of a reconfiguration or because you have deleted a substantial number of rows from the table.

When you submit the first full-table modification request on this table after data block merging has been enabled, there is a higher cost to merging data blocks than there would have been if the feature had not been enabled. This occurs because of all the work required to merge large number of small data blocks.

When you submit a second full-table modification request on this table, it could either be more expensive or less expensive than the same operation would have been without data block merging enabled. The factors involved in this uncertainty of this expense are the following.
  • The operation might be more expensive if there are a number of data blocks remaining that were not merged during the first full-table modification.
  • The operation might be less expensive if a substantial number of data blocks were merged during the first full-table modification.
  • The cost of the operation might be a mix of both of these factors.

These factors continue to apply until you have performed enough full-table modifications that the table reaches a point where the majority of its data blocks do not need to be merged. At this point, only the second bullet in the previous list applies, and any full-table modifications should continue to perform significantly better than they would have if you had never enabled data block merges.

Tables with the smallest number of data blocks should provide the best performance during full-table scan modification operations that are mostly I/O bound.

Full-table modifications that are done early in the life of a table with data block merging enabled tend to demonstrate a negative I/O performance impact, as noted in the first bullet because more data blocks are likely to be written or read (or both) from disk more frequently than you would see if this feature were not enabled. However, later full-table scan modifications of such a table are likely to show improvements in I/O operations because the table has a smaller number of blocks to scan or modify, as noted by the second bullet. You should see similar behavior for index subtable updates (with the exception of NUSI subtable updates, which are not affected by merging smaller data blocks into a larger data block).

You also should not see a performance impact for tables on which the principal operation is to append new data to the end of a subtable, because this feature does not scan or update sets of data blocks in the parts of a table that would not otherwise be scanned by the underlying update request.

No more than 8 data blocks can be merged during a single data block merge operation. By default, the resulting merged data block cannot be greater than or equal to 60% of the maximum multirow data block size for a table. This percentage threshold defines the merge block ratio for a table.

The performance of operations on tables that are defined with a minimal block size is not enhanced by merging small data blocks. This is because there are fewer opportunities to merge data blocks in such tables given that the size of most of their data blocks is already close to the merge block ratio threshold. Furthermore, it is unlikely that a database schema that is carefully designed to process its heaviest workloads would be affected by merging smaller data blocks anyway, because you will have already tuned the block sizes of your tables to optimize the most common update operations that common workloads make against them.

You can adjust the merge block ratio for an individual table by modifying the value for MERGEBLOCKRATIO using an ALTER TABLE statement. See ALTER TABLE (Basic Table Parameters).

You also should not see a significant performance impact for INSERT and INSERT … SELECT operations into unpopulated tables because in such a case you are only creating new rows and there are no preexisting blocks to examine for merging. However, INSERT … SELECT operations into populated tables are likely to experience degraded performance for the initial series of update operations on the table. But this should be offset by improved performance for later series of update operations.

Note that when an INSERT … SELECT operation occurs for the case where the source and destination tables have different primary indexes, the rows must be redistributed across the AMPs, so the operation does not proceed smoothly from the beginning of the table to its end. Because this case does not simply append rows to the end of a table, it is possible that you might see degraded performance.

If you find that merging small data blocks for a large number of tables is exerting an unwieldy performance burden on your system, it is possible to deactivate the feature globally by changing the setting of the DisableMergeBlocks DBS Control parameter from its default value of FALSE to TRUE rather than performing a large number of ALTER TABLE statements to eliminate the merge block ratio value on a table-by-table basis. For more information, see Teradata Vantage™ - Database Utilities, B035-1102.