The number of hits per block is the number of data block rows accessed (inserted, deleted, or updated) during a table operation. Typically, the more hits per block, the better the performance, if the hits can be combined into one data block update (for example, with an INSERT ... SELECT for a set of updates containing multiple rows).
If a large number of the data blocks for a table have become significantly smaller than half the maximum size for the defined maximum data block size, an appropriate specification for the MERGEBLOCKRATIO option of the ALTER TABLE and CREATE TABLE statements can enable the file system to merge up to 8 data blocks into a single larger data block. For more information, see ALTER TABLE and CREATE TABLE Syntax.
However, if each data block hit involves a separate read operation, with hits occurring semi-randomly, performance is worse.
Data Processing Scenarios for the Hit Rates Studied
The following table provides data processing scenarios that correspond with the hit rates studied.
| Number of Hits per Data Block | Percentage of Rows Touched | Scenario |
|---|---|---|
| 1 | 0.2 | Update a table that contains 500 days of data (roughly 1.5 years) with one typical day of data. |
| 5 | 1.0 | Insert into a table that contains 104 weeks of data (2 years) one typical week of data. |
| 20 | 4.0 | Delete from a table that contains 104 weeks of data (2 years) one typical month (4 weeks) of data. |
Standard Test Procedure
The standard procedure for these tests was to make any maintenance changes to the left table in the join.
Maintenance Costs in Terms of Elapsed Time
Elapsed times increase as a function of increased hits per data block because more rows are touched. At the same time, the CPU path per row times decrease because the elapsed times increase at a lesser rate than the increase in the number of rows touched.
Maintenance Costs in Terms of CPU Path Length Per Transaction
Suppose you report the same information in terms of CPU path length per transaction. For this data, the term transaction equates to qualifying row. This number is a measure of the amount of CPU time a transaction requires; roughly, the number of instructions performed per transaction.
- CPU path per transaction figures include both CPU usage information and elapsed time.
- The measure is normalized, and therefore factors out configuration-specific variables such as number of CPUs per node and number of nodes, making the numbers comparable across all configurations.