Using HASHROW
HASHROW is especially useful for evaluating the following things.
As shown in the following example set, you can use the output from HASHROW to compute the average number of rows that cause hash collisions.
Example
This example uses the HASHROW function on a unique primary index.
Assume that the emp_no column is the UPI for the employee table. To find the number of hash collisions generated by the values in emp_no, submit the following request.
SELECT(COUNT(*)(FLOAT))/(COUNT(DISTINCT HASHROW(emp_no)))
FROM employee;
If there are no hash collisions, the result ratio is close to 1. The larger the returned value, the more hash collisions occur and the less efficient the analyzed column expression analyzed is as a primary index.
If a proposed primary index contains non‑unique columns, perform the following check.
1 Submit the request in the example against the proposed primary index column expression and save the result.
2 Resubmit the request against a column expression where each index value is unique and save the result.
3 Compare the result step 1 against the result of step 2.
The comparison should indicate clearly which index generates fewer hash collisions.
Example
The following example uses the HASHROW function against a the same primary index column analyzed in “Example 1” on page 192, but retyped as DECIMAL(2,2).
1 Determine how many hash collisions occur with the existing data type by submitting the same query used in “Example 1” on page 192.
SELECT (COUNT(*)(FLOAT))/COUNT(DISTINCT HASHROW (emp_no))
FROM employee;
2 Determine whether hash collisions would be reduced by changing the data type to DECIMAL(4,3) by submitting the following two requests (both are required because of the restrictions on the DISTINCT option).
SELECT COUNT(DISTINCT emp_no(DECIMAL(4,3)))
FROM employee;
SELECT COUNT(DISTINCT HASHROW(emp_no(DECIMAL(4,3)))
FROM employee;
3 Divide the numbers returned by the two requests to calculate the average number of hash collisions that occur with the new data type.
4 Compare this average with the result of the first request to determine if the change reduces the collision count.
Example
When used together, HASHAMP, HASHBUCKET, and HASHROW indicate the distribution of primary rows among the AMPs.
For example, the following query shows the row distribution of the employee table based on the primary index emp_no, from which you can quickly see whether distribution is even or uneven.
SELECT HASHAMP(HASHBUCKET(HASHROW(emp_no))),COUNT (*)
FROM employee
GROUP BY 1
ORDER BY 1;
IF the distribution is … |
THEN … |
uneven |
try one of the following. |
even |
create your production table defining the primary index or primary AMP index based on the most high-performing column set. For example, either of the following examples where preliminary performance results are based on trials on a test system. |