Rules for Using HASH BY or LOCAL ORDER BY to Insert Rows
The following rules apply:
If a NoPI target table has row partitioning that is different than the source table, the system sorts the source rows locally on the internal partition number, then copies the rows locally into the target table. The internal partition number is computed based on the row‑partitioning of the target table.
If you also specify a LOCAL ORDER BY clause or if the target table is row-partitioned, the system orders the rows locally and inserts them locally into the target table or underlying target view table.
This is useful if the result of the subquery does not provide an even distribution of the rows. If the target table or underlying target view table is also column‑partitioned, the locally ordered hash redistribution might also be useful to distribute equal values of a column to the same AMP, which might then enable effective autocompression of the column partitions with the columns on which the hash value is calculated.
Because the object you specify for a HASH BY clause is an expression, the expression can be a function call such as RANDOM(n, m).
HASH BY RANDOM(1, 2000000000) is useful to redistribute each individual selected row when there is no particular column set on which to hash distribute the rows, and when a more even distribution is needed than the HASH BY RANDOM clause provides.
A poor choice of a hash expression can lead to a very uneven distribution, similar to when a poor choice is made for primary index columns.
If you also specify a LOCAL ORDER BY clause, the system orders the rows locally and inserts them locally into the target table or underlying target view table.
This is useful if the result of the subquery does not provide an even distribution of rows. Distributing data blocks is more efficient than distributing individual rows and usually provides a nearly even distribution. However, distributing individual rows using an expression like HASH BY RANDOM(1,2000000000) can provide a more even distribution of rows, which might be necessary in some cases.
If the target table or underlying target view table is also column‑partitioned, this might allow for more effective autocompression of the column partitions with the columns on which the ordering is done.