An SQL query, which includes DELETE, INSERT, MERGE, and UPDATE and SELECT, can affect one AMP, multiple AMPs, or all AMPs in the configuration.
| Query | Effect |
|---|---|
| Involves a single table and uses a unique primary index (UPI) | The row hash can be used to identify a single AMP. At most one row can be returned. |
| Involves a single table and uses a nonunique primary index (NUPI) | The row hash can be used to identify a single AMP. Any number of rows can be returned. |
| Uses a unique secondary index (USI) | One or two AMPs are affected (one AMP if the subtable and base table are on the same AMP). At most one row can be returned. |
| Uses a nonunique secondary index (NUSI) | If the table has a partitioned primary index (PPI) and the NUSI is the same column set as a NUPI, the query affects one AMP. Otherwise, all AMPs take part in the operation and any number of rows can be returned. |
The SELECT statements in subsequent examples reference the following table data.
| Employee | ||||||||
| Employee Number | Manager Employee Number | Dept. Number | Job Code | Last Name | First Name | Hire Date | Birth Date | Salary Amount |
| PK/UPI | FK | FK | FK | |||||
| 1006 | 1019 | 301 | 312101 | Stein | John | 961005 | 631015 | 2945000 |
| 1008 | 1019 | 301 | 312102 | Kanieski | Carol | 970201 | 680517 | 2925000 |
| 1005 | 0801 | 403 | 431100 | Ryan | Loretta | 1061015 | 650910 | 3120000 |
| 1004 | 1003 | 401 | 412101 | Johnson | Darlene | 1061015 | 760423 | 3630000 |
| 1007 | 1005 | 403 | 432101 | Villegas | Arnando | 1050102 | 770131 | 4970000 |
| 1003 | 0801 | 401 | 411100 | Trader | James | 960731 | 670619 | 3755000 |
| 1016 | 0801 | 302 | 321100 | Rogers | Nora | 980310 | 690904 | 5650000 |
| 1012 | 1005 | 403 | 432101 | Hopkins | Paulene | 970315 | 720218 | 3790000 |
| 1019 | 0801 | 301 | 311100 | Kubic | Ron | 980801 | 721211 | 5770000 |
| 1023 | 1017 | 501 | 512101 | Rabbit | Peter | 1040301 | 621029 | 2650000 |
| 1083 | 0801 | 619 | 414221 | Kimble | George | 1010312 | 810330 | 3620000 |
| 1017 | 0801 | 501 | 511100 | Runyon | Irene | 980501 | 611110 | 6600000 |
| 1001 | 1003 | 401 | 412101 | Hoover | William | 1010818 | 700114 | 2552500 |
The meanings of the abbreviations are as follows.
| Abbreviation | Meaning |
|---|---|
| PK | Primary Key |
| FK | Foreign Key |
| UPI | Unique Primary Index |
Single AMP Request
Assume that a PE gets the following SELECT statement:
SELECT last_name FROM Employee WHERE employee_number = 1008;
Because a unique primary index value is used as the search condition (the column employee_number is the primary index for the Employee table), PE1 generates a single AMP step requesting the row for employee 1008. The AMP step, with the PE identification, is put into a message, and sent using the BYNET to the relevant AMP (processor).
Flow Diagram of a Single AMP Request
Only one BYNET is shown to simplify the illustration.
As shown in the next diagram, AMP2 retrieves the row from disk, includes the row and the PE identification in a return message, and sends the message back to PE1 using the BYNET. PE1 accepts the message and returns the response row to the requesting application.
Flow Diagram of a Single AMP Response to Requesting PE
The following diagram shows a single AMP request with partition elimination.
All AMP Request
Assume PE1 gets a SELECT statement that specifies a range of primary index values as a search condition as shown in the following example:
SELECT last_name, employee_number FROM employee WHERE employee_number BETWEEEN 1001 AND 1010 ORDER BY last_name;
Each value hashes differently and all AMPs must search for the qualifying rows.
- Retrieve rows between 1001 and 1010
- Sort ascending on last_name
- Merge the sorted rows to form the answer set
PE1 then builds a message for each AMP step and puts that message onto the BYNET. Typically, each AMP step is completed before the next one begins. However queries may generate parallel steps.
When PE1 puts the message for the first AMP step on the BYNET, that message is broadcast to all processors as the following diagram shows.
- All AMPs accept the message, but the PEs do not.
- Each AMP checks for qualifying rows on its disk storage units.
- If any qualifying rows are found, the data in the requested columns is converted to the client format and copied to a spool file.
- Each AMP completes the step, whether rows were found or not, and puts a completion message on the BYNET.
The completion messages flow across the BYNET to PE1.
- When all AMPs have returned a completion message, PE1 transmits a message containing AMP Step 2 to the BYNET.
On receipt of Step 2, the AMPs sort their individual answer sets into ascending sequence by last_name, as the following diagram shows.
If partitioned on employee_number, the scan may be limited based on partition elimination. - Each AMP sorts its answer set, then puts a completion message on the BYNET.
- After receiving all completion messages for Step 2, PE1 sends a message containing AMP Step 3.
- On receipt of Step 3, each AMP copies the first block from its sorted spool to the BYNET.
Because there can be multiple AMPs on a single node, each node may be required to handle sort spools from multiple AMPs (see the following diagram).
- Nodes with multiple AMPs sort spools generated by local AMPs.
When the local sort is complete on each node, the lowest sorting row from each node is sent over the BYNET to PE1. From this point on, PE1 acts as the Merge coordinator among all the participating nodes.
- The Merge continues with PE1 building a globally sorted buffer.
When this buffer fills, PE1 forwards the buffer to the application and begins building subsequent buffers.
- When a participant node has exhausted its sort spool, the node sends a Done message to PE1.
This causes PE1 to prune this node from the set of Merge participants.
When there are no remaining Merge participants, PE1 sends the final buffer to the application with an End Of File message.
Partition Elimination
A PPI can increase query efficiency through partition elimination, where partitions that cannot contain qualifying rows can be skipped.
The database supports the following types of partition elimination:
| Type | Description |
|---|---|
| Static | Based on constant conditions such as equality or inequality on the partitioning columns. |
| Dynamic | The partitions to eliminate cannot be determined until the query is run and the data is scanned. |
| Delayed | Occurs with conditions comparing a partitioning column to a USING variable or built-in function such as CURRENT_DATE, where the Optimizer builds a generalized plan for the query but delays partition elimination until specific values of USING variables and built-in functions are known. |
- Partitioning expressions for the primary index of the table
- Conditions in the query
- Ability of the Optimizer to detect partition elimination
- If a query specifies values for all the primary index columns, the AMP where the rows reside can be determined and only a single AMP is accessed.
If conditions are not specified on the partitioning columns, each partition can be probed to find the rows based on the hash value.
If conditions are also specified on the partitioning columns, partition elimination may reduce the number of partitions to be probed on that AMP.
- If a query does not specify the values for all the primary index columns, an all-AMP full file scan is required for a table with an NPPI.
However, with a PPI, if conditions are specified on the partitioning columns, partition elimination may reduce an all-AMP full file scan to an all-AMP scan of only the non-eliminated partitions.
Single AMP Request with Partition Elimination
If a SELECT specifies values for all the primary index columns, the AMP where the rows reside can be determined and only a single AMP is accessed.
If conditions are also specified on the partitioning columns, partition elimination may reduce the number of partitions to be probed on that AMP.
Suppose the Employee table is defined with a single-level PPI where the partitioning column is dept_number.
Assume that a PE gets the following SELECT statement:
SELECT last_name FROM Employee WHERE employee_number = 1023 AND dept_number = 501;
The following flow diagram shows this process.
The AMP Step includes the list of partitions (P3) to access. Partition elimination (static partition elimination) reduces access to the partitions that satisfy the query requirements. In each partition in the list (only P3), look for rows with a given row hash value of the PI.
Partition elimination is similar for the Employee table with a multilevel PPI where one partitioning expression uses the dept_number column and another partitioning expression uses the hire_date column.
Assume that a PE gets the following SELECT statement:
SELECT last_name FROM Employee WHERE employee_number = 1023 AND dept_number = 501 AND hire_date BETWEEN DATE '2006-01-01' AND DATE '2006-12-31'
The following flow diagram shows this process.
No one was hired in department number 401 in 2006, so partition P4 is empty.