Several approaches to physical database design can make one-AMP or few-AMP query plans more likely. Most of these choices need to be made carefully, because they also influence how other work in the system is optimized. All applications running in the database need to be considered, particularly when selecting or changing primary index columns:
- Use the Same Primary Index Definitions
When you expect frequent joins between two associated tables, consider using identical column sets for the primary index definitions of both tables. In other words, define the primary indexes and the join columns are the columns. This technique can enhance both decision support and tactical queries, depending on the frequency of the join and the demographics of the data.
- Increase the Likelihood of Few-AMP Nested Joins
Consider placing USIs on the join constraint of one of the tables. From the perspective of logical database design, you are placing the USI on the primary key of the primary table in the primary key-foreign key relationship.
- Consider a Join Index
You can create a single-table join index or hash index with a different primary index than the base table. For example, you could define a primary index for the join or hash index composed of the column that corresponds to values frequently specified by the application. The example in the following diagram indicates how a query that only has a value for customer name could use the join index for single-AMP access.
- Consider a Global Join Index
A global join index is a single-table join index, similar to the one illustrated above, except for one important difference: each global join index row contains a pointer to its base table that the Optimizer can use as an alternate way to access base table rows. Using a join index for base table access is referred to as partial covering because such a join index only partially covers the query (see Partial Query Coverage and Restrictions on Partial Covering by Join Indexes).
Global join indexes offer the combined advantages of a NUSI (by supporting duplicate rows per value) and a USI (hashed index rows), and the Optimizer can often take advantage of these capabilities for Group AMP operations.
The following graphic indicates how the query first accesses the global join index, then uses the rowID information from the index to access the base table rows:
Suppose you create the following base table:
CREATE MULTISET TABLE adw.customer ( c_custkey DECIMAL(18,0) NOT NULL, c_name VARCHAR(30) CHARACTER SET LATIN CASESPECIFIC NOT NULL, c_address VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL, c_nationkey DECIMAL(18,0) NOT NULL, c_phone CHARACTER(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL, c_mktsegment CHARACTER(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL, c_comment CHARACTER(100) CHARACTER SET LATIN CASESPECIFIC NOT NULL) UNIQUE PRIMARY INDEX ( c_custkey ) INDEX ( c_nationkey );
To support group AMP access to the data in this table, you create the following global join index:
CREATE JOIN INDEX adw_ji AS SELECT (c_phone), (ROWID) FROM customer PRIMARY INDEX(c_phone);
A typical query against the customer table might be something like the following:
SELECT c_name, c_mktsegment FROM customer WHERE c_phone = '5363333428';
To make the Optimizer aware of the opportunity for using group AMP access to respond to this query, you must first collect statistics on the c_phone column of the base table. Otherwise, the Optimizer still uses the global join index, but generates an all-AMP plan instead of the more cost effective group AMP plan.
The main advantages of global join indexes are scalability and throughput. Whether a global join index supports faster processing than a NUSI largely depends on how busy the system is at the time the query is submitted. On a system with a light load, a query supported by a NUSI might run slightly faster, because a NUSI scan is one step that all AMPs perform in parallel. To use a global join index, the plan requires two steps to perform the same operation.
When a system is heavily loaded, a global join index is likely to provide a performance advantage, all things being equal, because it does not impel the overhead of an all-AMP operation. The higher the number of AMPs involved in satisfying a request, the higher the likelihood of experiencing resource contention and experiencing response delays, and the difference is more pronounced as a configuration grows in size. As a result, the practical benefits accrued from using a global join index rather than a NUSI also increase.
Because global join indexes can partially cover a query, some, or even most, of the columns requested by a query need not be defined in the join index itself. Multitable join indexes also support partial covering, though aggregate join indexes do not. Frequently only the primary index of the global join index is carried, along with the unique identifier of the base table it supports.
You can specify any of the following unique identifiers in the definition of a global join index:
- The row ID of the base table (expressed as the keyword ROWID).
- The primary index of the base table.
- A unique secondary index on the base table.