Table Definitions
Suppose you define the following customer and orders tables.
CREATE TABLE customer ( c_custkey INTEGER NOT NULL, c_name CHARACTER(26) NOT NULL, c_address VARCHAR(41), c_nationkey INTEGER, c_phone CHARACTER(16), c_acctbal DECIMAL(13,2), c_mktsegment CHARACTER(21), c_comment VARCHAR(127)) PRIMARY INDEX(c_custkey); CREATE TABLE orders ( o_orderkey INTEGER NOT NULL, o_date DATE FORMAT ‘yyyy-mm-dd’, o_status CHARACTER(1), o_custkey INTEGER, o_totalprice DECIMAL(13,2), o_orderpriority CHARACTER(21), o_clerk CHARACTER(16), o_shippriority INTEGER, o_comment VARCHAR(79)) UNIQUE PRIMARY INDEX(o_orderkey);
Example Query Request
Consider the following SELECT request against these tables.
SELECT o_custkey, c_name, o_status, o_date, o_comment FROM orders, customer WHERE o_custkey=c_custkey;
The next few topics examine the query plan for this SELECT request: first without and then with a join index.
Query Plan: No Join Index Defined
Without a defined join index, the execution plan for this query would typically redistribute the orders table into a spool, sort the spool on o_custkey, and then perform a merge join between the spool and the customer table.
Query Plan: Join Index Defined
Now consider the execution plan for this same query when the following join index has been defined:
CREATE JOIN INDEX OrdCustIdx AS SELECT (o_custkey, c_name), (o_status, o_date, o_comment) FROM orders, customer WHERE o_custkey=c_custkey;
With this join index defined, the execution plan for the query specifies a simple scan of the join index without accessing any of the underlying base tables and without having to join them on the predicate WHERE o_custkey = c_custkey.
In the join index defined for this example, (o_custkey, c_name) is the specified fixed part of the index and (o_status, o_date, o_comment) is the repeated portion. Therefore, assume the following specimen base table entries (where the ? character indicates a null).
CustKey | Name | Address |
---|---|---|
100 | Robert | San Diego |
101 | Ann | Palo Alto |
102 | Don | El Segundo |
OrderKey | Date | Status | CustKey | Comment |
---|---|---|---|---|
5000 | 2004-10-01 | S | 102 | rush order |
5001 | 2004-10-01 | S | 100 | big order |
5002 | 2004-10-03 | D | 102 | delayed |
5003 | 2004-10-05 | U | ? | unknown customer |
5004 | 2004-10-05 | S | 100 | credit |
You cannot collect statistics on a complex expression from a base table. If your applications frequently run queries that specify complex expressions in their predicates, you should consider creating a single-table join index that specifies a matching complex expression in its select list or column list, respectively. When Vantage creates the index, it transforms the complex expression into a simple index column on which you can collect statistics.
If the complex expression specified by the index is a term that matches a predicate condition for a query made against the base table the index is defined on, statistics collected on the index expression can be mapped to the base table so the Optimizer can use them to make more accurate single-table cardinality estimates.
Materialized Join Index
The materialized logical join index rows are the following:
Fixed Part | Repeated Part | |||
---|---|---|---|---|
CustKey | Name | Status | Date | Comment |
100 | Robert | S | 2004-10-01 | big order |
S | 2004-10-05 | credit | ||
101 | Ann | P | 2004-10-08 | discount |
102 | Don | S | 2004-10-01 | rush order |
D | 2004-10-03 | delayed |
Note that the information for the null customer is not included in this join index because it was defined using an inner join.