Sparse Join Indexes and Query Optimization
When a user query is entered, the Optimizer determines if accessing j1 gives the correct answer and is more efficient than accessing the base tables. This sparse join index would be selected by the Optimizer only for queries that restricted themselves to data from the year 2000. For example, a query might require data from June of 2000. Because the join index j1 contains all of the data for year 2000, it might be used to satisfy the following query:
SELECT store_id, dept_id, SUM(sales_dollars) AS sum_sd
FROM sales
WHERE sales_date >= 6/1/2000
AND sales_date < 7/1/2000
GROUP BY storeid, dept_id;
As another example, the following DDL creates a join index containing only those customers living in four western states of the US:
CREATE JOIN INDEX westcust AS
SELECT cust.id, cust.address, donations.amount
FROM cust, donations
WHERE cust.cust_id = donations.cust_id
AND donations.d_date > '2003/01/01'
AND cust.state IN ('CA', 'OR', 'WA','NV');
The Optimizer can use this join index for the following query written to find all donors from California with donations of 1,000 USD or more made since January 1, 2003.
SELECT custid, cust.address
FROM cust, donations
WHERE donations.amount > 1000
AND cust.state = 'CA'
AND donations.d_date > '2003/01/01';
Because the customers and donations considered by the query are part of the subset included in the join index, the Optimizer uses it to answer the query. This can save a great deal of time especially in situations where the base tables are very large, but queries typically look only at subsets of the tables.
Maintenance of sparse join indexes can be much faster than maintenance for other join indexes because the sparse types have fewer values, and so are generally updated much less frequently.
See “Sparse Join Indexes and Tactical Queries” on page 817 for information about design issues specific to sparse join index support for tactical queries.