Collecting Statistics on a Single-Table Join Index
As is true for hash indexes, you should collect the statistics on base table columns rather than on single-table join index columns for most applications.
An important exception to this guideline is the case where a single‑table join index is defined on a complex expression that is also specified as a predicate expression on the relevant base table column. Collecting statistics on the single‑table join index for those expressions enhances the ability of the Optimizer to estimate single‑table cardinalities for a query that specifies the base table expression in its predicate. See “Collecting Statistics on Join Indexes” on page 375 and SQL Request and Transaction Processing.
Note: If the join index is sparse (see “Sparse Join Indexes” on page 371), you should collect statistics on the join index itself rather than on its underlying base table.
See “Collecting Statistics on Base Table Columns Instead of Single-Table Join Index Columns” on page 377 for further information.
You can collect and drop statistics on the primary index columns of a single‑table join index using an INDEX clause to specify the column set.
For example, suppose a join index has the following definition.
CREATE JOIN INDEX OrdJIdx8 AS
SELECT o_custkey, o_orderdate
PRIMARY INDEX (o_custkey, o_orderdate)
ORDER BY (o_orderdate);
Then you can collect statistics on the index columns as a single object as shown in the following example:
COLLECT STATISTICS ON OrdJIdx8 INDEX (o_custkey, o_orderdate);
Note that you can only use columns specified in a multicolumn PRIMARY INDEX clause definition if you specify the keyword INDEX in your COLLECT STATISTICS request.
A poorer choice would be to collect statistics using the column clause syntax. To do this, you must perform two separate statements:
COLLECT STATISTICS ON OrdJIdx8 COLUMN (o_custkey);
COLLECT STATISTICS ON OrdJIdx8 COLUMN (o_orderdate);
It is always better to collect the statistics for multicolumn indexes on the index itself rather than individually on its component columns because its selectivity is much better when you collect statistics on the index columns as a set. Note that you can collect statistics on multiple unindexed columns as well. See “Collecting Statistics on Multiple Columns” on page 196.
For example, a query with a condition like
WHERE x=1 AND y=2 is better optimized if
statistics are collected on INDEX (x,y) than if they
are collected individually on column x and column y.
The same restrictions hold for the DROP STATISTICS statement.
See “Comparison of Hash and Single-Table Join Indexes” on page 323 for a list of the similarities between single‑table join indexes and hash indexes.