This section provides a start-to-finish example of the Index Wizard steps described in the previous topics. Although sophisticated users can manually code and execute the SQL requests shown in this example, the majority of users are likely to use the Index Wizard client interface.
Note that if you were doing this example using the Index Wizard utility, it would call the identical SQL requests that are shown here. The only difference is the GUI interface presented by the client utility instead of the SQL Assistant or BTEQ interface you would use to submit those SQL requests.
This example assumes that you have created the following 2 tables:
CREATE TABLE sales_history (
product_code CHARACTER(8),
store_number INTEGER,
transaction_date DATE,
quantity_sold INTEGER,
other_columns CHARACTER(50))
PRIMARY INDEX (product_code, store_number, transaction_date);
CREATE TABLE products
product_code CHARACTER(8),
description VARCHAR(50)
product_category CHARACTER(10),
price DECIMAL(10,2),
quantity_avail INTEGER)
PRIMARY INDEX (product_code);
Recall that this is a multistage process consisting of the following high level steps that must be performed in the order indicated. Apart from stage 8, you should perform this analysis on a test system, not your production system.
1 Capture the query information
2 Create the workload.
3 Associate the workload requests with the workload identifier.
4 Analyze the workload.
This is a two-step process. The two stages can be performed in any order.
5 View the recommendations made by the index and partitioning expression analyses.
6 View the estimated improvement in query costs that can be accrued by implementing the index and partitioning expression recommendations.
7 Validate the recommendations.
This is an six-stage process consisting of the following stages:
a Turn on validation mode for the session.
b Simulate creating the recommended indexes and partitioning expression.
c Simulate collecting the recommended statistics on the recommended indexes and partitioning columns.
d Capture the new query plans for the queries in the workload.
e Turn off validation mode for the session.
f Analyze the steps in the captured plans for cost reductions and increased partition elimination.
8 Execute the validated recommendations.
The next 8 topics provide concrete examples of these stages of the Index Wizard process.
In the first stage of the process, you use INSERT EXPLAIN requests to capture workload information into your QCD database.
For this example, you first capture data for the query that gives year-to-date sales quantities for each product. Because you specify the WITH STATISTICS option, the system automatically collects statistics on the transaction_date column, which is one of the three columns making up the composite NUPI for table sales_history.
INSERT EXPLAIN WITH STATISTICS INTO qcd AS q1
SELECT product_code, SUM(quantity_sold)
FROM sales_history
WHERE transaction_date >= DATE '2006-01-01'
GROUP BY product_code;
Next, you capture data for the query that returns various summarized quantities for a particular value of the product_category column for table products. Again, because you specify the WITH STATISTICS option, the system automatically collects statistics on the product_category column, which is not an index column for the table.
INSERT EXPLAIN WITH STATISTICS INTO qcd AS q2
SELECT SUM(quantity_avail), AVG(quantity_avail),
MIN(quantity_avail)
FROM products
WHERE product_category = ‘Laptop’;
In the second stage of the process, you use the Index Wizard-supplied macro QCD.addWorkload to create a new workload for your queries.
EXEC qcd.addWorkload(‘MyWorkload’);
In the third stage of the process, you associate the two queries in the workload with the newly created workload identifier MyWorkload.
INSERT INTO qcd.workloadqueries
SELECT workload.workloadid, query.queryid, 1
FROM qcd.workload, qcd.query
WHERE workload.workloadname = ‘MyWorkload’
AND query.queryname IN ( ‘q1’, ‘q2’);
The fourth stage of the process consists of two independent stages: index analysis and partitioning expression analysis.
Arbitrarily, you begin by performing an index analysis on the workload looking for NUSIs, as indicated by the WITH INDEX TYPE option specification of index type 4, which represents NUSIs.
INITIATE INDEX ANALYSIS
FOR MyWorkload IN qcd AS MyIndexes
WITH INDEX TYPE 4;
Equally arbitrarily, you then perform a partitioning expression analysis on the workload.
INITIATE PARTITION ANALYSIS
FOR MyWorkload IN qcd AS MyPPIs;
You could just as easily have done the partitioning expression analysis first and the index analysis second.
In the fifth stage of the process, you first view the NUSIs recommended by the index analysis:
SELECT queryname, indexddl
FROM qcd.indexrecommendations AS ir,
qcd.workload AS wl,
qcd.query AS q
WHERE ir.workloadid = wl.workloadid
AND ir.queryid = q.queryid
AND workloadname = 'MyWorkload'
AND ir.indexnametag = 'MyIndexes';
Assume the following output from this query:
queryname indexddl
--------- --------
Q2 CREATE INDEX (product_category) ON Products
Next you view the partitioning expression recommended by the partition analysis.
SELECT queryname, expressiontext
FROM qcd.partitionrecommendations AS pr,
qcd.workload AS wl,
qcd.query AS q
WHERE pr.workloadid = wl.workloadid
AND pr.queryid = q.queryid
AND workloadname = 'MyWorkload'
AND pr.resultnametag = 'MyPPIs';
Assume the following output from this query. The formatting of the output is arbitrary and might not correspond with the actual output generated by your system.
queryname expressiontext
--------- --------------
Q1 PARTITION BY RANGE_N(transaction_date
BETWEEN DATE '2000-01-01'
AND DATE '2006-06-30'
EACH INTERVAL '7' DAY);
In the sixth stage of the process, you view the estimated query cost improvements when using the recommended partitioning expression for this workload.
SELECT queryname, originalcost, newcost
FROM qcd.partitionrecommendations AS pr,
qcd.workload AS wl,
qcd.query AS q
WHERE pr.workloadid = wl.workloadid
AND pr.queryid = q.queryid
AND workloadname = 'MyWorkload'
AND pr.resultnametag = 'MyPPIs';
Assume the following output from this query:
queryname originalcost newcost
--------- ------------ ------------
Q1 4.530000E 004 2.580000E 002
In the seventh stage of the process, you validate the recommendations made by the analysis.
You can optionally validate the recommendations on your production system.
DIAGNOSTIC “VALIDATE INDEX” ON FOR SESSION:
CREATE INDEX (product_category) ON Products;
ALTER TABLE sales_history
MODIFY PRIMARY INDEX PARTITION BY RANGE_N(transaction_date
BETWEEN DATE '2000-01-01'
AND DATE '2006-06-30'
EACH INTERVAL '7' DAY);
Note that this ALTER TABLE request is not valid except when you are in Validation mode.
COLLECT STATISTICS FOR SAMPLE 100 PERCENT
INTO qcd ON sales_history COLUMN(transaction_date);
The system collects statistics on the underlying column data.
COLLECT STATISTICS ON sales_history COLUMN(transaction_date);
INSERT EXPLAIN INTO qcd AS q1_validate
SELECT product_code, SUM(quantity_sold)
FROM sales_history
WHERE transaction_date >= DATE '2006-01-01'
GROUP BY product_code;
INSERT EXPLAIN INTO qcd AS q2_validate
SELECT SUM(quantity_avail), AVG(quantity_avail),
MIN(quantity_avail)
FROM products
WHERE product_category = ‘Laptop’
DIAGNOSTIC “VALIDATE INDEX” NOT ON FOR SESSION;
SELECT steptext
FROM qcd.querystepsview
WHERE QueryId =
(SELECT MAX(QueryId)
FROM qcd.query
WHERE queryname = ‘q1_validate’)
ORDER BY stepnum;
Assume the following partial output from this query:
steptext
--------
3) We do an all-AMPs SUM step to aggregate from 26 partitions of “DFES.SalesHistory with a condition of ("DFES.SalesHistory.transaction_date >= DATE '2006-01-01’)”
You can perform a similar query for Q2 as well. That query is omitted from this example because it is essentially redundant.
SELECT stepnum, stepkind, cost
FROM qcd.querystepsview
WHERE cost > 0
AND QueryId =
(SELECT MAX(QueryId)
FROM qcd.query
WHERE queryname = ‘q1_validate’)
ORDER BY stepnum;
Retrieve the recommended SQL DDL to recreate the table with the recommended partitioning expression.
SELECT recreatetext
FROM qcd.partitionrecommendations AS pr,
qcd.workload AS wl,
qcd.query AS q
WHERE pr.workloadid = wl.workloadid
AND pr.queryid = q.queryid
AND workloadname = 'MyWorkload'
AND pr.resultnametag = 'MyPPIs';
Assume this request returns the following result. The formatting of this output is arbitrary and might not correspond with the actual output generated by your system.
recreatetext
------------
CREATE TABLE sales_history AS (
SELECT *
FROM sales_history)
WITH NO DATA;
ALTER TABLE sales_history
MODIFY PRIMARY INDEX PARTITION BY
RANGE_N(transaction_date
BETWEEN DATE '1997-01-01'
AND DATE '2006-12-31'
EACH INTERVAL '1' DAY);
INSERT INTO PPI_sales_history
SELECT *
FROM sales_history;
DROP TABLE sales_history;
RENAME TABLE PPI_sales_history TO sales_history;