15.10 - Using the Index Wizard: Extended Example - Teradata Database

Teradata Database SQL Request and Transaction Processing

prodname
Teradata Database
vrm_release
15.10
category
Programming Reference
User Guide
featnum
B035-1142-151K

This section provides a start-to-finish example of the Index Wizard steps described in the previous topics. Although sophisticated users can manually code and execute the SQL requests shown in this example, the majority of users are likely to use the Index Wizard client interface.

Note that if you were doing this example using the Index Wizard utility, it would call the identical SQL requests that are shown here. The only difference is the GUI interface presented by the client utility instead of the SQL Assistant or BTEQ interface you would use to submit those SQL requests.

This example assumes that you have created the following 2 tables:

  CREATE TABLE sales_history (
    product_code CHARACTER(8),
    store_number INTEGER,
    transaction_date DATE,
    quantity_sold INTEGER,
    other_columns CHARACTER(50))
  PRIMARY INDEX (product_code, store_number, transaction_date);
  CREATE TABLE products
    product_code CHARACTER(8),
    description VARCHAR(50)
    product_category CHARACTER(10),
    price DECIMAL(10,2),
    quantity_avail INTEGER)
PRIMARY INDEX (product_code);

Recall that this is a multistage process consisting of the following high level steps that must be performed in the order indicated. Apart from stage 8, you should perform this analysis on a test system, not your production system.

1 Capture the query information

2 Create the workload.

3 Associate the workload requests with the workload identifier.

4 Analyze the workload.

This is a two-step process. The two stages can be performed in any order.

  • Do an index analysis on the workload.
  • Do a partitioning expression analysis on the workload.
  • 5 View the recommendations made by the index and partitioning expression analyses.

    6 View the estimated improvement in query costs that can be accrued by implementing the index and partitioning expression recommendations.

    7 Validate the recommendations.

    This is an six-stage process consisting of the following stages:

    a Turn on validation mode for the session.

    b Simulate creating the recommended indexes and partitioning expression.

    c Simulate collecting the recommended statistics on the recommended indexes and partitioning columns.

    d Capture the new query plans for the queries in the workload.

    e Turn off validation mode for the session.

    f Analyze the steps in the captured plans for cost reductions and increased partition elimination.

    8 Execute the validated recommendations.

    The next 8 topics provide concrete examples of these stages of the Index Wizard process.

    In the first stage of the process, you use INSERT EXPLAIN requests to capture workload information into your QCD database.

    For this example, you first capture data for the query that gives year-to-date sales quantities for each product. Because you specify the WITH STATISTICS option, the system automatically collects statistics on the transaction_date column, which is one of the three columns making up the composite NUPI for table sales_history.

      INSERT EXPLAIN WITH STATISTICS INTO qcd AS q1
        SELECT product_code, SUM(quantity_sold)
        FROM sales_history
        WHERE transaction_date >= DATE '2006-01-01'
        GROUP BY product_code;

    Next, you capture data for the query that returns various summarized quantities for a particular value of the product_category column for table products. Again, because you specify the WITH STATISTICS option, the system automatically collects statistics on the product_category column, which is not an index column for the table.

      INSERT EXPLAIN WITH STATISTICS INTO qcd AS q2
        SELECT SUM(quantity_avail), AVG(quantity_avail),
               MIN(quantity_avail)
        FROM products
        WHERE product_category = ‘Laptop’;

    In the second stage of the process, you use the Index Wizard-supplied macro QCD.addWorkload to create a new workload for your queries.

    EXEC qcd.addWorkload(‘MyWorkload’);

    In the third stage of the process, you associate the two queries in the workload with the newly created workload identifier MyWorkload.

      INSERT INTO qcd.workloadqueries
      SELECT workload.workloadid, query.queryid, 1
      FROM qcd.workload, qcd.query
      WHERE workload.workloadname = ‘MyWorkload’
      AND query.queryname IN ( ‘q1’, ‘q2’);

    The fourth stage of the process consists of two independent stages: index analysis and partitioning expression analysis.

    Arbitrarily, you begin by performing an index analysis on the workload looking for NUSIs, as indicated by the WITH INDEX TYPE option specification of index type 4, which represents NUSIs.

      INITIATE INDEX ANALYSIS
      FOR MyWorkload IN qcd AS MyIndexes
      WITH INDEX TYPE 4;

    Equally arbitrarily, you then perform a partitioning expression analysis on the workload.

      INITIATE PARTITION ANALYSIS
      FOR MyWorkload IN qcd AS MyPPIs;

    You could just as easily have done the partitioning expression analysis first and the index analysis second.

    In the fifth stage of the process, you first view the NUSIs recommended by the index analysis:

      SELECT queryname, indexddl
      FROM qcd.indexrecommendations AS ir,
           qcd.workload AS wl,
           qcd.query AS q
      WHERE ir.workloadid = wl.workloadid
      AND ir.queryid = q.queryid
      AND workloadname = 'MyWorkload'
      AND ir.indexnametag = 'MyIndexes';

    Assume the following output from this query:

      queryname indexddl
    --------- --------
      Q2 CREATE INDEX (product_category) ON Products

    Next you view the partitioning expression recommended by the partition analysis.

      SELECT queryname, expressiontext
      FROM qcd.partitionrecommendations AS pr,
           qcd.workload AS wl,
           qcd.query AS q
      WHERE pr.workloadid = wl.workloadid
      AND pr.queryid = q.queryid
      AND workloadname = 'MyWorkload'
      AND pr.resultnametag = 'MyPPIs';

    Assume the following output from this query. The formatting of the output is arbitrary and might not correspond with the actual output generated by your system.

      queryname expressiontext
      --------- --------------
      Q1        PARTITION BY RANGE_N(transaction_date
                BETWEEN DATE '2000-01-01'
                AND DATE '2006-06-30'
                EACH INTERVAL '7' DAY);

    In the sixth stage of the process, you view the estimated query cost improvements when using the recommended partitioning expression for this workload.

      SELECT queryname, originalcost, newcost
      FROM qcd.partitionrecommendations AS pr,
           qcd.workload AS wl,
           qcd.query AS q
      WHERE pr.workloadid = wl.workloadid
      AND pr.queryid = q.queryid
      AND workloadname = 'MyWorkload'
      AND pr.resultnametag = 'MyPPIs';

    Assume the following output from this query:

      queryname   originalcost    newcost
      --------- ------------ ------------
      Q1          4.530000E 004   2.580000E 002

    In the seventh stage of the process, you validate the recommendations made by the analysis.

    You can optionally validate the recommendations on your production system.

  • Stage 7a: Turn on Validation mode for the current session:
  • DIAGNOSTIC “VALIDATE INDEX” ON FOR SESSION:
  • Stage 7b, part 1: Simulate creation of the recommended NUSI:
  • CREATE INDEX (product_category) ON Products;
  • Stage 7b, part 2: Simulate altering the table definition for the recommended row partitioning.
  • ALTER TABLE sales_history
    MODIFY PRIMARY INDEX PARTITION BY RANGE_N(transaction_date
                                      BETWEEN DATE '2000-01-01'
                                      AND DATE '2006-06-30'
                                      EACH INTERVAL '7' DAY);

    Note that this ALTER TABLE request is not valid except when you are in Validation mode.

  • Stage 7c, part 1: Simulate collecting statistics on the recommended NUSI:
  • COLLECT STATISTICS FOR SAMPLE 100 PERCENT
    INTO qcd ON sales_history COLUMN(transaction_date);

    The system collects statistics on the underlying column data.

  • Stage 7c, part 2: Simulate collecting statistics on the recommended partitioning column:
  • COLLECT STATISTICS ON sales_history COLUMN(transaction_date);
  • Stage 7d: Capture the new query plans for the queries in the workload:
  • INSERT EXPLAIN INTO qcd AS q1_validate
    SELECT product_code, SUM(quantity_sold)
    FROM sales_history
    WHERE transaction_date >= DATE '2006-01-01'
    GROUP BY product_code;
    INSERT EXPLAIN INTO qcd AS q2_validate
    SELECT SUM(quantity_avail), AVG(quantity_avail),
            MIN(quantity_avail)
    FROM products
    WHERE product_category = ‘Laptop’
  • Stage 7e: Turn off validation mode:
  • DIAGNOSTIC “VALIDATE INDEX” NOT ON FOR SESSION;
  • Stage 7f, part 1: Examine the steps in the captured plan for query Q1 to see if row partition elimination occurred. Note that you can perform a similar query for Q2 as well. That query is omitted from this example because it is essentially redundant.
  • SELECT steptext
    FROM qcd.querystepsview
    WHERE QueryId =
                   (SELECT MAX(QueryId)
                    FROM qcd.query
                    WHERE queryname = ‘q1_validate’)
    ORDER BY stepnum;

    Assume the following partial output from this query:

    steptext

    --------

    3) We do an all-AMPs SUM step to aggregate from 26 partitions of “DFES.SalesHistory with a condition of ("DFES.SalesHistory.transaction_date >= DATE '2006-01-01’)”

  • Stage 7f, part 2: Examine the step costs in the captured plan for Q1 to see if costs are reduced.
  • You can perform a similar query for Q2 as well. That query is omitted from this example because it is essentially redundant.

    SELECT stepnum, stepkind, cost
    FROM qcd.querystepsview
    WHERE cost > 0
    AND QueryId =
                 (SELECT MAX(QueryId)
                  FROM qcd.query
                  WHERE queryname = ‘q1_validate’)
    ORDER BY stepnum;

    Retrieve the recommended SQL DDL to recreate the table with the recommended partitioning expression.

    SELECT recreatetext
    FROM qcd.partitionrecommendations AS pr,
         qcd.workload AS wl,
         qcd.query AS q
    WHERE pr.workloadid = wl.workloadid
    AND pr.queryid = q.queryid
    AND workloadname = 'MyWorkload'
    AND pr.resultnametag = 'MyPPIs';

    Assume this request returns the following result. The formatting of this output is arbitrary and might not correspond with the actual output generated by your system.

    recreatetext
    ------------
     
      CREATE TABLE sales_history AS (
        SELECT *
        FROM sales_history)
      WITH NO DATA;
      ALTER TABLE sales_history
        MODIFY PRIMARY INDEX PARTITION BY
        RANGE_N(transaction_date
                 BETWEEN DATE '1997-01-01'
                 AND DATE '2006-12-31'
                 EACH INTERVAL '1' DAY);
      INSERT INTO PPI_sales_history
        SELECT *
        FROM sales_history;
      DROP TABLE sales_history;
      RENAME TABLE PPI_sales_history TO sales_history;