15.00 - Using CASE Expressions to Optimize Query Design - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)
Last Update
2018-09-24

Using CASE Expressions to Optimize Query Design

Effects on Performance

The CASE expression can provide performance improvements:

  • For multiple aggregates filtering distinct ranges of values. For example, total sales for several time periods.
  • When creating two-dimensional reports directly from Teradata Database. For example, balances in individual accounts held by all bank customers.
  • CASE expressions help increase performance by returning multiple results in a single pass over the data, instead of needing to make multiple passes over the data and then use the client application to combine the results into a single report.

    You can see performance improvements using the CASE expression as the following increase:

  • Number of queries against the same source table(s)
  • Volume of data in the source table
  • Valued and Searched CASE Expression

    Use one of the following CASE expression forms to return alternate values based on search conditions.

     

    This form…

    Tests…

    Example

    Valued

    an expression against possible values.

    Create a catalog entitled “Autumn Sale” that shows spring items marked 33% off and summer items marked 25% off.

    SELECT item_number, item_description,
    item_price AS 'Current//Price'
    , CASE item_season
    WHEN 'spring' THEN item_price *(1-.33)
    WHEN 'summer' THEN item_price *(1-.25)
    ELSE NULL
    END AS 'Sale//Price'
    FROM inventory_table;

    Searched

    arbitrary expression(s).

    Repeat the query above, and mark down by 50% summer items with inventories of less than three.

    SELECT item_number, item_description, item_price AS 'Current//Price'
    ,CASE
     WHEN item_season = 'summer' AND item_count < 3
    THEN item_price *(1-.50)
     WHEN item_season = 'summer' AND item_count >= 3
    THEN item_price *(1-.25)
    WHEN item_season = 'spring'
    THEN item_price *(1-.33)
    ELSE NULL
    END AS 'Sale//Price'
    FROM inventory_table
    WHERE item_season IN('spring' OR 'summer');

    The following examples illustrate simple code substitution, virtual denormalization, and single pass examples that use the CASE expression.

    Example : Simple Code Substitution

    For example, instead of joining to a description table, use the CASE expression when the WHERE clause will not contain the case values, that is, the WHERE clause in the example below will not contain region_number.

    It is important to not use CASE expressions in a view, when the queries that access the view will use the values in a case in the WHERE clause. When this occurs, the Optimizer is unable to evaluate any statistics and a less than optimum query plan could be executed.

       SELECT CASE region_number
       WHEN 1 THEN 'North'
       WHEN 2 THEN 'South'
       WHEN 3 THEN 'East'
             ELSE 'West' END
       ,SUM(sales)
       FROM sales_table
       GROUP BY 1;

    Example : Virtual Denormalization

    ABC Telephone Company has a History table with n columns, plus call minutes and call type:

  • 1 - daytime call
  • 2 - night-time call
  • 3 - weekend call
  • You want a summary of call minutes for each call type for each area code on a single line of output.

    The standard solution is:

    1 Do a GROUP BY on call_type and area code in the History table.

    2 Do a self-join to get call_types 1 and 2 into the same row.

    3 Do another self-join to get call_type 3 into the same row that contains all three call types.

    In the classic denormalization solution, you would physically denormalize the History table by putting all three call types in the same row. However, a denormalized table requires more maintenance.

    Instead, you can use the CASE expression to perform a virtual denormalization of the History table:

    CREATE View DNV
           AS SELECT Col1, ... , Coln
           ,CASE WHEN call_type = 1
             THEN call_minutes END (NAMED Daytime_Minutes)
           ,CASE WHEN call_type = 2
             THEN call_minutes END (NAMED Nighttime_Minutes) 
           ,CASE WHEN call_type = 3
             THEN call_minutes END (NAMED Weekend_Minutes) 
    FROM history;

    Example : Single Pass

    In this example, you want a report with five sales columns side by side:

  • Current Year, Year to Date (Ytd)
  • Current Year, Month to Date (Mtd)
  • Last Year, Year to Date (LyYtd)
  • Last Year, Month to Date (LyMtd)
  • Last Year, Current Month (LyCm)
  • You currently execute five separate SQL statements and combine the results in an application program.

    SELECT SUM(sales) ... WHERE sales_date BETWEEN 060101 AND date; [Ytd]
    SELECT SUM(sales) ... WHERE sales_date BETWEEN 061001 AND date; [Mtd]
    SELECT SUM(sales) ... WHERE sales_date BETWEEN 050101 AND ADD_MONTHS (date, -12); [LyYtd]
    SELECT SUM(sales) ... WHERE sales_date BETWEEN 051001 AND ADD_MONTHS (date, -12); [LyMtd]
    SELECT SUM(sales) ... WHERE sales_date BETWEEN 051001 AND 051031; [LyCm]

    Instead, you can use the CASE expression to execute one SQL statement that only makes one pass on the Sales_History table.

    SELECT ... 
          SUM(CASE WHEN sales_date BETWEEN 060101 AND date THEN sales ELSE 0 END),  [Ytd]
          SUM(CASE WHEN sales_date BETWEEN 061001 AND date THEN sales ELSE 0 END),  [Mtd]
          SUM(CASE WHEN sales_date BETWEEN 050101 AND ADD_MONTHS (date, -12) THEN sales ELSE 0 END),[LyYtd]
          SUM(CASE WHEN sales_date BETWEEN 051001 AND ADD_MONTHS (date, -12) THEN sales ELSE 0 END),[LyMtd]
          SUM(CASE WHEN sales_date BETWEEN 051001 AND 051031 THEN sales ELSE 0 END), [LyCm]
    FROM ...  
    WHERE sales_date BETWEEN 050101 AND date ...