Using CASE Expressions to Optimize Query Design
Effects on Performance
The CASE expression can provide performance improvements:
CASE expressions help increase performance by returning multiple results in a single pass over the data, instead of needing to make multiple passes over the data and then use the client application to combine the results into a single report.
You can see performance improvements using the CASE expression as the following increase:
Valued and Searched CASE Expression
Use one of the following CASE expression forms to return alternate values based on search conditions.
This form… |
Tests… |
Example |
Valued |
an expression against possible values. |
Create a catalog entitled “Autumn Sale” that shows spring items marked 33% off and summer items marked 25% off. SELECT item_number, item_description,
|
Searched |
arbitrary expression(s). |
Repeat the query above, and mark down by 50% summer items with inventories of less than three. SELECT item_number, item_description, item_price AS 'Current//Price'
|
The following examples illustrate simple code substitution, virtual denormalization, and single pass examples that use the CASE expression.
Example : Simple Code Substitution
For example, instead of joining to a description table, use the CASE expression when the WHERE clause will not contain the case values, that is, the WHERE clause in the example below will not contain region_number.
It is important to not use CASE expressions in a view, when the queries that access the view will use the values in a case in the WHERE clause. When this occurs, the Optimizer is unable to evaluate any statistics and a less than optimum query plan could be executed.
SELECT CASE region_number
WHEN 1 THEN 'North'
WHEN 2 THEN 'South'
WHEN 3 THEN 'East'
ELSE 'West' END
,SUM(sales)
FROM sales_table
GROUP BY 1;
Example : Virtual Denormalization
ABC Telephone Company has a History table with n columns, plus call minutes and call type:
You want a summary of call minutes for each call type for each area code on a single line of output.
The standard solution is:
1 Do a GROUP BY on call_type and area code in the History table.
2 Do a self-join to get call_types 1 and 2 into the same row.
3 Do another self-join to get call_type 3 into the same row that contains all three call types.
In the classic denormalization solution, you would physically denormalize the History table by putting all three call types in the same row. However, a denormalized table requires more maintenance.
Instead, you can use the CASE expression to perform a virtual denormalization of the History table:
CREATE View DNV
AS SELECT Col1, ... , Coln
,CASE WHEN call_type = 1
THEN call_minutes END (NAMED Daytime_Minutes)
,CASE WHEN call_type = 2
THEN call_minutes END (NAMED Nighttime_Minutes)
,CASE WHEN call_type = 3
THEN call_minutes END (NAMED Weekend_Minutes)
FROM history;
Example : Single Pass
In this example, you want a report with five sales columns side by side:
You currently execute five separate SQL statements and combine the results in an application program.
SELECT SUM(sales) ... WHERE sales_date BETWEEN 060101 AND date; [Ytd]
SELECT SUM(sales) ... WHERE sales_date BETWEEN 061001 AND date; [Mtd]
SELECT SUM(sales) ... WHERE sales_date BETWEEN 050101 AND ADD_MONTHS (date, -12); [LyYtd]
SELECT SUM(sales) ... WHERE sales_date BETWEEN 051001 AND ADD_MONTHS (date, -12); [LyMtd]
SELECT SUM(sales) ... WHERE sales_date BETWEEN 051001 AND 051031; [LyCm]
Instead, you can use the CASE expression to execute one SQL statement that only makes one pass on the Sales_History table.
SELECT ...
SUM(CASE WHEN sales_date BETWEEN 060101 AND date THEN sales ELSE 0 END), [Ytd]
SUM(CASE WHEN sales_date BETWEEN 061001 AND date THEN sales ELSE 0 END), [Mtd]
SUM(CASE WHEN sales_date BETWEEN 050101 AND ADD_MONTHS (date, -12) THEN sales ELSE 0 END),[LyYtd]
SUM(CASE WHEN sales_date BETWEEN 051001 AND ADD_MONTHS (date, -12) THEN sales ELSE 0 END),[LyMtd]
SUM(CASE WHEN sales_date BETWEEN 051001 AND 051031 THEN sales ELSE 0 END), [LyCm]
FROM ...
WHERE sales_date BETWEEN 050101 AND date ...