Heuristics for Determining a Reasonable Answer Set
Introduction
Before you can judge the accuracy of the results returned from an outer join query, you need to estimate a reasonable guess about the likelihood that an answer set is correct. This section provides two simple methods for obtaining the approximate numbers of rows you should expect from a correctly structured outer join query.
Single Table Cardinality Estimate
This query is a single table SELECT that provides the cardinality of the customer table as restricted by the conditions used for the example outer join queries.
From the report returned by this query, we know how many rows should be returned by the outer join.
SELECT c.cust_num
FROM sampdb.customer c
WHERE c.district = 'K'
AND (c.service_type = 'ABC'
OR c.service_type = 'XYZ')
ORDER BY 1;
This query returns 18,034 rows.
Inner Join Cardinality Estimate
This query is an inner join that helps to explain the example outer join queries and their results. The query starts with the same customer rows found in “Single Table Cardinality Estimate” on page 284, but then matches them with revenue rows for the month of July, 1997.
SELECT c.custnum, b.monthly_revenue
FROM sampdb.customer AS c, sampdb2.revenue AS b
WHERE c.custnum = b.custnum
AND c.district = 'K'
AND b.data_year_month = 199707
AND (c.service_type = 'ABC'
OR c.service_type = 'XYZ')
ORDER BY 1;
The query returns 13,010 rows.
Note that all customer table rows are matched by a monthly_revenue table row.