16.10 - COUNT DISTINCT Performance Improvement - Teradata Database

Teradata Database Release Summary

prodname
Teradata Database
vrm_release
16.10
created_date
June 2017
category
Release Notes
featnum
B035-1098-161K

Teradata Database now processes single or multiple instances of COUNT DISTINCT, common in tactical queries, more efficiently. Release 16.00 processes multiple count distinct operations in fewer steps. EXPLAIN text has been modified to differentiate aggregate steps that combine aggregate and DISTINCT operations by indicating the number of distinct and non-distinct aggregates that would be processed by the SUM step.

MINIMUM DISTINCT and MAXIMUM DISTINCT operations are processed as a MINIMUM DISTINCT or a MAXIMUM DISTINCT non-distinct aggregate. This methodology also enables query processing with a reduced number of steps, and when specified with COUNT DISTINCT, these operations are processed by using the multi-distinct aggregate SUM as a non-distinct aggregate.

Benefits

The primary benefit of this feature is improved performance of queries that use COUNT DISTINCT syntax.

Considerations

The processing of the COUNT DISTINCT SUM step aggregate cache is now more sensitive to optimizer estimations, so we strongly recommend that you take the following actions:
  • Collect statistics on the anticipated GROUP BY values.
  • If possible, also collect statistics on the distinct expressions.
Collecting these statistics helps the Optimizer to plan the execution of the COUNT DISTINCT aggregation step with the correct set of resource allocations for optimal performance.

Additional Information

For more information on performance improvements for COUNT DISTINCT operations, see Teradata® Database SQL Request and Transaction Processing, B035-1142.