Queries with IN and NOT IN predicates can include large numbers of constant values. (In both cases, the list of values is called an IN-list.) Big data applications can produce very large lists (with up to 100,000 or more items) as part of analytics queries. This feature allows the Optimizer to convert large IN-lists into spools during the query rewrite phase of query processing, which can then participate in the join planning with other relations in the query.
Benefits
- By eliminating optimizer processing related to these long-list predicates, parsing time is generally shortened, and performance is improved for queries with these lists.
- The system capacity for handling large IN-lists has been increased.
- New DBS Control Performance fields (InListRewriteThreshold, MaxNumInListRewrite, and InListRewriteOption) allow you to tune how and when IN-list rewrites happen.
Considerations
- In rare cases, IN-list rewriting may not be appropriate. For example:
- IN-list with consecutive values might be simplified by Predicate Simplification to single range: x in ( 1,2,3,…., 400000) => x between 1 and 400000
- A large query might be unsatisfiable if the IN-list is conjuncted with another predicate, such as
x>10000 and x in (1, 3, 5,…, 9001)
Additional Information
- For more information on IN-list processing, see Teradata Vantage™ SQL Request and Transaction Processing, B035-1142.
- For more information on DBS Control fields, see Teradata Vantage™ - Database Utilities , B035-1102 .