Data in Partitioning Column of Window Specification and Resource Impact - Analytics Database - Teradata Vantage

SQL Functions, Expressions, and Predicates

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-01-12
dita:mapPath
obm1628111499646.ditamap
dita:ditavalPath
qkf1628213546010.ditaval
dita:id
kby1472250656485
lifecycle
latest
Product Category
Teradata Vantageā„¢

The columns specified in the PARTITION BY clause of a window specification determine the partitions over which the ordered analytical function executes. For example, the following query specifies the StoreID column in the PARTITION BY clause to compute the group sales sum for each store:

SELECT StoreID, SMonth, ProdID, Sales, 
SUM(Sales) OVER (PARTITION BY StoreID)
FROM sales_tbl;

At execution time, Vantage moves all of the rows that fall into a partition to the same AMP. If a very large number of rows fall into the same partition, the AMP can run out of spool space. For example, if the sales_tbl table in the preceding query has millions or billions of rows, and the StoreID column contains only a few distinct values, an enormous number of rows are going to fall into the same partition, potentially resulting in out-of-spool errors.

To avoid this problem, examine the data in the columns of the PARTITION BY clause. If necessary, rewrite the query to include additional columns in the PARTITION BY clause to create smaller partitions that Vantage can distribute more evenly among the AMPs. For example, the preceding query can be rewritten to compute the group sales sum for each store for each month:

SELECT StoreID, SMonth, ProdID, Sales, 
SUM(Sales) OVER (PARTITION BY StoreID, SMonth)
FROM sales_tbl;