Cumulative Distribution - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 2 ADS Generation

Product: Teradata Warehouse Miner
Release Number: 5.4.4
Published: July 2017
Language: English (United States)
Last Update: 2018-05-03
ID: B035-2301
Product Category: Software

Given a sort expression list, this Ordered Analytical function derives a new column indicating the cumulative distribution of a value over a set of rows when sorted by the specified sort expression list. The returned value is > 0 and ≤ 1. Cumulative Distribution is similar to Percent Rank but handles tied values differently: every row in a set of ties receives the ratio of the position of the last tied value in that set to the number of rows. When one or more Partition Columns are specified, the cumulative distribution is determined separately over the rows in each partition (the calculation is reset for each new partition).
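The calculation described above can be illustrated outside the product with a minimal Python sketch. It is not the SQL that Teradata Warehouse Miner generates; the function name `cumulative_distribution`, the `sort_key` and `partition_key` parameters, and the sample `region`/`sales` data are all hypothetical, and an ascending sort is assumed.

```python
from collections import defaultdict

def cumulative_distribution(rows, sort_key, partition_key=None):
    """Return one cumulative-distribution value per row, in input order.

    Hypothetical helper for illustration; assumes an ascending sort.
    """
    # Group row indices by partition (a single partition when none is given).
    partitions = defaultdict(list)
    for i, row in enumerate(rows):
        part = partition_key(row) if partition_key else None
        partitions[part].append(i)

    result = [None] * len(rows)
    for indices in partitions.values():
        ordered = sorted(indices, key=lambda i: sort_key(rows[i]))
        n = len(ordered)
        # Walk from the last position back to the first so that every row in
        # a set of ties takes the position of the last tied row.
        last_pos = n
        for pos in range(n, 0, -1):
            i = ordered[pos - 1]
            if pos < n and sort_key(rows[i]) != sort_key(rows[ordered[pos]]):
                last_pos = pos
            result[i] = last_pos / n
    return result

# Made-up sample data: the calculation restarts for each region.
rows = [
    {"region": "east", "sales": 10},
    {"region": "east", "sales": 20},
    {"region": "east", "sales": 20},
    {"region": "east", "sales": 30},
    {"region": "west", "sales": 5},
    {"region": "west", "sales": 7},
]
values = cumulative_distribution(
    rows,
    sort_key=lambda r: r["sales"],
    partition_key=lambda r: r["region"],
)
print(values)
```

In the sample, the two tied `east` rows with sales of 20 occupy positions 2 and 3 of 4, so both receive 3/4. Every value falls in the range > 0 and ≤ 1, and the `west` rows are computed independently of the `east` rows.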

Rows options are not available with the Cumulative Distribution function.

Teradata Warehouse Miner enhances the Cumulative Distribution function with an option to exclude from the ranking process any row that has a NULL value in any element of the sort expression list.
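The effect of excluding NULL values can be sketched in Python as follows. This is an illustration only, under two assumptions that the text above does not confirm: an excluded row receives no value (`None` here), and excluded rows do not count toward the number of rows. The function name `cume_dist_excluding_nulls` is hypothetical, and `None` stands in for NULL.

```python
def cume_dist_excluding_nulls(sort_values):
    """sort_values holds one tuple of sort-expression values per row.

    Hypothetical helper: rows with None (standing in for NULL) in any
    element are left out of the ranking and get None as their result.
    """
    # Keep only rows whose sort expressions are all non-NULL.
    kept = [
        i for i, key in enumerate(sort_values)
        if all(v is not None for v in key)
    ]
    n = len(kept)
    result = [None] * len(sort_values)
    for i in kept:
        # The number of kept rows sorting at or before this row equals the
        # position of the last row in its set of ties.
        at_or_before = sum(1 for j in kept if sort_values[j] <= sort_values[i])
        result[i] = at_or_before / n
    return result

# Made-up sample: the second row has a NULL sort value and is excluded.
sample = [(10,), (None,), (20,), (20,)]
excluded_result = cume_dist_excluding_nulls(sample)
print(excluded_result)
```

With the second row excluded, the remaining three rows are ranked among themselves: the row with 10 receives 1/3 and the two tied rows with 20 each receive 3/3.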

When dragging a Cumulative Distribution function into a variable, the following tree element is created.

Variable Creation > Input > Variables: SQL Elements pane – Ordered Analytical Functions > Cumulative Distribution

Sort expressions can be built up in the Sort Expressions folder, and partition columns in the Partition Columns folder. The option to exclude rows with a NULL value in any element of the sort expression list from the ranking process is set through the Properties panel. Double-click on Cumulative Distribution, or highlight it and click Properties, to set the properties for this function.

Variable Creation > Input > Variables: SQL Elements pane – Ordered Analytical Functions > Cumulative Distribution Properties

The default is to Include NULL values in the analysis, but that can be disabled here.