Histogram Syntax Elements - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™
OutputTable
Specify the name for the table that the function creates for the output.
AutoBin
[Optional] If you specify this syntax element, you must not specify CustomBinTable, CustomBinColumn, StartValue, BinSize, or EndValue.

Specify either the algorithm for selecting bin boundaries or the approximate number of bins to find. The number_of_bins must be a positive integer.

The Sturges algorithm for calculating bin width can be written as:

$w = r / (1 + \log_2 n)$

where w is the bin width, r is the range of the data values, and n is the number of elements in the data set. The Sturges algorithm performs best if the data is normally distributed and n is at least 30.
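
As a rough illustration of the Sturges formula, the following Python sketch computes w from a list of values. It is not the function's implementation; the helper name sturges_bin_width is hypothetical and exists only for this example.

```python
import math


def sturges_bin_width(values):
    """Return the Sturges bin width w = r / (1 + log2(n)).

    Hypothetical helper for illustration; not part of the Histogram function.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    r = max(values) - min(values)  # r: range of the data values
    return r / (1 + math.log2(n))


# 32 values from 0 to 31: r = 31 and log2(32) = 5, so w = 31 / 6.
sample = list(range(32))
width = sturges_bin_width(sample)
```

With n = 32 the denominator is exactly 6, which makes the arithmetic easy to check by hand.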

The Scott algorithm for calculating bin width can be written as:

$w = 3.49\,s / n^{1/3}$

where w is the bin width, s is the standard deviation of the data values and n is the number of elements in the data set. The number of bins is r/w, where r is the range of the data values. The Scott algorithm performs best on normally distributed data.
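
The Scott formula and the resulting bin count r/w can be sketched in Python as follows. This is an illustration under stated assumptions, not the function's implementation: the helper names are hypothetical, the sample standard deviation is assumed (the text does not say whether the sample or population form is used), and rounding r/w up to a whole number of bins is also an assumption.

```python
import math
import statistics


def scott_bin_width(values):
    """Return the Scott bin width w = 3.49 * s / n**(1/3).

    Hypothetical helper; assumes s is the sample standard deviation.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed for a standard deviation")
    s = statistics.stdev(values)
    return 3.49 * s / n ** (1 / 3)


def scott_bin_count(values):
    """Return r / w rounded up to a whole number of bins (rounding is an assumption)."""
    r = max(values) - min(values)  # r: range of the data values
    w = scott_bin_width(values)
    if w == 0:
        raise ValueError("all values are identical, so the bin width is zero")
    return math.ceil(r / w)


data = [1, 2, 3, 4, 5, 6, 7, 8]
bin_width = scott_bin_width(data)
bin_count = scott_bin_count(data)
```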

CustomBinColumn
[Required if you specify CustomBinTable, disallowed otherwise.] If you specify this syntax element, you must not specify AutoBin, StartValue, BinSize, or EndValue.

Specify the name of the CustomBinTable column that contains the boundary values. This column must have a numeric SQL data type.

StartValue
[Optional] If you specify this syntax element, you must also specify BinSize and EndValue, and you must not specify AutoBin, CustomBinTable, or CustomBinColumn.

Specify the smallest value to use in binning.

EndValue
[Optional] If you specify this syntax element, you must also specify StartValue and BinSize, and you must not specify AutoBin, CustomBinTable, or CustomBinColumn.

Specify the largest value to use in binning.
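
The combination rules stated for AutoBin, CustomBinTable and CustomBinColumn, and StartValue, BinSize, and EndValue can be checked before a call is built. The Python sketch below is only an illustration: the function name binning_mode and the mode labels are hypothetical, and returning None when no binning element is given is an assumption, because the text does not say what happens in that case.

```python
# Syntax element groups, taken from the rules stated in this section.
GROUPS = {
    "auto": {"AutoBin"},
    "custom": {"CustomBinTable", "CustomBinColumn"},
    "equal_width": {"StartValue", "BinSize", "EndValue"},
}


def binning_mode(specified):
    """Return the binning mode implied by the specified syntax elements.

    Hypothetical helper. Raises ValueError if elements from more than one
    group are mixed, or if a group is only partly specified.
    """
    specified = set(specified)
    used = [mode for mode, members in GROUPS.items() if specified & members]
    if not used:
        return None  # assumption: the text does not cover this case
    if len(used) > 1:
        raise ValueError("mutually exclusive elements specified: " + ", ".join(used))
    mode = used[0]
    missing = GROUPS[mode] - specified
    if missing:
        raise ValueError("also required: " + ", ".join(sorted(missing)))
    return mode
```

For example, binning_mode({"StartValue", "BinSize", "EndValue"}) returns "equal_width", while mixing AutoBin with BinSize raises ValueError.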

BinSize
[Optional] If you specify this syntax element, you must also specify StartValue and EndValue, and you must not specify AutoBin, CustomBinTable, or CustomBinColumn.

Specify this syntax element only for equally sized bins. The bin_size is the width of each bin, a positive DOUBLE PRECISION value.
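
To show how StartValue, BinSize, and EndValue together describe equally sized bins, the Python sketch below lists the bin boundaries they imply. It is an illustration only: the helper name equal_width_edges is hypothetical, and letting the last boundary extend past EndValue when the range is not a multiple of the bin size is an assumption about behavior the text does not describe.

```python
import math


def equal_width_edges(start_value, bin_size, end_value):
    """Return bin boundaries from start_value in steps of bin_size.

    Hypothetical helper. If the range is not a multiple of bin_size, the
    last boundary lies beyond end_value (an assumption for this sketch).
    """
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    if end_value <= start_value:
        raise ValueError("end_value must be greater than start_value")
    count = math.ceil((end_value - start_value) / bin_size)
    return [start_value + i * bin_size for i in range(count + 1)]


edges = equal_width_edges(0, 10, 50)  # five bins of width 10
```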

GroupByColumns
[Optional] Specify the names of the InputTable columns that contain the group values for binning. These columns cannot contain DOUBLE PRECISION values.
Inclusion
[Optional] Specify whether to include points on bin boundaries in the bin on the left or the bin on the right.
Default: 'left'
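
The effect of Inclusion on a value that falls exactly on a bin boundary can be sketched in Python as follows. This is an illustration, not the function's implementation: the helper name assign_bin is hypothetical, and placing the outermost boundaries in the first and last bins is an assumption.

```python
import bisect


def assign_bin(value, edges, inclusion="left"):
    """Return the index of the bin that holds value, or None if out of range.

    Hypothetical helper. edges is a sorted list of boundaries; bin i spans
    edges[i] to edges[i + 1].
    """
    if value < edges[0] or value > edges[-1]:
        return None
    if inclusion == "left":
        # A boundary value goes to the bin on its left.
        index = bisect.bisect_left(edges, value) - 1
    elif inclusion == "right":
        # A boundary value goes to the bin on its right.
        index = bisect.bisect_right(edges, value) - 1
    else:
        raise ValueError("inclusion must be 'left' or 'right'")
    # Assumption: the outermost boundaries stay in the first and last bins.
    return min(max(index, 0), len(edges) - 2)


boundaries = [0, 10, 20, 30]
left_bin = assign_bin(10, boundaries, "left")    # bin spanning 0 to 10
right_bin = assign_bin(10, boundaries, "right")  # bin spanning 10 to 20
```

Values that are not on a boundary land in the same bin under either setting.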
TargetColumn
Specify the name of the InputTable column for which to compute statistics. This column must have a numeric SQL data type.