TD_OutlierFilterFit Syntax Elements - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-04-03
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905
TargetColumns
Specify the names of the numeric InputTable columns for which to compute metrics.
GroupColumns
[Optional] Specify the name of the InputTable column by which to group the input data.
Default behavior: Function does not group input data.
OutlierMethod
[Optional] Specify one of these methods for filtering outliers:
Method Values Outside This Range Are Outliers
percentile (default method) [min_value, max_value].
tukey [Q1 - k*(Q3-Q1), Q1 + k*(Q3-Q1)]

where:

Q1 = 25th quartile of data

Q3 = 75th quartile of data

k = interquantile range multiplier (see IQRMultiplier)

carling Q2 ± c*(Q3-Q1)

where:

Q2 = median of data

Q1 = 25th quartile of data

Q3 = 75th quartile of data

c = (17.63*r - 23.64) / (7.74*r - 3.71)

r = count of rows in group_column if you specify GroupColumns, otherwise count of rows in InputTable

LowerPercentile
[Optional] Specify a lower range of percentile to use to detect whether the value is an outlier.
Value 0 to 1 is supported. For Tukey and Carling, use 0.25 as the lower percentile. The default value is 0.05.
UpperPercentile
[Optional] Specify a upper range of percentile to use to detect whether the value is an outlier.
Value 0 to 1 is supported. For Tukey and Carling, use 0.75 as the upper percentile. The default value is 0.95.
IQRMultiplier
[Optional] Specify interquartile range multiplier (IQR), k, for Tukey filtering.
The IQR is an estimate of the spread (dispersion) of the data in the target columns (IQR = |Q3-Q1|).
Use k = 1.5 for moderate outliers and k = 3.0 for serious outliers.
Default: 1.5
ReplacementValue
[Optional] Specify how to handle outliers:
Option Description
delete (default value) Do not copy row to output table.
null Copy row to output table, replacing each outlier with NULL.
median Copy row to output table, replacing each outlier with median value for its group.
replacement_value

(Must be numeric.)

Copy row to output table, replacing each outlier with replacement_value.
RemoveTail
[Optional] Specify whether to remove the upper tail, the lower tail, or both.
Default: both
PercentileMethod
[Optional] Specify either the PercentileCont or the PercentileDISC method for calculating the upper and lower percentiles of the input data values. The default value is PercentileDISC.