OutlierFilter Example 2: OutlierMethod ('MAD-median'), ReplacementValue ('median') - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:mapPath
blj1506016597986.ditamap
dita:ditavalPath
blj1506016597986.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™

SQL Call

SELECT * FROM OutlierFilter ( 
  ON ville_pressuredata AS InputTable
  OUT TABLE OutputTable (of_output2)
  OUT TABLE OutlierTable (of_outlier2)
  USING
  TargetColumns ('pressure_mbar')
  ReplacementValue ('median')
  OutlierMethod ('MAD-median')
  MADScaleConstant (1.4826)
  MADThreshold (3)
  GroupByColumns ('city')
) AS dt;

Output

message
Output tables created successfully

This query returns the following table:

SELECT * FROM of_output2 ORDER BY 1;

The outlying values have been replaced with the median value for the group.

of_output2
sn city period pressure_mbar
1 Asheville 2010-01-01 00:00:00 1020.5
2 Asheville 2010-01-01 01:00:00 1020.5
3 Asheville 2010-01-01 02:00:00 1020
4 Asheville 2010-01-01 03:00:00 1020.5
5 Asheville 2010-01-01 04:00:00 1020.2
6 Asheville 2010-01-01 05:00:00 1020
7 Asheville 2010-01-01 06:00:00 1020.3
8 Asheville 2010-01-01 07:00:00 1020.8
9 Asheville 2010-01-01 08:00:00 1020.3
10 Asheville 2010-01-01 09:00:00 1020.7
11 Asheville 2010-01-01 10:00:00 1022.1
12 Asheville 2010-01-01 11:00:00 1022
13 Asheville 2010-01-01 12:00:00 1021.1
14 Asheville 2010-01-01 13:00:00 1020
15 Asheville 2010-01-01 14:00:00 1019.3
... ... ... ...

This query returns the following table:

SELECT * FROM of_outlier2 ORDER BY 1;
sn city period pressure_mbar
2 Asheville 2010-01-01 01:00:00 9000
4 Asheville 2010-01-01 03:00:00 10000
26 Greenville 2010-01-01 01:00:00 9000
28 Greenville 2010-01-01 03:00:00 10000
50 Brownsville 2010-01-01 01:00:00 9000
52 Brownsville 2010-01-01 03:00:00 10000
74 Nashville 2010-01-01 01:00:00 9000
76 Nashville 2010-01-01 03:00:00 10000
98 Knoxville 2010-01-01 01:00:00 9000
100 Knoxville 2010-01-01 03:00:00 10000

This query returns the following table:

SELECT * FROM of_outlier2 ORDER BY 1;
of_outlier2
sn city period pressure_mbar
2 Asheville 2010-01-01 01:00:00 9000
4 Asheville 2010-01-01 03:00:00 10000
26 Greenville 2010-01-01 01:00:00 9000
28 Greenville 2010-01-01 03:00:00 10000
50 Brownsville 2010-01-01 01:00:00 9000
52 Brownsville 2010-01-01 03:00:00 10000
74 Nashville 2010-01-01 01:00:00 9000
76 Nashville 2010-01-01 03:00:00 10000
98 Knoxville 2010-01-01 01:00:00 9000
    2010-01-01 03:00:00 10000