OutlierFilter Example: OutlierMethod ('percentile'), ReplacementValue ('null') - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:mapPath
ima1540829771750.ditamap
dita:ditavalPath
jsj1481748799576.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Input

The InputTable has a time series of atmospheric pressure readings (in mbar) for five cities.

InputTable: ville_pressuredata
sn city period pressure_mbar
1 Asheville 2010-01-01 00:00:00 1020.5
2 Asheville 2010-01-01 01:00:00 9000
3 Asheville 2010-01-01 02:00:00 1020
4 Asheville 2010-01-01 03:00:00 10000
5 Asheville 2010-01-01 04:00:00 1020.2
6 Asheville 2010-01-01 05:00:00 1020
7 Asheville 2010-01-01 06:00:00 1020.3
8 Asheville 2010-01-01 07:00:00 1020.8
9 Asheville 2010-01-01 08:00:00 1020.3
10 Asheville 2010-01-01 09:00:00 1020.7
... ... ... ...
25 Greenville 2010-01-01 00:00:00 1020.6
26 Greenville 2010-01-01 01:00:00 9000
27 Greenville 2010-01-01 02:00:00 1020.1
28 Greenville 2010-01-01 03:00:00 10000
29 Greenville 2010-01-01 04:00:00 1020.2
30 Greenville 2010-01-01 05:00:00 1020
... ... ... ...
49 Brownsville 2010-01-01 00:00:00 1020.5
50 Brownsville 2010-01-01 01:00:00 9000
51 Brownsville 2010-01-01 02:00:00 1020
52 Brownsville 2010-01-01 03:00:00 10000
53 Brownsville 2010-01-01 04:00:00 1020.2
54 Brownsville 2010-01-01 05:00:00 1020
... ... ... ...
73 Nashville 2010-01-01 00:00:00 1020.4
74 Nashville 2010-01-01 01:00:00 9000
75 Nashville 2010-01-01 02:00:00 1019.9
76 Nashville 2010-01-01 03:00:00 10000
77 Nashville 2010-01-01 04:00:00 1020.1
78 Nashville 2010-01-01 05:00:00 1019.9
... ... ... ...
97 Knoxville 2010-01-01 00:00:00 1020.4
98 Knoxville 2010-01-01 01:00:00 9000
99 Knoxville 2010-01-01 02:00:00 1019.9
100 Knoxville 2010-01-01 03:00:00 10000
101 Knoxville 2010-01-01 04:00:00 1020
102 Knoxville 2010-01-01 05:00:00 1019.9
... ... ... ...

SQL Call

SELECT  * FROM OutlierFilter (
  ON ville_pressuredata AS InputTable
  OUT TABLE OutputTable (of_output1)
  USING
  TargetColumns ('pressure_mbar ')
  OutlierMethod ('percentile')
  PercentileThreshold (1,90)
  RemoveTail ('both')
  ReplacementValue ('null')
  GroupByColumns ('city')
) AS dt  ;

Output

The outlying values have been replaced by NULL.

 message                            
 ---------------------------------- 
 Output tables created successfully

SELECT * FROM of_output1 ORDER BY 1,2,3;
 sn  city        period                     pressure_mbar 
 --- ----------- -------------------------- ------------- 
   1 ashville    2010-01-01 00:00:00.000000        1020.5
   2 ashville    2010-01-01 01:00:00.000000          NULL
   3 ashville    2010-01-01 02:00:00.000000        1020.0
   4 ashville    2010-01-01 03:00:00.000000          NULL
   5 ashville    2010-01-01 04:00:00.000000        1020.2
   6 ashville    2010-01-01 05:00:00.000000        1020.0
   7 ashville    2010-01-01 06:00:00.000000        1020.3
   8 ashville    2010-01-01 07:00:00.000000        1020.8
   9 ashville    2010-01-01 08:00:00.000000        1021.3
  10 ashville    2010-01-01 09:00:00.000000        1021.7
  11 ashville    2010-01-01 10:00:00.000000        1022.1
  12 ashville    2010-01-01 11:00:00.000000        1022.0
  13 ashville    2010-01-01 12:00:00.000000        1021.1
  14 ashville    2010-01-01 13:00:00.000000        1020.0
  15 ashville    2010-01-01 14:00:00.000000        1019.3
  16 ashville    2010-01-01 15:00:00.000000        1019.0
  17 ashville    2010-01-01 16:00:00.000000        1019.2
  18 ashville    2010-01-01 17:00:00.000000        1019.6
  19 ashville    2010-01-01 18:00:00.000000        1020.1
  20 ashville    2010-01-01 19:00:00.000000        1020.6
  21 ashville    2010-01-01 20:00:00.000000        1020.9
  22 ashville    2010-01-01 21:00:00.000000        1021.1
  23 ashville    2010-01-01 22:00:00.000000        1021.0
  24 ashville    2010-01-01 23:00:00.000000        1020.9
  25 greenville  2010-01-01 00:00:00.000000        1020.6
  26 greenville  2010-01-01 01:00:00.000000          NULL
  27 greenville  2010-01-01 02:00:00.000000        1020.1
  28 greenville  2010-01-01 03:00:00.000000          NULL
  29 greenville  2010-01-01 04:00:00.000000        1020.2
  30 greenville  2010-01-01 05:00:00.000000        1020.0
  31 greenville  2010-01-01 06:00:00.000000        1020.4
  32 greenville  2010-01-01 07:00:00.000000        1020.8
  33 greenville  2010-01-01 08:00:00.000000        1021.3
  34 greenville  2010-01-01 09:00:00.000000        1021.7
  35 greenville  2010-01-01 10:00:00.000000        1022.0
  36 greenville  2010-01-01 11:00:00.000000        1021.9
  37 greenville  2010-01-01 12:00:00.000000        1021.1
  38 greenville  2010-01-01 13:00:00.000000        1020.0
  39 greenville  2010-01-01 14:00:00.000000        1019.3
  40 greenville  2010-01-01 15:00:00.000000        1019.0
  41 greenville  2010-01-01 16:00:00.000000        1019.2
  42 greenville  2010-01-01 17:00:00.000000        1019.6
  43 greenville  2010-01-01 18:00:00.000000        1020.1
  44 greenville  2010-01-01 19:00:00.000000        1020.6
  45 greenville  2010-01-01 20:00:00.000000        1020.9
  46 greenville  2010-01-01 21:00:00.000000        1021.0
  47 greenville  2010-01-01 22:00:00.000000        1020.9
  48 greenville  2010-01-01 23:00:00.000000        1020.9
  49 brownsville 2010-01-01 00:00:00.000000        1020.5
  50 brownsville 2010-01-01 01:00:00.000000          NULL
  51 brownsville 2010-01-01 02:00:00.000000        1020.0
  52 brownsville 2010-01-01 03:00:00.000000          NULL
  53 brownsville 2010-01-01 04:00:00.000000        1020.2
  54 brownsville 2010-01-01 05:00:00.000000        1020.0
  55 brownsville 2010-01-01 06:00:00.000000        1020.3
  56 brownsville 2010-01-01 07:00:00.000000        1020.8
  57 brownsville 2010-01-01 08:00:00.000000        1021.2
  58 brownsville 2010-01-01 09:00:00.000000        1021.6
  59 brownsville 2010-01-01 10:00:00.000000        1022.0
  60 brownsville 2010-01-01 11:00:00.000000        1021.9
  61 brownsville 2010-01-01 12:00:00.000000        1021.0
  62 brownsville 2010-01-01 13:00:00.000000        1019.9
  63 brownsville 2010-01-01 14:00:00.000000        1019.2
  64 brownsville 2010-01-01 15:00:00.000000        1019.0
  65 brownsville 2010-01-01 16:00:00.000000        1019.2
  66 brownsville 2010-01-01 17:00:00.000000        1019.6
  67 brownsville 2010-01-01 18:00:00.000000        1020.0
  68 brownsville 2010-01-01 19:00:00.000000        1020.5
  69 brownsville 2010-01-01 20:00:00.000000        1020.8
  70 brownsville 2010-01-01 21:00:00.000000        1020.9
  71 brownsville 2010-01-01 22:00:00.000000        1020.9
  72 brownsville 2010-01-01 23:00:00.000000        1020.8
  73 nashville   2010-01-01 00:00:00.000000        1020.4
  74 nashville   2010-01-01 01:00:00.000000          NULL
  75 nashville   2010-01-01 02:00:00.000000        1019.9
  76 nashville   2010-01-01 03:00:00.000000          NULL
  77 nashville   2010-01-01 04:00:00.000000        1020.1
  78 nashville   2010-01-01 05:00:00.000000        1019.9
  79 nashville   2010-01-01 06:00:00.000000        1020.2
  80 nashville   2010-01-01 07:00:00.000000        1020.6
  81 nashville   2010-01-01 08:00:00.000000        1021.1
  82 nashville   2010-01-01 09:00:00.000000        1021.5
  83 nashville   2010-01-01 10:00:00.000000        1021.9
  84 nashville   2010-01-01 11:00:00.000000        1021.8
  85 nashville   2010-01-01 12:00:00.000000        1021.0
  86 nashville   2010-01-01 13:00:00.000000        1019.8
  87 nashville   2010-01-01 14:00:00.000000        1019.2
  88 nashville   2010-01-01 15:00:00.000000        1018.9
  89 nashville   2010-01-01 16:00:00.000000        1019.1
  90 nashville   2010-01-01 17:00:00.000000        1019.5
  91 nashville   2010-01-01 18:00:00.000000        1019.9
  92 nashville   2010-01-01 19:00:00.000000        1020.4
  93 nashville   2010-01-01 20:00:00.000000        1020.7
  94 nashville   2010-01-01 21:00:00.000000        1020.9
  95 nashville   2010-01-01 22:00:00.000000        1020.8
  96 nashville   2010-01-01 23:00:00.000000        1020.7
  97 knoxville   2010-01-01 00:00:00.000000        1020.4
  98 knoxville   2010-01-01 01:00:00.000000          NULL
  99 knoxville   2010-01-01 02:00:00.000000        1019.9
 100 knoxville   2010-01-01 03:00:00.000000          NULL
 101 knoxville   2010-01-01 04:00:00.000000        1020.0
 102 knoxville   2010-01-01 05:00:00.000000        1019.9
 103 knoxville   2010-01-01 06:00:00.000000        1020.2
 104 knoxville   2010-01-01 07:00:00.000000        1020.6
 105 knoxville   2010-01-01 08:00:00.000000        1021.1
 106 knoxville   2010-01-01 09:00:00.000000        1021.5
 107 knoxville   2010-01-01 10:00:00.000000        1021.9
 108 knoxville   2010-01-01 11:00:00.000000        1021.8
 109 knoxville   2010-01-01 12:00:00.000000        1021.0
 110 knoxville   2010-01-01 13:00:00.000000        1019.9
 111 knoxville   2010-01-01 14:00:00.000000        1019.2
 112 knoxville   2010-01-01 15:00:00.000000        1018.9
 113 knoxville   2010-01-01 16:00:00.000000        1019.2
 114 knoxville   2010-01-01 17:00:00.000000        1019.6
 115 knoxville   2010-01-01 18:00:00.000000        1020.0
 116 knoxville   2010-01-01 19:00:00.000000        1020.5
 117 knoxville   2010-01-01 20:00:00.000000        1020.8
 118 knoxville   2010-01-01 21:00:00.000000        1020.9
 119 knoxville   2010-01-01 22:00:00.000000        1020.9
 120 knoxville   2010-01-01 23:00:00.000000        1020.8

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.