Distribution Matching Best-Match Mode Example | Teradata Vantage - Best-Match Mode Example: DOUBLE PRECISION Input - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:mapPath
rnn1580259159235.ditamap
dita:ditavalPath
ybt1582220416951.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Input

The example creates input tables t_1 and t_2 from factory_7, the Hypothesis-Test Mode Example: Omit GroupByColumns input table.

The following statements create tables t_1 and t_2:

CREATE MULTISET TABLE t_1 AS (
  SELECT  product, 
         COUNT(*) AS group_size, 
         AVG(mttf) AS mean, 
         STDDEV_SAMP(mttf) AS sd,
         CASE
           WHEN MIN(mttf) > 0 
           THEN AVG(
             LN(
               CASE WHEN mttf > 0 
               THEN mttf 
               ELSE 1 
               END
             )
         )
         ELSE 0
         END AS mean_of_ln,
         CASE
           WHEN MIN(mttf) > 0 
           THEN STDDEV_SAMP(
             LN(
               CASE WHEN mttf > 0 
               THEN mttf 
               ELSE 1 
               END
             )
         )
         ELSE -1
         END AS sd_of_ln,
       MAX(mttf) AS "maximum", MIN(mttf) AS "minimum"
  FROM factory_7
  WHERE mttf IS NOT NULL
GROUP BY product) WITH data;
CREATE MULTISET TABLE t_2 AS (
  SELECT RANK() OVER (PARTITION BY product ORDER BY mttf) AS "rank", product, mttf
  FROM factory_7 
  WHERE mttf IS NOT NULL
) WITH data;

SQL Call

SELECT * FROM DistributionMatchReduce (
  ON  DistributionMatchMultiInput (
    ON t_2 AS InputTable PARTITION BY ANY
    ON t_1 AS GroupStatistics DIMENSION
    USING
    TargetColumn ('mttf')
    Tests ('KS', 'AD')
    GroupByColumns ('product')
    MinGroupSize (50)
  ) PARTITION BY product
  USING "Top"(2)
)AS dt ;

Output

 product group_size best_match_ks_top1                                 p_value_ks_top1    best_match_ks_top2                              p_value_ks_top2       best_match_ad_top1                         p_value_ad_top1       best_match_ad_top2                              p_value_ad_top2       
 ------- ---------- -------------------------------------------------- ------------------ ----------------------------------------------- --------------------- ------------------------------------------ --------------------- ----------------------------------------------- --------------------- 
 A             2999 gamma:4508.589149512046,2.2181356024550047         0.7658938765525818 lognormal:9.210295677185059,0.01492296066135168    0.7159204483032227 gamma:4508.589149512046,2.2181356024550047    0.6876224875450134 lognormal:9.210295677185059,0.01492296066135168    0.6386105418205261
 D             3000 uniformcontinuous:9500.6201171875,10499.7998046875 0.9964507818222046 beta:10,1                                       1.3535959908494988E-8 beta:10,1                                  2.0000000233721948E-7 weibull:1,1                                     2.0000000233721948E-7

Download a zip file of all examples and a SQL script file that creates their input tables.