Hypothesis-Test Mode Example: Omit GroupByColumns - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:id
B700-4003
Product Category
Teradata Vantage™

Input

The example creates input tables t1 and t2 from a table of raw input, raw_normal_50_2, which contains data drawn from a normal distribution with a mean of 50 and a standard deviation of 2. Here are the first 10 rows:

raw_normal_50_2
price
48.0701
52.6426
48.6372
50.9832
50.523
52.1773
50.3103
48.4424
50.1352
50.1382
...

The following statements create tables t1 and t2:

CREATE MULTISET TABLE t1 AS (
  SELECT COUNT(*) AS group_size
  FROM raw_normal_50_2
  WHERE price IS NOT NULL
) WITH DATA;
CREATE MULTISET TABLE t2 AS (
  SELECT RANK() OVER (ORDER BY price) AS "rank", price
  FROM raw_normal_50_2
  WHERE price IS NOT NULL
) WITH DATA;

SQL Call

SELECT * FROM DistributionMatchReduce (
  ON DistributionMatchMultiInput (
    ON t2 AS InputTable PARTITION BY ANY
    ON t1 AS GroupStatistics DIMENSION
    USING
    TargetColumn('price')
    TESTS('KS', 'CvM', 'AD', 'CHISQ')
    DISTRIBUTIONS('NORMAL:49.97225,2.009698')
    MINGROUPSIZE(50)
    NumCell(10)
  ) PARTITION BY 1
) AS dt;
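
For orientation, 'KS' in the TESTS argument denotes the Kolmogorov-Smirnov test. The following Python sketch shows how a one-sample Kolmogorov-Smirnov statistic is conventionally computed against a normal distribution. It is not the function's implementation: the helper name ks_statistic is hypothetical, and the sample is only the 10 rows listed in the Input section, so the result is not expected to match the statistic reported for the full table.

```python
from statistics import NormalDist


def ks_statistic(values, mean, stdev):
    """One-sample Kolmogorov-Smirnov statistic against Normal(mean, stdev).

    Hypothetical helper for illustration; not part of Teradata Vantage.
    """
    dist = NormalDist(mu=mean, sigma=stdev)
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # The empirical CDF steps from (i-1)/n to i/n at x,
        # so the gap to the reference CDF is checked on both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# First 10 rows of raw_normal_50_2, as listed in the Input section.
sample = [48.0701, 52.6426, 48.6372, 50.9832, 50.523,
          52.1773, 50.3103, 48.4424, 50.1352, 50.1382]

# Mean and standard deviation copied from the DISTRIBUTIONS argument.
d = ks_statistic(sample, 49.97225, 2.009698)
print(d)
```

A smaller statistic means the empirical distribution of the sample lies closer to the reference distribution.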

Output

The reported p-values for the four tests range from about 0.35 to 0.43, so none of the tests rejects, at conventional significance levels such as 0.05, the null hypothesis that the data are consistent with a normal distribution with the specified mean and standard deviation.

In the output table column names, when 'a' and 'b' appear between digits, interpret them as comma (,) and period (.), respectively.
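
The naming rule above can be sketched in Python as follows. This is a minimal illustration that assumes the rule applies exactly as stated (a single 'a' or 'b' with a digit on each side); the helper name decode_column_name is hypothetical and is not part of the product.

```python
import re


def decode_column_name(name: str) -> str:
    """Restore '.' and ',' in an output column name (hypothetical helper)."""
    # 'b' between two digits stands for a period.
    name = re.sub(r"(?<=\d)b(?=\d)", ".", name)
    # 'a' between two digits stands for a comma.
    name = re.sub(r"(?<=\d)a(?=\d)", ",", name)
    return name


# One of the column names from the output table of this example.
decoded = decode_column_name("normal$49b97225a2b009698_ks_p_value")
print(decoded)  # prints normal$49.97225,2.009698_ks_p_value
```

The lookbehind and lookahead restrict the substitution to letters flanked by digits, so the 'a' in "normal" and in "p_value" is left unchanged.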

 group_size normal$49b97225a2b009698_ks_statistic normal$49b97225a2b009698_ks_p_value normal$49b97225a2b009698_cvm_statistic normal$49b97225a2b009698_cvm_p_value normal$49b97225a2b009698_ad_statistic normal$49b97225a2b009698_ad_p_value normal$49b97225a2b009698_chisq_statistic normal$49b97225a2b009698_chisq_p_value 
 ---------- ------------------------------------- ----------------------------------- -------------------------------------- ------------------------------------ ------------------------------------- ----------------------------------- ---------------------------------------- -------------------------------------- 
        400                   0.03195042908191681                 0.41180261969566345                     0.0556536540389061                   0.4307910203933716                    0.3761519193649292                  0.4102906286716461                        7.800000190734863                      0.350559800863266
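
As a minimal sketch of how the output row above might be read, the following Python snippet compares each reported p-value with a significance level. The level 0.05 is an assumption chosen for illustration and is not part of the original example; the helper name rejected_tests is hypothetical.

```python
# p-values copied from the output row of this example.
p_values = {
    "KS": 0.41180261969566345,
    "CvM": 0.4307910203933716,
    "AD": 0.4102906286716461,
    "CHISQ": 0.350559800863266,
}

ALPHA = 0.05  # assumed significance level, not taken from the example


def rejected_tests(p_values, alpha):
    """Return the tests whose p-value falls below alpha (hypothetical helper)."""
    return [name for name, p in p_values.items() if p < alpha]


rejected = rejected_tests(p_values, ALPHA)
print(rejected)  # prints []
```

An empty list means that no test rejects the null hypothesis at the assumed level, which agrees with the interpretation given for this output.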

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.