Statistical Tests Overview

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: March 2023
Language: English (United States)
Last Update: 2024-01-02
Product Category: Teradata Vantage

Statistical tests help determine whether the outcome of an experiment could have occurred by chance.

Vantage Analytics Library contains classical parametric and nonparametric statistical tests as well as more recently developed statistical tests. With the groupby parameter, you can analyze groups of data defined by selected variables with specific values. This runs multiple tests simultaneously and produces a profile of customer data that can reveal hidden clues about customer behavior.
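
The grouping idea can be sketched outside the database in plain Python. This is an illustration only, not the Analytics Library interface: the function names, the column names (region, spend_change), and the data are hypothetical, and a one-sample t statistic stands in for whichever test is being run per group.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def one_sample_t_statistic(values, hypothesized_mean=0.0):
    """Return the one-sample t statistic for the given values.

    Assumes at least two values with nonzero variance.
    """
    standard_error = stdev(values) / sqrt(len(values))
    return (mean(values) - hypothesized_mean) / standard_error


def run_test_by_group(rows, group_column, value_column):
    """Apply the same test separately to each group of rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row[value_column])
    return {name: one_sample_t_statistic(values) for name, values in groups.items()}


# Hypothetical data: change in monthly spend per customer, by region.
rows = [
    {"region": "east", "spend_change": 2.0},
    {"region": "east", "spend_change": 4.0},
    {"region": "east", "spend_change": 6.0},
    {"region": "west", "spend_change": -1.0},
    {"region": "west", "spend_change": 0.0},
    {"region": "west", "spend_change": 1.0},
]

results = run_test_by_group(rows, "region", "spend_change")
for region, t in sorted(results.items()):
    print(f"{region}: t = {t:.3f}")
```

Each group gets its own test result, so one pass over the data yields a per-group profile rather than a single overall answer.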

Date types are not standard numeric data types and might not function properly in statistical tests.

Statistical inference draws conclusions about parameters of a statistical distribution. There are three principal approaches to statistical inference:
Approach | Description
Bayesian estimation | Given experimental outcome, infers conclusions from posterior judgments about parameters.
Likelihood | Given experimental outcome, infers conclusions from likelihood function of parameters.
Hypothesis testing | Uses either of the following:
  • Nonparametric inference

    Estimators about distribution function are independent of its mathematical form.

  • Parametric inference

    Estimators about distribution function assume particular mathematical form, usually normal distribution. Parametric tests are based on the sampling distribution of a particular statistic, that is, the distribution the statistic would have across multiple equal-size samples.
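
The sampling distribution that parametric inference relies on can be illustrated with a small simulation. The sketch below is not part of Analytics Library. It assumes a normal population with a hypothetical mean of 50 and standard deviation of 10, draws many samples of the same size, and compares the spread of the sample means with the value the theory predicts (population standard deviation divided by the square root of the sample size).

```python
import random
import statistics


def sample_means(population_mean, population_sd, sample_size, num_samples, seed=0):
    """Draw equal-size samples from a normal population and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(
            rng.gauss(population_mean, population_sd) for _ in range(sample_size)
        )
        for _ in range(num_samples)
    ]


means = sample_means(50.0, 10.0, sample_size=25, num_samples=2000)

observed_se = statistics.stdev(means)   # spread of the simulated sample means
theoretical_se = 10.0 / 25 ** 0.5       # sigma / sqrt(n) = 2.0

print(f"observed standard error:    {observed_se:.3f}")
print(f"theoretical standard error: {theoretical_se:.3f}")
```

The two values should be close. A parametric test uses the theoretical distribution directly instead of resampling, which is why its validity depends on the distributional assumption.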

All statistical tests in Analytics Library use hypothesis testing. They belong to the classes in the following table.

Each class has many variants, some of which are named for their originators. Tests with multiple originators may have multiple names. Tests can be applied to one, two, or multiple samples of data. The specific hypothesis of the test may be two-tailed, upper-tailed, or lower-tailed.
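
The difference between two-tailed, upper-tailed, and lower-tailed hypotheses shows up in how a p-value is computed from the same test statistic. The following sketch assumes a test statistic that follows the standard normal distribution under the null hypothesis. The function name and the string labels are hypothetical and do not correspond to Analytics Library parameters.

```python
from statistics import NormalDist


def z_test_p_value(z, alternative="two-tailed"):
    """Return the p-value of a standard normal test statistic z."""
    cdf = NormalDist().cdf(z)  # P(Z <= z) under the null hypothesis
    if alternative == "lower-tailed":
        return cdf
    if alternative == "upper-tailed":
        return 1.0 - cdf
    if alternative == "two-tailed":
        # Probability of a value at least this extreme in either direction.
        return 2.0 * min(cdf, 1.0 - cdf)
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.96
for alternative in ("two-tailed", "upper-tailed", "lower-tailed"):
    print(f"{alternative}: p = {z_test_p_value(z, alternative):.4f}")
```

For z = 1.96, the two-tailed p-value is about 0.05, the upper-tailed p-value is about 0.025, and the lower-tailed p-value is about 0.975, so the same data can lead to different conclusions depending on the hypothesis chosen.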

Hypothesis Test Class | Test Names
Parametric Tests
  • Two-Sample T-Test for Equal Means
    • Paired
    • Unpaired
    • Unpaired with Indicator
  • N-Way F-Test:
    • F-Test/Analysis of Variance (One-way, with samples of equal or unequal size)
    • F-Test/Analysis of Variance (Two-way, with samples of equal size)
    • F-Test/Analysis of Variance (Three-way, with samples of equal size)
  • F-Test/Analysis of Variance (Two-way, with samples of unequal size)
Nonparametric Binomial Tests
  • Binomial/Z-Test
  • Binomial Sign Test
Nonparametric Kolmogorov-Smirnov Tests
  • Kolmogorov-Smirnov Test (One Sample)
  • Lilliefors Test
  • Shapiro-Wilk Test
  • D’Agostino and Pearson Test
  • Smirnov Test
Nonparametric Tests Based on Contingency
  • Chi-Squared Test
  • Median Test
Nonparametric Rank Tests
  • Mann-Whitney/Kruskal-Wallis Test
  • Mann-Whitney/Kruskal-Wallis Independent Tests
  • Wilcoxon Signed Ranks Test
  • Friedman Test with Kendall’s Coefficient of Concordance & Spearman’s Rho
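
To give a sense of how one of the simpler tests in the table works, here is a textbook version of the two-tailed binomial sign test in plain Python. It is a sketch under stated assumptions, not the Analytics Library implementation: paired differences equal to zero are dropped, the null hypothesis is that positive and negative differences are equally likely, and the sample data is hypothetical.

```python
from math import comb


def sign_test_p_value(differences):
    """Two-tailed binomial sign test p-value for paired differences."""
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives  # zero differences carry no sign and are ignored
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for two tails and capped at 1.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * lower_tail)


# Hypothetical paired differences (after minus before) for ten customers.
differences = [1.2, 0.8, -0.3, 2.1, 0.5, 1.7, 0.9, -0.1, 1.1, 0.4]
p_value = sign_test_p_value(differences)
print(f"p = {p_value:.6f}")
```

Because the test uses only the signs of the differences, it makes no assumption about the shape of the underlying distribution, which is what places it among the nonparametric tests.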

Hypothesis tests depend on assumptions made in the context of the experiment. Be sure the tests are valid in the context of the data to be analyzed. For example, is it a fair assumption that the variables are normally distributed? The choice of test depends on the answer to this question. An inappropriate test can incorrectly reject or accept the null hypothesis, causing false alarms (Type I errors) or misses (Type II errors), respectively.
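
The effect of a violated assumption on the false alarm rate can be seen in a small simulation. The sketch below is illustrative only and uses hypothetical names and settings. It repeatedly draws samples for which the null hypothesis (mean of zero) is true and applies a two-tailed z-test at a significance level of 0.05. In the first run the assumed standard deviation matches the data. In the second run the data is twice as variable as the test assumes.

```python
import random
from statistics import NormalDist, fmean


def false_alarm_rate(true_sd, assumed_sd, sample_size=20, trials=4000,
                     alpha=0.05, seed=1):
    """Fraction of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, true_sd) for _ in range(sample_size)]
        # z statistic computed with the standard deviation the test assumes.
        z = fmean(sample) / (assumed_sd / sample_size ** 0.5)
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


rate_valid = false_alarm_rate(true_sd=1.0, assumed_sd=1.0)
rate_invalid = false_alarm_rate(true_sd=2.0, assumed_sd=1.0)

print(f"assumption holds:    false alarm rate = {rate_valid:.3f}")
print(f"assumption violated: false alarm rate = {rate_invalid:.3f}")
```

When the assumption holds, the false alarm rate stays near the chosen significance level. When it does not, the rate is several times higher even though the null hypothesis is true in every trial.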

Many statistical test functions cannot analyze columns defined with the attribute GENERATED AS IDENTITY.