Statistical Tests Overview | Vantage Analytics Library - Statistical Tests

Vantage Analytics Library User Guide

Deployment	VantageCloud, VantageCore
Edition	Enterprise, IntelliFlex, Lake, VMware
Product	Vantage Analytics Library
Release Number	2.2.0
Published	March 2023
Language	English (United States)
Last Update	2024-01-02

Product Category	Teradata Vantage

Statistical tests help determine whether the outcome of an experiment could have occurred by chance.

Vantage Analytics Library contains classical parametric and nonparametric statistical tests as well as more recently developed ones. With the groupby parameter, you can analyze groups of data defined by specific values of selected variables, running multiple tests simultaneously. The result is a profile of customer data that can reveal hidden clues about customer behavior.
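The idea of running one test per data group can be sketched in plain Python. This is an illustration only, not the Analytics Library interface: the function names, the column names (region, before, after), and the choice of a sign test are all assumptions made for the example.

```python
from collections import defaultdict
from math import comb


def sign_test_p(diffs):
    """Exact two-sided sign test p-value for H0: median difference is 0."""
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n = pos + neg  # zero differences carry no sign and are dropped
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # Probability of k or fewer of the rarer sign under Binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def sign_test_by_group(rows, group_col, first_col, second_col):
    """Run the same paired sign test once per distinct value of group_col."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[second_col] - row[first_col])
    return {name: sign_test_p(diffs) for name, diffs in groups.items()}


# Hypothetical sample data: one paired measurement per customer
rows = [
    {"region": "east", "before": 10, "after": 12},
    {"region": "east", "before": 11, "after": 15},
    {"region": "east", "before": 9, "after": 10},
    {"region": "east", "before": 14, "after": 16},
    {"region": "east", "before": 8, "after": 13},
    {"region": "west", "before": 10, "after": 12},
    {"region": "west", "before": 12, "after": 9},
    {"region": "west", "before": 7, "after": 8},
    {"region": "west", "before": 15, "after": 11},
]

results = sign_test_by_group(rows, "region", "before", "after")
print(results)  # {'east': 0.0625, 'west': 1.0}
```

Each group gets its own p-value, so a single pass over the data yields one test result per group.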

Date types are not standard numeric data types and might not function properly in statistical tests.

Statistical inference draws conclusions about parameters of a statistical distribution. There are three principal approaches to statistical inference:

Approach	Description
Bayesian estimation	Given an experimental outcome, infers conclusions from posterior judgments about the parameters.
Likelihood	Given an experimental outcome, infers conclusions from the likelihood function of the parameters.
Hypothesis testing	Uses either of the following: nonparametric inference, in which estimators of the distribution function are independent of its mathematical form; or parametric inference, in which estimators of the distribution function assume a particular mathematical form, usually the normal distribution. Parametric tests are based on the sampling distribution of a particular statistic, which predicts the distribution of that statistic across multiple equal-size samples.

All statistical tests in Analytics Library use hypothesis testing. They belong to the classes in the following table.

Each class has many variants, some of which are named for their originators. Tests with multiple originators may have multiple names. Tests can be applied to one, two, or multiple samples of data. The specific hypothesis of the test may be two-tailed, upper-tailed, or lower-tailed.
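The difference between two-tailed, upper-tailed, and lower-tailed hypotheses shows up in how a p-value is computed from the same test statistic. The following minimal sketch assumes a statistic that follows the standard normal distribution under the null hypothesis. The function name and the tail labels are illustrative and are not taken from the Analytics Library.

```python
from statistics import NormalDist


def p_value(z, tail="two"):
    """P-value for a standard normal test statistic z.

    tail: "upper" (H1: parameter is larger), "lower" (H1: parameter is
    smaller), or "two" (H1: parameter differs in either direction).
    """
    cdf = NormalDist().cdf(z)
    if tail == "upper":
        return 1 - cdf
    if tail == "lower":
        return cdf
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


z = 1.96
print(p_value(z, "two"))    # about 0.05
print(p_value(z, "upper"))  # about 0.025
print(p_value(z, "lower"))  # about 0.975
```

The same statistic is significant at the 5% level for the upper-tailed hypothesis, borderline for the two-tailed one, and far from significant for the lower-tailed one, which is why the direction of the hypothesis should be fixed before the test is run.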

Hypothesis Test Class	Test Names
Parametric Tests	Two-Sample T-Test for Equal Means: Paired; Unpaired; Unpaired with Indicator
	N-Way F-Test: F-Test/Analysis of Variance (One-way, with samples of equal or unequal size); F-Test/Analysis of Variance (Two-way, with samples of equal size); F-Test/Analysis of Variance (Three-way, with samples of equal size); F-Test/Analysis of Variance (Two-way, with samples of unequal size)
Nonparametric Binomial Tests	Binomial/Z-Test; Binomial Sign Test
Nonparametric Kolmogorov-Smirnov Tests	Kolmogorov-Smirnov Test (One Sample); Lilliefors Test; Shapiro-Wilk Test; D’Agostino and Pearson Test; Smirnov Test
Nonparametric Tests Based on Contingency	Chi-Squared Test; Median Test
Nonparametric Rank Tests	Mann-Whitney/Kruskal-Wallis Test; Mann-Whitney/Kruskal-Wallis Independent Tests; Wilcoxon Signed Ranks Test; Friedman Test with Kendall’s Coefficient of Concordance & Spearman’s Rho
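As a concrete instance of one class in the table, the sketch below computes a chi-squared test of independence on a 2x2 contingency table in plain Python. It illustrates the textbook statistic only. It does not show how the Analytics Library implements or invokes its Chi-Squared Test, the function name is hypothetical, and no continuity correction is applied. It assumes every expected cell count is greater than zero.

```python
from math import erfc, sqrt


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count if row and column variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2))
    p = erfc(sqrt(stat / 2))
    return stat, p


# Hypothetical counts: rows are two customer segments, columns are yes/no
stat, p = chi_squared_2x2([[10, 20], [20, 10]])
print(round(stat, 4), round(p, 4))
```

A small p-value suggests that the row and column variables are not independent. With perfectly balanced counts the statistic is 0 and the p-value is 1.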

Hypothesis tests depend on assumptions made in the context of the experiment. Be sure the tests are valid in the context of the data to be analyzed. For example, is it a fair assumption that the variables are normally distributed? The choice of test depends on the answer to this question. An inappropriate test can incorrectly reject or accept the null hypothesis, causing false alarms (Type I errors) or misses (Type II errors), respectively.
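False alarms and misses can be quantified exactly for a simple case. The following sketch assumes an upper-tailed binomial test that rejects the null hypothesis when the number of successes reaches a critical value. The function names and the numbers used (10 trials, critical value 9, success probabilities 0.5 and 0.8) are made up for illustration.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail_error_rates(n, critical, p_null, p_alt):
    """Error rates of the rule: reject H0 when successes >= critical.

    false_alarm: probability of rejecting H0 when p_null is the true rate.
    miss: probability of not rejecting H0 when p_alt is the true rate.
    """
    false_alarm = sum(binom_pmf(k, n, p_null) for k in range(critical, n + 1))
    miss = sum(binom_pmf(k, n, p_alt) for k in range(critical))
    return false_alarm, miss


false_alarm, miss = upper_tail_error_rates(n=10, critical=9, p_null=0.5, p_alt=0.8)
print(round(false_alarm, 4))  # 0.0107
print(round(miss, 4))         # 0.6242
```

In this example the false alarm rate is close to 1%, but the test misses a true success rate of 0.8 more than 60% of the time. Lowering the critical value trades fewer misses for more false alarms.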

Many statistical test functions cannot analyze columns defined with the attribute GENERATED AS IDENTITY.