Parametric Tests | Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: March 2023
Language: English (United States)
Last Update: 2024-01-02
Product Category: Teradata Vantage

Parametric tests make assumptions about the data—for example, that observations are independent and normally distributed. You can verify that data is normally distributed with one of the Kolmogorov-Smirnov Tests.

Parametric tests output a p-value that you compare to a significance threshold to determine whether to reject the null hypothesis.
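
As a standalone illustration of that workflow (not Vantage Analytics Library syntax; the data and the 0.05 threshold are hypothetical), a Kolmogorov-Smirnov normality check followed by a threshold comparison can be sketched in Python with SciPy:

# Minimal sketch, assuming hypothetical data and a 0.05 threshold.
# This is SciPy, not the Analytics Library interface.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
sample = rng.normal(loc=10.0, scale=2.0, size=200)    # hypothetical observations

# KS test of the sample against a normal distribution fitted to it.
# (Estimating the parameters from the sample makes the p-value approximate.)
d_stat, p_value = stats.kstest(sample, "norm",
                               args=(sample.mean(), sample.std(ddof=1)))

alpha = 0.05                                          # significance threshold
if p_value < alpha:
    print(f"p = {p_value:.4f} < {alpha}: reject the null hypothesis of normality")
else:
    print(f"p = {p_value:.4f} >= {alpha}: no evidence against normality")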

Two-Sample T-Test for Equal Means

  • Paired

    There must be a one-to-one correspondence between the values in the two samples.

    The test assumes that the differences between corresponding (paired) values are independent, identically distributed normal random variables.

    The test determines whether the mean of the paired differences is significantly different from zero.

  • Unpaired

    There is no correspondence between the values in the two samples, which may be of equal or unequal size. The test selects the two columns that contain the unpaired datasets; some of their values may be NULL.

    The test assumes the following:
    • The samples are independent of each other.
    • Within each sample, values are identically distributed normal random variables.
    • The mean differences between the samples are independent, identically distributed normal random variables.
    • The variances of the samples may be equal (homoscedastic) or unequal (heteroscedastic).

    The null hypothesis is that the population means are equal.

  • Unpaired with Indicator

    Like the unpaired test, except that instead of selecting the columns that contain the two unpaired datasets, the test selects the column of interest (the dependent variable) and an indicator column. If the indicator value is negative or zero, the test assigns the corresponding value of the column of interest to the first group; if the indicator value is positive, it assigns the value to the second group. (All three variants are sketched in the example after this list.)
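
The three variants can be sketched outside Vantage with SciPy on hypothetical arrays (this is illustrative only and is not the Analytics Library call syntax):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Paired: one-to-one correspondence between the two samples.
before = rng.normal(100.0, 15.0, size=30)
after = before + rng.normal(2.0, 5.0, size=30)
t_paired, p_paired = stats.ttest_rel(before, after)   # tests mean difference = 0

# Unpaired: independent samples, possibly of unequal size.
group_a = rng.normal(50.0, 10.0, size=40)
group_b = rng.normal(55.0, 12.0, size=25)
t_unpaired, p_unpaired = stats.ttest_ind(group_a, group_b, equal_var=False)

# Unpaired with indicator: one column of interest plus an indicator column.
# Indicator <= 0 assigns the value to the first group; > 0 to the second.
values = np.concatenate([group_a, group_b])
indicator = np.concatenate([np.full(40, -1), np.full(25, 1)])
t_ind, p_ind = stats.ttest_ind(values[indicator <= 0], values[indicator > 0],
                               equal_var=False)

print(p_paired, p_unpaired, p_ind)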

The following shows the formulas that define the two-sample T-tests for unpaired data. (SQL calculates them differently.)

H0: μ1 = μ2
Ha: μ1 ≠ μ2

Test statistic:

T = (x̄1 − x̄2) / √(s1²/N1 + s2²/N2)

where N1 and N2 are the sample sizes, x̄1 and x̄2 are the sample means, and s1² and s2² are the sample variances. When the sample variances are assumed equal (the homoscedastic case), s1²/N1 + s2²/N2 is replaced by sp²(1/N1 + 1/N2), where sp² = ((N1 − 1)s1² + (N2 − 1)s2²) / (N1 + N2 − 2) is the pooled variance.
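
As a cross-check of the formula (illustrative only, on hypothetical data), the statistic can be computed directly and compared with SciPy's unequal-variance t-test:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1 = rng.normal(10.0, 2.0, size=25)
x2 = rng.normal(11.0, 3.0, size=40)

n1, n2 = len(x1), len(x2)
s1_sq, s2_sq = x1.var(ddof=1), x2.var(ddof=1)

# T = (x̄1 − x̄2) / √(s1²/N1 + s2²/N2)
t_manual = (x1.mean() - x2.mean()) / np.sqrt(s1_sq / n1 + s2_sq / n2)

t_scipy, p_scipy = stats.ttest_ind(x1, x2, equal_var=False)
print(t_manual, t_scipy)   # the two values agree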

N-Way F-Test

  • F-Test/Analysis of Variance (One-way, with samples of equal or unequal size)
  • F-Test/Analysis of Variance (Two-way, with samples of equal size)
  • F-Test/Analysis of Variance (Three-way, with samples of equal size)

Use this F-Test on groups defined by the distinct values of the groupby columns. The groups must include two or more treatments.

Use this F-test (also called ANOVA) to determine whether significant differences exist among treatment means or interactions. The null hypothesis is that no such differences exist. Accepting the null hypothesis implies that the factor levels and the response are unrelated, so further analysis is unnecessary.
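
For intuition, a one-way ANOVA on hypothetical treatment groups (again SciPy, not the Analytics Library interface) looks like this:

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
treatment_a = rng.normal(20.0, 4.0, size=30)
treatment_b = rng.normal(22.0, 4.0, size=35)   # groups may have unequal sizes
treatment_c = rng.normal(25.0, 4.0, size=28)

f_stat, p_value = stats.f_oneway(treatment_a, treatment_b, treatment_c)

alpha = 0.05
if p_value < alpha:
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}: reject H0; treatment means differ")
else:
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}: no significant difference among means")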

If the null hypothesis is rejected, examine the nature of the factor-level effects with a test such as one of the following:
  • Tukey's Method: Tests all possible pairwise differences of means.
  • Scheffe's Method: Tests all possible contrasts at the same time.
  • Bonferroni's Method: Tests or puts simultaneous confidence intervals around a preselected group of contrasts.
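
As an example of one such follow-up test, Tukey's method on the same hypothetical groups can be sketched with SciPy 1.8 or later (not the Analytics Library syntax; Scheffe's and Bonferroni's methods are not shown):

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
treatment_a = rng.normal(20.0, 4.0, size=30)
treatment_b = rng.normal(22.0, 4.0, size=35)
treatment_c = rng.normal(25.0, 4.0, size=28)

# Pairwise comparisons of all group means, run only after the overall
# F-test rejects the null hypothesis.
result = stats.tukey_hsd(treatment_a, treatment_b, treatment_c)
print(result)            # table of pairwise differences and confidence intervals
print(result.pvalue)     # matrix of pairwise p-values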

F-Test/Analysis of Variance (Two-way, with Samples of Unequal Size)

Use this F-Test on the entire dataset. You cannot specify groupby columns. The workaround is to run this F-Test multiple times on pre-prepared datasets, with the group-by variables held at different constant values in each dataset. The datasets must include two or more treatments.
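
The underlying unbalanced two-way ANOVA can be sketched with statsmodels (illustrative only; the factor names and data are hypothetical, and this is not how the Analytics Library is invoked):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(3)
rows = []
for factor_a in ("low", "high"):                    # hypothetical factor A
    for factor_b in ("x", "y", "z"):                # hypothetical factor B
        n = int(rng.integers(8, 15))                # unequal cell sizes
        shift = (2.0 if factor_a == "high" else 0.0) + {"x": 0.0, "y": 1.0, "z": 3.0}[factor_b]
        for value in rng.normal(10.0 + shift, 2.0, size=n):
            rows.append({"factor_a": factor_a, "factor_b": factor_b, "response": value})
df = pd.DataFrame(rows)

# Main effects and interaction; Type II sums of squares suit unbalanced designs.
model = smf.ols("response ~ C(factor_a) * C(factor_b)", data=df).fit()
print(anova_lm(model, typ=2))                       # F statistic and p-value per term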

This test creates a temporary work table in the Result Database and drops it at the end of processing, even if you specify outputtablename.