5.4.5 - Shapiro-Wilk Test - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Teradata Warehouse Miner
February 2018
English (United States)

The Shapiro-Wilk W test is designed to detect departures from normality without requiring that the mean or variance of the hypothesized normal distribution be specified in advance. It is considered one of the best omnibus tests of normality and has performed very well in comparison studies with other goodness-of-fit tests. The function is based on the approximations and code given by Royston (1982a, b) and can be used on samples as small as 3 or as large as 2,000. Royston (1982b) gives approximations and tabled values that can be used to compute the coefficients and to obtain the significance level of the W statistic. Small values of W are evidence of departure from normality.
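
For reference, the W statistic is usually written in the following general form. The notation here is an assumption chosen for illustration and does not come from this guide: x_(1) <= ... <= x_(n) are the ordered sample values, x-bar is the sample mean, and a_i are the coefficients (which, per the text above, this function computes from Royston's approximations).

```latex
W = \frac{\left( \sum_{i=1}^{n} a_i \, x_{(i)} \right)^{2}}
         {\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^{2}}
```

W cannot exceed 1. Values close to 1 are consistent with normality, while small values indicate a departure from it.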

In general, either the Shapiro-Wilk or the D'Agostino-Pearson test is a powerful overall test for normality. As omnibus tests, however, they do not indicate the type of nonnormality, e.g., whether the distribution is skewed, heavy-tailed, or both. Examining the calculated skewness and kurtosis, along with the histogram, boxplot, and normal probability plot of the data, may provide clues as to why the data failed the Shapiro-Wilk or D'Agostino-Pearson test.
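
As a rough illustration of the follow-up check described above, the sketch below computes skewness and excess kurtosis from the sample's central moments using only the Python standard library. It is not part of Teradata Warehouse Miner; the function name and the sample data are hypothetical, and the simple (biased) moment estimators are an assumption, since other definitions with small-sample corrections exist.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from biased central moments.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = fmean(values)
    # Central moments of order 2, 3, and 4 (divided by n, not n - 1).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all observations are identical")
    skew = m3 / m2 ** 1.5             # 0 for a symmetric sample
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skew, excess_kurt


# Illustrative data: one symmetric sample, one with a long right tail.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_skewed = [1.0, 1.0, 1.0, 2.0, 10.0]

print(skewness_and_excess_kurtosis(symmetric))
print(skewness_and_excess_kurtosis(right_skewed))
```

A skewness well away from zero points to asymmetry, while a positive excess kurtosis points to heavier tails than the normal distribution. Either can help explain a rejection by an omnibus test.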

The standard algorithm for the Shapiro-Wilk test applies only to sample sizes from 3 to 2,000. For larger sample sizes, a different normality test should be used, one whose test statistic is based on the Kolmogorov-Smirnov statistic for a normal distribution with the same mean and variance as the sample mean and variance.
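
The Kolmogorov-Smirnov statistic mentioned above can be sketched as the largest gap between the sample's empirical distribution function and a normal distribution fitted with the sample mean and variance. The code below is a minimal standard-library illustration, not the product's implementation; the function name and the test data are hypothetical. It returns only the distance and does not compute a significance level, because the parameters are estimated from the data and the standard Kolmogorov-Smirnov critical values therefore do not apply directly.

```python
import math
from statistics import NormalDist, fmean, stdev


def ks_statistic_fitted_normal(values):
    """Kolmogorov-Smirnov distance between the sample's empirical CDF and a
    normal CDF whose mean and standard deviation are taken from the sample.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 observations")
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sigma == 0:
        raise ValueError("all observations are identical")
    fitted = NormalDist(mu=fmean(values), sigma=sigma)

    d = 0.0
    for i, x in enumerate(sorted(values), start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so check the gap on both sides of the jump.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# Illustrative data: 3,000 evenly spaced quantiles of two distributions,
# a sample size beyond the 2,000 limit of the Shapiro-Wilk algorithm.
n = 3000
probs = [(i - 0.5) / n for i in range(1, n + 1)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]
exponential_like = [-math.log(1.0 - p) for p in probs]

print(ks_statistic_fitted_normal(normal_like))       # small distance
print(ks_statistic_fitted_normal(exponential_like))  # noticeably larger
```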