15.00 - VAR_SAMP - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product: Teradata Database
Release Number: 15.00
Content Type: Programming Reference
Publication ID: B035-1145-015K
Language: English (United States)

VAR_SAMP

Purpose  

Returns the sample variance for the data points in value_expression.

Syntax

VAR_SAMP ( [ DISTINCT | ALL ] value_expression )

where:

| Syntax element … | Specifies … |
|---|---|
| ALL | to include all non-null values specified by value_expression, including duplicates, in the computation. This is the default. |
| DISTINCT | to exclude duplicates of value_expression from the computation. |
| value_expression | a numeric literal or column expression whose sample variance is to be computed. The expression cannot contain ordered analytical or aggregate functions. |
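The difference between ALL and DISTINCT can be illustrated outside the database. The following is a minimal sketch in Python, not Teradata code: the helper name `var_samp`, the `distinct` flag, and the sample values are assumptions made for illustration, and `None` stands in for SQL NULL.

```python
from statistics import variance

def var_samp(values, distinct=False):
    """Sample variance over non-null values; None stands in for SQL NULL."""
    data = [v for v in values if v is not None]  # nulls never take part
    if distinct:
        data = list(set(data))                   # DISTINCT: one copy of each value
    if len(data) < 2:
        return None                              # fewer than two data points
    return variance(data)

sample = [2, 2, 4, None, 6]                      # hypothetical column values
print(var_samp(sample))                 # ALL (default): computed over 2, 2, 4, 6
print(var_samp(sample, distinct=True))  # DISTINCT: computed over 2, 4, 6
```

With these values the two calls give different results, because removing the duplicate 2 changes both the mean and the number of data points.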

ANSI Compliance

This is ANSI SQL:2011 compliant.

Definition

The variance of a sample is a measure of dispersion from the mean of that sample. It is the square of the sample standard deviation.

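The computation is more conservative than that for the population variance: it divides by one less than the number of data points, which yields a slightly larger value and compensates for the tendency of a sample to understate the dispersion of the population.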

Computation

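The equation for computing VAR_SAMP is as follows:

$$\mathrm{VAR\_SAMP}(x) = \frac{\mathrm{COUNT}(x)\,\mathrm{SUM}(x^{2}) - \left(\mathrm{SUM}(x)\right)^{2}}{\mathrm{COUNT}(x)\,\left(\mathrm{COUNT}(x) - 1\right)}$$

where:

| This variable … | Represents … |
|---|---|
| x | value_expression |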

When the sample used for the computation has fewer than two non-null data points, VAR_SAMP returns NULL.

Division by zero results in NULL rather than an error.
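The NULL behavior can be sketched as follows. This is an illustration in Python rather than Teradata code, and it assumes the standard computational form of sample variance written in terms of a count and two sums; the function name `var_samp` is hypothetical and `None` stands in for SQL NULL.

```python
def var_samp(values):
    """Sample variance via counts and sums; None stands in for SQL NULL."""
    xs = [x for x in values if x is not None]  # null data points are ignored
    n = len(xs)                                # COUNT(x)
    if n < 2:
        return None                            # NULL instead of a division-by-zero error
    s = sum(xs)                                # SUM(x)
    s2 = sum(x * x for x in xs)                # SUM(x**2)
    return (n * s2 - s * s) / (n * (n - 1))

print(var_samp([1, 2, 3, 4]))  # (4*30 - 10*10) / (4*3) = 20/12
print(var_samp([7, None]))     # only one non-null data point, so None
print(var_samp([]))            # no data points, so None
```

The guard on `n < 2` is what keeps the denominator `n * (n - 1)` from ever being zero.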

Combination With Other Functions

VAR_SAMP can be combined with ordered analytical functions in a SELECT list, QUALIFY clause, or ORDER BY clause. For more information on ordered analytical functions, see Chapter 22: “Ordered Analytical / Window Aggregate Functions.”

VAR_SAMP cannot be combined with aggregate functions within the same SELECT list, QUALIFY clause, or ORDER BY clause.

GROUP BY Affects Report Breaks

VAR_SAMP operates differently depending on whether or not there is a GROUP BY clause in the SELECT statement.

| IF the query … | THEN VAR_SAMP is reported for … |
|---|---|
| specifies a GROUP BY clause | each individual group. |
| does not specify a GROUP BY clause | all the rows in the sample. |
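The two cases can be mimicked in a short Python sketch. It is an illustration only: the rows, the column names `region` and `amount`, and the helper `var_samp` are hypothetical, and `None` stands in for SQL NULL.

```python
from collections import defaultdict
from statistics import variance

# Hypothetical rows of (region, amount).
rows = [("east", 10), ("east", 14), ("east", 18),
        ("west", 5), ("west", 9), ("north", 7)]

def var_samp(values):
    """Sample variance over non-null values, or None for fewer than two."""
    data = [v for v in values if v is not None]
    return variance(data) if len(data) >= 2 else None

# No GROUP BY: a single result computed over all the rows.
overall = var_samp([amount for _, amount in rows])

# GROUP BY region: one result for each individual group.
groups = defaultdict(list)
for region, amount in rows:
    groups[region].append(amount)
per_group = {region: var_samp(vals) for region, vals in groups.items()}

print(overall)
print(per_group)
```

The `north` group holds a single row, so its result is `None`, matching the rule that fewer than two non-null data points yield NULL.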

Measuring the Variance of a Population

If your data represents the entire population for the variable, then use the VAR_POP function. For information, see “VAR_POP” on page 103.

As the sample size increases, the values for VAR_SAMP and VAR_POP approach the same number, but you should always use the more conservative VAR_SAMP calculation unless you are absolutely certain that your data constitutes the entire population for the variable.
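The convergence can be checked numerically. The sketch below uses Python's standard `statistics` module as a stand-in for the two SQL functions (`variance` divides by n - 1, `pvariance` divides by n); the data sets are arbitrary integers chosen for illustration.

```python
from statistics import pvariance, variance

# Hypothetical data: the integers 0 .. n-1 for a few sample sizes.
sizes = (2, 10, 100, 10_000)
ratios = {}
for n in sizes:
    data = list(range(n))
    sample_var = variance(data)        # divides by n - 1, like VAR_SAMP
    population_var = pvariance(data)   # divides by n, like VAR_POP
    ratios[n] = sample_var / population_var
    print(n, sample_var, population_var, ratios[n])
```

For any data set with nonzero variance, the ratio of the two results is n / (n - 1), which is 2 for two data points and approaches 1 as n grows.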

Result Type and Attributes

The data type, format, and title for VAR_SAMP are as follows.

Data type: REAL

Format:

  • If the operand is character, the format is the default format for FLOAT.
  • If the operand is numeric, date, or interval, the format is the same format as x.
  • If the operand is UDT, the format is the format for the data type to which the UDT is implicitly cast.

Support for UDTs

By default, Teradata Database performs implicit type conversion on a UDT argument that has an implicit cast that casts between the UDT and any of the following predefined types:

  • Numeric
  • Character
  • DATE
  • Interval

To define an implicit cast for a UDT, use the CREATE CAST statement and specify the AS ASSIGNMENT clause. For more information on CREATE CAST, see SQL Data Definition Language.

Implicit type conversion of UDTs for system operators and functions, including VAR_SAMP, is a Teradata extension to the ANSI SQL standard. To disable this extension, set the DisableUDTImplCastForSysFuncOp field of the DBS Control Record to TRUE. For details, see Utilities: Volume 1 (A-K).

For more information on implicit type conversion of UDTs, see Chapter 13: “Data Type Conversions.”

VAR_SAMP Window Function

For the VAR_SAMP window function that performs a group, cumulative, or moving computation, see “Window Aggregate Functions” on page 984.