15.00 - CORR - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)
Last Update
2018-09-24

CORR

Purpose  

Returns the Pearson product moment correlation coefficient of its arguments for all non-null data point pairs.

Syntax  

where:

 

Syntax element …

Specifies …

value_expression_2

a numeric expression to be correlated with a second numeric expression.

The expressions cannot contain any ordered analytical or aggregate functions.

value_expression_1

ANSI Compliance

This is ANSI SQL:2011 compliant.

Definition

The Pearson product-moment correlation coefficient is a measure of the linear association between variables. The boundary on the computed coefficient ranges from -1.00 to +1.00.

Note that high correlation does not imply a causal relationship between the variables.

The following table indicates the meaning of four extreme values for the coefficient of correlation between two variables.

 

IF the correlation coefficient has this value …

THEN the association between the variables …

-1.00

is perfectly linear, but inverse.

As the value for y varies, the value for x varies identically in the opposite direction.

0

does not exist and they are said to be uncorrelated.

+1.00

is perfectly linear.

As the value for y varies, the value for x varies identically in the same direction.

NULL

cannot be measured because there are no non‑null data point pairs in the data used for the computation.

Computation

The equation for computing CORR is defined as follows:

where:

 

This variable …

Represents …

x

value_expression_2

y

value_expression_1

Division by zero results in NULL rather than an error.

Result Type and Attributes

The data type, format, and title for CORR(y, x) are as follows.

 

Data Type

Format

Title

REAL

the default format for DECIMAL(7,6)

CORR(y,x)

For an explanation of the formatting characters in the format, see “Data Type Formats and Format Phrases” in SQL Data Types and Literals.

Support for UDTs

By default, Teradata Database performs implicit type conversion on UDT arguments that have implicit casts that cast between the UDTs and any of the following predefined types:

  • Numeric
  • Character
  • DATE
  • Interval
  • To define an implicit cast for a UDT, use the CREATE CAST statement and specify the AS ASSIGNMENT clause. For more information on CREATE CAST, see SQL Data Definition Language.

    Implicit type conversion of UDTs for system operators and functions, including CORR, is a Teradata extension to the ANSI SQL standard. To disable this extension, set the DisableUDTImplCastForSysFuncOp field of the DBS Control Record to TRUE. For details, see Utilities: Volume 1 (A-K).

    For more information on implicit type conversion of UDTs, see Chapter 13: “Data Type Conversions.”

    Combination With Other Functions

    CORR can be combined with ordered analytical functions in a SELECT list, QUALIFY clause, or ORDER BY clause. For information on ordered analytical functions, see Chapter 22: “Ordered Analytical / Window Aggregate Functions.”

    CORR cannot be combined with aggregate functions within the same SELECT list, QUALIFY clause, or ORDER BY clause.

    Example  

    This example uses the data from the HomeSales table.

    SalesPrice   NbrSold   Area
    ----------   -------   ---------
        160000       126   358711030
        180000       103   358711030
        200000        82   358711030
        220000        75   358711030
        240000        82   358711030
        260000        40   358711030
        280000        20   358711030

    Consider the following query.

       SELECT CAST (CORR(NbrSold,SalesPrice) AS DECIMAL (6,4)) 
       FROM HomeSales
       WHERE area = 358711030 
       AND SalesPrice Between 160000 AND 280000;
       
       CORR(NbrSold,SalesPrice)
       ------------------------
                         -.9543

    The result -.9543 suggests an inverse relationship between the variables. That is, for the area and sales price range specified in the query, the value for NbrSold increases as sales price decreases and decreases as sales price increases.

    CORR Window Function

    For the CORR window function that performs a group, cumulative, or moving computation, see “Window Aggregate Functions” on page 984.