Purpose
This function builds a sum-of-squares-and-cross-products (SSCP) matrix or other derived matrix type from a table in a Teradata Database. It does this by generating and executing the SQL to call the Teradata CALCMATRIX table operator provided in Teradata beginning with the 14.10 release. The results are stored either in a table or as a result set returned to the user. The purpose in building a matrix depends on the type of matrix built. For example, when a correlation matrix is built, it can be viewed to determine the correlations or relationships between the various columns in the matrix.
Refer to the Teradata reference manual SQL Functions, Operators, Expressions, and Predicates, B035-1145, for more detailed information about the CALCMATRIX table operator.
Syntax
call twm.td_analyze('matrix','database=twm_database;tablename=twm_tablename;columns=c1,c2;outputdatabase=twm_out_db;outputtablename=twm_out_tbl;matrixtype={COR|COV|CSSCP|ESSCP|SSCP};where=c1>0;groupby=c3,c4;nullhandling={ZERO|IGNORE};matrixoutput={COLUMNS|VARBYTE}; ');Required Parameters
- columns
- The input columns comprising the created matrix or matrices. The columns must reside in the table named with the tablename parameter, residing in the database named with the database parameter.
- database
- The database containing the input table.
- tablename
- The input table to build a matrix from.
Optional Parameters
- groupby
- If specified, group by columns divide the input table into parts, one for each combination of values in the group by columns. For each combination of values a separate matrix is built, though they are all stored in the same output table or result data set. The group by columns must reside in the table named with the tablename parameter. The default case is no group by columns.
- matrixoutput {COLUMNS|VARBYTE}
- Matrix output can either be returned as COLUMNS in a table or as VARBYTE values, one per column, in a reduced output table. The default is output returned as COLUMNS.
- matrixtype
- The following types of matrix can be built with this function. If not specified, a sum-of-squares-and-cross-products (SSCP) matrix is built.
- SSCP = sum-of-squares-and-cross-products matrix (the default)
- ESSCP = Extended-sum-of-squares-and-cross-products matrix
- CSSCP = Corrected-sum-of-squares-and-cross-products matrix
- COV = Covariance matrix
- COR = Correlation matrix
- nullhandling {ZERO|IGNORE}
- If a value in a selected column is NULL, the row that contains the NULL value is by default omitted from processing (nullhandling=IGNORE), or the value may be replaced in calculations with zero through the use of this parameter (nullhandling=ZERO).
- outputdatabase
- The database that contains the resulting matrix output table.
- outputtablename
- The name of the output table representing one or more matrices. If group by columns are specified, there is a matrix for each combination of group by column values.
- overwrite
-
When overwrite is set to true (default), the output tables are dropped before creating new ones.
- where
- An optional conditional expression may be specified with this parameter, limiting the amount of data used to build the matrix.
Examples
In this example, input columns age, years_with_bank, and nbr_children are used to build a 3-by-3 SSCP matrix. No permanent output table is created, just a result data set that is returned to the user.
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children');
In this example, a null handling parameter is added so that NULL values are replaced with zeros.
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;nullhandling=zero');
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;outputdatabase=twm_results;outputtablename=matrix1b'); select * from twm_results.matrix1b order by 1;
In this example, an SQL WHERE clause is requested to limit the amount of data passed to the CALCMATRIX table operator.
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;where=nbr_children >1');
In this example, an SQL GROUP BY clause is added to the requested matrix so that two matrices are built and returned in the same result set (for gender=’F’ and gender=’M’).
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;groupby=gender');
In the following examples, one matrix of each type is created and returned to the user as a result set.
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;matrixtype=COR');
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;matrixtype=COV');
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;matrixtype=CSSCP');
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;matrixtype=ESSCP');
call twm.td_analyze('matrix','database=twm_source;tablename=twm_customer;columns=age,years_with_bank,nbr_children;matrixtype=SSCP');