td_matrix_valib - Teradata Package for R Function Reference | 17.00

Teradata® Package for R Function Reference

Product: Teradata Package for R
Release Number: 17.00
Release Date: July 2021
Content Type: Programming Reference
Publication ID: B700-4007-090K
Language: English (United States)

Description

Matrix builds an extended sum-of-squares-and-cross-products (ESSCP) matrix, or another derived matrix type, from an object of type tbl_teradata. It does this using the Teradata CALCMATRIX table operator provided in Teradata Vantage. The purpose of building a matrix depends on the type of matrix built.
For example, a correlation matrix can be viewed to determine the correlations, or relationships, between the columns it was built from.

Usage

td_matrix_valib(data, columns, ...)

Arguments

data

Required Argument.
Specifies the input data from which to build the matrix.
Types: tbl_teradata

columns

Required Argument.
Specifies the name(s) of the column(s) used in building one or more matrices. Instead of column names, it also accepts one of the pre-defined strings listed under Permitted Values to select all columns or all numeric columns.
Note:

  Do not use the following column names, as these are reserved
  for use by the CALCMATRIX table operator:
    'rownum', 'rowname', 'c', or 's'.

Permitted Values:

  1. Name(s) of the columns in "data".

  2. Pre-defined strings:

    1. 'all' - all columns

    2. 'allnumeric' - all numeric columns

Types: character OR vector of Strings (character)

...

Specifies other arguments supported by the function as described in the 'Other Arguments' section.

Value

Function returns an object of class "td_matrix_valib", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.

Other Arguments

exclude.columns

Optional Argument.
Specifies the name(s) of the column(s) to exclude from the analysis when a column specifier such as 'all' or 'allnumeric' is used in the "columns" argument.
For convenience, when the "exclude.columns" argument is used, dependent variable and group by columns, if any, are automatically excluded as input columns and do not need to be included in this argument.
Types: character OR vector of Strings (character)

group.columns

Optional Argument.
Specifies the name(s) of the column(s) in the input tbl_teradata by which to group the data. If specified, the group by columns divide the input into parts, one for each combination of values in the group by columns. A separate matrix is built for each combination of values, though all matrices are stored in the same output.
Note:
Do not use the following column names, as these are reserved for use by the CALCMATRIX table operator:
'rownum', 'rowname', 'c', or 's'.
Types: character OR vector of Strings (character)
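
The grouping behavior described above can be pictured with a local, base R analogy. This is only a sketch under assumptions: the sample data is hypothetical, the covariance is computed by R rather than by CALCMATRIX, and the result is an R list, whereas td_matrix_valib stores all per-group matrices in a single output tbl_teradata.

```r
# Hypothetical local data standing in for the input table.
x <- data.frame(
  gender          = c("F", "M", "F", "M", "F", "M"),
  age             = c(25, 40, 33, 58, 47, 29),
  years_with_bank = c(1, 8, 4, 10, 6, 2)
)

# Columns the matrices are built on (the role of "columns").
value_cols <- c("age", "years_with_bank")

# Split the rows by the grouping column (the role of "group.columns")
# and build one covariance matrix per distinct group value.
by_group <- lapply(split(x[value_cols], x$gender), cov)

# One matrix per group value.
names(by_group)
by_group$F
```

With two grouping columns, each distinct pair of values would get its own matrix in the same way.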

matrix.output

Optional Argument.
Specifies the type of matrix output. Matrix output can either be returned as COLUMNS in an output tbl_teradata or as VARBYTE values, one per column, in a reduced output tbl_teradata.
Permitted Values: 'columns', 'varbyte'
Default Value: 'columns'
Types: character

type

Optional Argument.
Specifies the type of matrix to build.
Permitted Values:

  1. 'SSCP' - sum-of-squares-and-cross-products matrix

  2. 'ESSCP' - Extended-sum-of-squares-and-cross-products matrix

  3. 'CSSCP' - Corrected-sum-of-squares-and-cross-products matrix

  4. 'COV' - Covariance matrix

  5. 'COR' - Correlation matrix

Default Value: 'ESSCP'
Types: character
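
To make the matrix types concrete, the following base R sketch computes each one locally from their textbook definitions. It is an illustration, not a description of CALCMATRIX internals: the sample data is hypothetical, the ESSCP layout (a constant column of ones added to the data) is an assumption, and so is the use of the sample (n - 1) denominator for the covariance.

```r
# Hypothetical local data standing in for three numeric columns.
x <- data.frame(
  age             = c(25, 40, 33, 58),
  years_with_bank = c(1, 8, 4, 10),
  nbr_children    = c(0, 2, 1, 3)
)
m <- as.matrix(x)
n <- nrow(m)

# SSCP: raw sums of squares and cross products, t(m) %*% m.
sscp <- crossprod(m)

# ESSCP (assumed layout): SSCP of the data with a constant column of
# ones, so the extra row and column hold the row count and column sums.
esscp <- crossprod(cbind(const = 1, m))

# CSSCP: SSCP of the mean-centered columns.
csscp <- crossprod(scale(m, center = TRUE, scale = FALSE))

# COV: CSSCP divided by (n - 1), the sample covariance (assumed denominator).
cov_m <- csscp / (n - 1)

# COR: covariance rescaled to unit variances.
cor_m <- cov2cor(cov_m)
```

Each later matrix can be derived from the ESSCP values alone (count, sums, and cross products), which is a plausible reason for 'ESSCP' being the default type.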

handle.nulls

Optional Argument.
Specifies a way to treat null values in selected columns. When set to IGNORE, the row that contains the NULL value in a selected column is omitted from processing. When set to ZERO, the NULL value is replaced with zero (0) in calculations.
Permitted Values: 'IGNORE', 'ZERO'
Default Value: 'IGNORE'
Types: character
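
The difference between the two settings can be sketched locally in base R, with NA standing in for a database NULL. The data is hypothetical and the SSCP is computed by R, so this only illustrates the two policies as described above; it does not reproduce the function's output.

```r
# Hypothetical local data; NA plays the role of a database NULL.
x <- data.frame(
  age          = c(30, NA, 50, 40),
  nbr_children = c(2, 1, NA, 0)
)

# 'IGNORE': omit every row that has a NULL in any selected column.
ignored <- as.matrix(x[complete.cases(x), ])

# 'ZERO': keep every row and substitute 0 for each NULL.
zeroed <- as.matrix(x)
zeroed[is.na(zeroed)] <- 0

# Raw sums of squares and cross products under each policy.
sscp_ignore <- crossprod(ignored)
sscp_zero   <- crossprod(zeroed)

nrow(ignored)  # number of rows that contribute under 'IGNORE'
nrow(zeroed)   # number of rows that contribute under 'ZERO'
```

Under 'ZERO' the non-NULL values in a partially NULL row still contribute, so the two policies can give noticeably different matrices when NULLs are common.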

filter

Optional Argument.
Specifies the clause to filter rows selected for building the matrix.
For example,
filter = "cust_id > 0"
Types: character

Examples

# Notes:
#   1. To execute Vantage Analytic Library functions, set the option 'val.install.location' to
#      the name of the database where the Vantage Analytic Library functions are installed.
#   2. Datasets used in these examples can be loaded using Vantage Analytic Library installer.

# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")

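# Load tdplyr (Teradata functions) and dplyr (provides tbl()).
library(tdplyr)
library(dplyr)

# Get remote data source connection. This assumes a context has already
# been created, for example with td_create_context().
con <- td_get_context()$connection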

# Create an object of class "tbl_teradata".
df <- tbl(con, "customer")
print(df)

# Example 1: Build a 3-by-3 ESSCP matrix on input columns 'age', 'years_with_bank', 
#            and 'nbr_children'.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"))
# Print the results.
print(obj$result)

# Example 2: Build a 3-by-3 CSSCP matrix on input columns 'age', 'years_with_bank', 
#            and 'nbr_children' with null handling, where NULLs are replaced with zero.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"), 
                       handle.nulls="ZERO", type="CSSCP")
# Print the results.
print(obj$result)

# Example 3: Build a 3-by-3 COR matrix by limiting the input data by filtering rows.
#            Matrix is built on input columns 'age', 'years_with_bank', and 'nbr_children'.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"),
                       filter="nbr_children > 1", type="COR")
# Print the results.
print(obj$result)

# Example 4: Build two 3-by-3 COV matrices by grouping data on "gender" column.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"),
                       group.columns="gender", type="COV")
# Print the results.
print(obj$result)
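
The examples above do not exercise the 'allnumeric' specifier, "exclude.columns", or "matrix.output". The sketch below combines them using only arguments documented on this page. It rests on assumptions: that the 'customer' table has a numeric identifier column named 'cust_id' (suggested by the filter example, but not confirmed), and that wrapping the call in a helper is acceptable. The helper name build_numeric_cor is hypothetical and not part of tdplyr. Nothing is sent to Vantage until the helper is called with a tbl_teradata.

```r
# Hypothetical helper, not part of tdplyr: builds a correlation matrix
# on every numeric column except the identifier column(s) passed in.
build_numeric_cor <- function(data, id.columns = "cust_id") {
  td_matrix_valib(
    data            = data,
    columns         = "allnumeric",  # pre-defined string: all numeric columns
    exclude.columns = id.columns,    # assumed identifier column(s) to leave out
    type            = "COR",
    matrix.output   = "columns"      # the documented default, stated explicitly
  )
}

# With tdplyr loaded and an active Vantage connection:
# obj <- build_numeric_cor(df)
# print(obj$result)
```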