Description
The Matrix function builds an extended sum-of-squares-and-cross-products (ESSCP)
matrix or another derived matrix type from an object of class tbl_teradata. It does
this using the CALCMATRIX table operator provided in Teradata Vantage.
The purpose of building a matrix depends on the type of matrix built.
For example, a correlation matrix can be viewed to determine the
correlations, or relationships, between the columns it was built on.
Usage
td_matrix_valib(data, columns, ...)
Arguments
data |
Required Argument. Specifies the input tbl_teradata to build the matrix from. Types: tbl_teradata |
columns |
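Required Argument. Specifies the name(s) of the input column(s) to build the matrix on.
Note: Do not use the following column names, as these are reserved for use by the CALCMATRIX table operator: 'rownum', 'rowname', 'c', or 's'.
Permitted Values: name(s) of column(s) in "data", or a column specifier such as 'all' or 'allnumeric'.
Types: character OR vector of Strings (character) |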
... |
Specifies other arguments supported by the function as described in the 'Other Arguments' section. |
Value
Function returns an object of class "td_matrix_valib",
which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result.
Other Arguments
exclude.columns
Optional Argument.
Specifies the name(s) of the column(s) to exclude from the
analysis when a column specifier such as 'all' or 'allnumeric'
is used in the "columns" argument.
For convenience, when the "exclude.columns" argument is used,
dependent variable and group by columns, if any, are automatically
excluded as input columns and do not need to be included in
this argument.
Types: character OR vector of Strings (character)
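The interaction between a column specifier and "exclude.columns" can be pictured with a small local sketch. This is plain base R on a made-up data frame, not what CALCMATRIX executes; the data values and the variable names all_numeric and selected are assumptions for illustration only.

```r
# Hypothetical local stand-in for the remote "customer" table.
customers <- data.frame(
  cust_id      = c(101, 102, 103),
  gender       = c("F", "M", "F"),
  age          = c(25, 40, 33),
  nbr_children = c(0, 2, 1)
)

# Rough analogue of columns = "allnumeric": every numeric column.
all_numeric <- names(customers)[vapply(customers, is.numeric, logical(1))]

# Rough analogue of exclude.columns = "cust_id": drop the identifier column.
selected <- setdiff(all_numeric, "cust_id")
print(selected)
```

The identifier column is numeric, so it would be picked up by 'allnumeric' unless it is excluded explicitly.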
group.columns
Optional Argument.
Specifies the name(s) of the column(s) in the input tbl_teradata
to group by. If specified, the group by columns divide the input
into parts, one for each combination of values in the group by
columns. A separate matrix is built for each combination of
values, though all matrices are stored in the same output.
Note:
Do not use the following column names, as these are reserved
for use by the CALCMATRIX table operator:
'rownum', 'rowname', 'c', or 's'.
Types: character OR vector of Strings (character)
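As a rough local picture of the grouping behaviour described above, the following base R sketch splits a made-up data frame on one column and builds one covariance matrix per part. It only illustrates the idea; the data, the names parts and cov_by_gender, and the list-shaped result are assumptions and do not reflect the layout of the CALCMATRIX output.

```r
# Hypothetical local data; the real function works on a remote tbl_teradata.
customers <- data.frame(
  gender       = c("F", "M", "F", "M", "F"),
  age          = c(25, 40, 33, 58, 47),
  nbr_children = c(0, 2, 1, 3, 2)
)
numeric_cols <- c("age", "nbr_children")

# Split the rows by the grouping column, then build one matrix per part.
parts <- split(customers[numeric_cols], customers$gender)
cov_by_gender <- lapply(parts, function(part) cov(as.matrix(part)))
print(cov_by_gender)
```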
matrix.output
Optional Argument.
Specifies the type of matrix output. Matrix output can either be
returned as COLUMNS in an output tbl_teradata or as VARBYTE values,
one per column, in a reduced output tbl_teradata.
Permitted Values: 'columns', 'varbyte'
Default Value: 'columns'
Types: character
type
Optional Argument.
Specifies the type of matrix to build.
Permitted Values:
'SSCP' - Sum-of-squares-and-cross-products matrix
'ESSCP' - Extended-sum-of-squares-and-cross-products matrix
'CSSCP' - Corrected-sum-of-squares-and-cross-products matrix
'COV' - Covariance matrix
'COR' - Correlation matrix
Default Value: 'ESSCP'
Types: character
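To make the relationship between the matrix types concrete, here is a minimal base R sketch on a small made-up data frame. It is not the CALCMATRIX implementation. Two points are assumptions: that ESSCP follows the common convention of adding a constant column of ones, and that COV uses the sample (n - 1) divisor.

```r
# Hypothetical local data for illustration only.
customers <- data.frame(
  age             = c(25, 40, 33, 58),
  years_with_bank = c(1, 10, 4, 20),
  nbr_children    = c(0, 2, 1, 3)
)
X <- as.matrix(customers)
n <- nrow(X)

# SSCP: raw sums of squares and cross-products, t(X) %*% X.
sscp <- crossprod(X)

# ESSCP (assumed convention): SSCP of X augmented with a column of ones,
# so the matrix also carries the row count and the column sums.
esscp <- crossprod(cbind(const = 1, X))

# CSSCP: SSCP of the mean-centered columns.
centered <- sweep(X, 2, colMeans(X))
csscp <- crossprod(centered)

# COV: CSSCP divided by n - 1 (sample covariance; divisor is an assumption).
cov_mat <- csscp / (n - 1)

# COR: covariance rescaled so that the diagonal is 1.
cor_mat <- cov2cor(cov_mat)
print(round(cor_mat, 3))
```

Under these assumptions each type is derived from the previous one, which is why the description calls the other matrices "derived" types.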
handle.nulls
Optional Argument.
Specifies a way to treat null values in selected columns.
When set to IGNORE, the row that contains the NULL value in
a selected column is omitted from processing. When set to ZERO,
the NULL value is replaced with zero (0) in calculations.
Permitted Values: 'IGNORE', 'ZERO'
Default Value: 'IGNORE'
Types: character
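The two null-handling modes can give different matrices for the same input. The base R sketch below mimics them locally on a made-up data frame, with NA standing in for NULL; it illustrates the described behaviour only and is not the operator's code.

```r
# Hypothetical local data with missing values.
customers <- data.frame(
  age          = c(25, 40, NA, 58),
  nbr_children = c(0, NA, 1, 3)
)

# 'IGNORE': omit every row that has a missing value in a selected column.
ignored <- customers[complete.cases(customers), ]
sscp_ignore <- crossprod(as.matrix(ignored))

# 'ZERO': keep all rows and use 0 in place of each missing value.
zeroed <- customers
zeroed[is.na(zeroed)] <- 0
sscp_zero <- crossprod(as.matrix(zeroed))

print(sscp_ignore)
print(sscp_zero)
```

Here 'IGNORE' builds the matrix from two rows while 'ZERO' uses all four, so the sums of squares differ.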
filter
Optional Argument.
Specifies the clause to filter rows selected for building the matrix.
For example,
filter = "cust_id > 0"
Types: character
Examples
# Notes:
# 1. To execute Vantage Analytic Library functions, set the option 'val.install.location' to
# the name of the database where the Vantage Analytic Library functions are installed.
# 2. Datasets used in these examples can be loaded using the Vantage Analytic Library installer.
# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")
# Get remote data source connection.
con <- td_get_context()$connection
# Create an object of class "tbl_teradata".
df <- tbl(con, "customer")
print(df)
# Example 1: Build a 3-by-3 ESSCP matrix on input columns 'age', 'years_with_bank',
# and 'nbr_children'.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"))
# Print the results.
print(obj$result)
# Example 2: Build a 3-by-3 CSSCP matrix on input columns 'age', 'years_with_bank',
# and 'nbr_children' with null handling, where NULLs are replaced with zero.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"),
                       handle.nulls="ZERO", type="CSSCP")
# Print the results.
print(obj$result)
# Example 3: Build a 3-by-3 COR matrix by limiting the input data by filtering rows.
# Matrix is built on input columns 'age', 'years_with_bank', and 'nbr_children'.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"),
                       filter="nbr_children > 1", type="COR")
# Print the results.
print(obj$result)
# Example 4: Build two 3-by-3 COV matrices by grouping data on the "gender" column.
obj <- td_matrix_valib(data=df, columns=c("age", "years_with_bank", "nbr_children"),
                       group.columns="gender", type="COV")
# Print the results.
print(obj$result)