tdZScore | Teradata Package for R Function Reference | 17.00

Teradata® Package for R Function Reference

Product
Teradata Package for R
Release Number
17.00
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B700-4007-090K
Language
English (United States)

Description

ZScore rescales continuous numeric data in a more sophisticated way than a Rescaling transformation. In a Z-Score transformation, a numeric column is transformed into its Z-score based on the mean value and standard deviation of the data in the column: each column value is replaced by the number of standard deviations it lies from the mean value of the column. This standardization is useful in data mining as an alternative to a Rescaling transformation. The Z-Score transformation supports both numeric and date type input data.
Note:

  • An object of this class is passed to the "zscore" argument of td_transform_valib().
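The arithmetic behind the transformation described above can be sketched in base R. This is only an illustration on made-up values, not the Vantage implementation; in particular, it assumes the sample standard deviation returned by sd(), whereas the database function may use a different estimator.

```r
# Illustrative values only; not taken from the 'sales' example table.
x <- c(10, 20, 30, 40, 50)

# Z-score: distance of each value from the column mean, measured in
# standard deviations.
# Assumption: sd() is the sample standard deviation (n - 1 denominator);
# the Vantage implementation may use a different estimator.
z <- (x - mean(x)) / sd(x)
z
```

After the transformation the values have mean 0 and, under the assumption above, standard deviation 1.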

Usage

tdZScore(columns, datatype=NULL, fillna=NULL)

Arguments

columns

Required Argument.
Specifies the input column name(s) and, optionally, the corresponding output column name(s). In a key-value pair, the key is the name of the column to transform and the value is the name of the transformed output column. When only the key is specified, the output column takes the name of the input column.
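The key/value convention can be pictured with a small base R helper. The function resolve_output_names() below is hypothetical and written only for illustration; it is not part of the Teradata Package for R and covers only the two forms used in the examples on this page (a plain column name, and a named list).

```r
# Hypothetical helper (not part of tdplyr): returns a named character vector
# whose names are the input columns and whose values are the output columns.
resolve_output_names <- function(columns) {
  cols <- unlist(columns)
  if (is.null(names(cols))) {
    # Only keys given, e.g. "Feb": output name equals input name.
    setNames(as.character(cols), as.character(cols))
  } else {
    # Key/value pairs, e.g. list("Jan"="january"): key -> value.
    setNames(as.character(cols), names(cols))
  }
}

resolve_output_names("Feb")
resolve_output_names(list("Jan"="january", "Apr"="april"))
```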

datatype

Optional Argument.
Specifies the name of the intended datatype of the output column.
Intended data types for the output column can be specified using the permitted strings below:

----------------------------  -------------------------------
If intended SQL Data Type is  Permitted Value to be passed is
----------------------------  -------------------------------
bigint                        bigint
byteint                       byteint
char(n)                       char,n
date                          date
decimal(m,n)                  decimal,m,n
float                         float
integer                       integer
number(*)                     number
number(n)                     number,n
number(*,n)                   number,*,n
number(n,n)                   number,n,n
smallint                      smallint
time(p)                       time,p
timestamp(p)                  timestamp,p
varchar(n)                    varchar,n
----------------------------  -------------------------------

Notes:

  1. This argument is ignored if the "columns" argument is not used.

  2. char without a size is not supported.

  3. For number(*), the permitted value omits the *; pass "number".

Examples:

  1. If the intended datatype for the output column is "bigint", then pass the string "bigint" to the argument as shown below:
    datatype="bigint"

  2. If the intended datatype for the output column is "decimal(5,3)", then pass the string "decimal,5,3" to the argument as shown below:
    datatype="decimal,5,3"

Types: character
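The mapping in the table above follows a simple pattern: the parentheses are dropped and the type name and its parameters are joined with commas. A minimal base R sketch of that pattern is shown below. The helper to_permitted_datatype() is hypothetical, is not provided by the package, and performs no validation beyond the two notes listed above.

```r
# Hypothetical helper (not part of tdplyr): converts an SQL type such as
# "decimal(10,2)" into the string form expected by the "datatype" argument.
to_permitted_datatype <- function(sql_type) {
  x <- tolower(gsub("\\s+", "", sql_type))
  # Note 2: char without a size is not supported.
  if (x == "char") stop("char without a size is not supported")
  # Note 3: number(*) is passed without the *.
  if (x == "number(*)") return("number")
  # Replace the opening parenthesis with a comma and drop the closing one.
  x <- sub("\\(", ",", x)
  sub("\\)$", "", x)
}

to_permitted_datatype("decimal(10,2)")
to_permitted_datatype("number(*,2)")
to_permitted_datatype("bigint")
```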

fillna

Optional Argument.
Specifies whether null replacement/missing value treatment should be performed along with the Z-Score transformation. The output of tdFillNa() can be passed to this argument.
Note:

  • If the tdFillNa object was created with its "columns" and "datatype" arguments, those values are ignored. Only the null style information is taken from the object.

Types: tdFillNa
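To illustrate what combining null replacement with the Z-Score transformation means, the base R sketch below fills missing values with the column mode and then standardizes the result. It is a conceptual illustration only: the helper mode_value() is hypothetical, the data is made up, and the order of operations (fill first, then compute the mean and standard deviation) is an assumption rather than a statement about how Vantage evaluates the request.

```r
# Hypothetical helper: most frequent non-missing value
# (the first one encountered wins on ties).
mode_value <- function(x) {
  v <- x[!is.na(x)]
  ux <- unique(v)
  ux[which.max(tabulate(match(v, ux)))]
}

# Illustrative column with one missing value.
x <- c(1, 2, 2, NA, 5)

# Assumption: missing values are replaced before the statistics are computed.
filled <- ifelse(is.na(x), mode_value(x), x)

# Z-score of the filled column (sample standard deviation assumed).
z <- (filled - mean(filled)) / sd(filled)
z
```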

Value

An object of tdZScore class.

Examples

Notes:
# 1. To run any transformation, use the td_transform_valib() function.
# 2. Before doing so, set the option 'val.install.location' to the name of
#    the database where Vantage Analytic Library functions are installed.
# 3. Datasets used in these examples can be loaded using Vantage Analytic
#    Library installer.

# Get the current context/connection
con <- td_get_context()$connection

# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")

# Load example data.
loadExampleData("val_example", "sales")

# Create object(s) of class "tbl_teradata".
sales <- tbl(con, "sales")
sales

# Example 1: Rescaling with ZScore is carried out on 'Feb' column.
zs <- tdZScore(columns="Feb")

# Perform the ZScore transformation using td_transform_valib().
obj <- td_transform_valib(data=sales, zscore=zs)
obj$result

# Example 2: Rescaling with ZScore is carried out with multiple columns 'Jan'
#            and 'Apr' with null replacement using 'mode' style.
fn <- tdFillNa(style="mode")
zs <- tdZScore(columns=list("Jan"="january", "Apr"="april"), fillna=fn)

# Perform the ZScore transformation using td_transform_valib().
obj <- td_transform_valib(data=sales, zscore=zs, key.columns="accounts")
obj$result