Teradata Package for R Function Reference | 17.00 - 17.00 - tdDerive - Teradata Package for R

Teradata® Package for R Function Reference

Product
Teradata Package for R
Release Number
17.00
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B700-4007-090K
Language
English (United States)

Description

The Derive transformation requires the free-form transformation be specified as a formula using the following operators, arguments, and functions:

+, -, **, *, /, %, (, ), x, y, z, abs, exp, ln, log, sqrt


The arguments x, y, and z can only assume the value of an input column.
An implied multiply operator is automatically inserted when a number, argument (x, y, z), or parenthesis is immediately followed by an argument or parenthesis.
For example,
4x means 4x, xy means xy, and x(x+1) is equivalent to x*(x+1). An example formula for the quadratic equation is below.

formula="(-y+sqrt(y**2-4xz))/(2x)"

Note:

  • Output of this function is passed to "derive" argument of td_transform_valib().

Usage

tdDerive(formula, columns, out.column, datatype=NULL, fillna=NULL)

Arguments

formula

Required Argument.
Specifies the free-form transformation required for tdDerive().
Arithmetic formula can be specified as string using following operators, arguments, and functions:

+, -, **, *, /, %, (, ), x, y, z, abs, exp, ln, log, sqrt

Types: character

columns

Required Argument.
Specifies the names of the columns to use for formula.
Types: character OR vector of Strings (character)

out.column

Required Argument.
Specifies the name of the output column.
Types: character

datatype

Optional Argument.
Specifies the name of the intended datatype of the output column.
Intended data types for the output column can be specified using the permitted strings below:

------------------------------------ ---------------------------------------
If intended SQL Data Type is Permitted Value to be passed is
------------------------------------ ---------------------------------------
bigint bigint
byteint byteint
char(n) char,n
date date
decimal(m,n) decimal,m,n
float float
integer integer
number(*) number
number(n) number,n
number(*,n) number,*,n
number(n,n) number,n,n
smallint smallint
time(p) time,p
timestamp(p) timestamp,p
varchar(n) varchar,n

Notes:

  1. Argument is ignored if "columns" argument is not used.

  2. char without a size is not supported.

  3. number(*) does not include the * in its datatype format.

Examples:

  1. If intended datatype for the output column is 'bigint', then pass string 'bigint' to the argument as shown below:
    datatype="bigint"

  2. If intended datatype for the output column is 'decimal(3,5)', then pass string 'decimal,3,5' to the argument as shown below:
    datatype="decimal,3,5"

Types: character

fillna

Optional Argument.
Specifies whether the null replacement/missing value treatment should be performed with derive or not. Output of tdFillNa() can be passed to this argument.
Note:

  • If the FillNa object is created with its arguments "columns" and "datatype", then values passed in tdFillNa() arguments are ignored. Only nullstyle information is captured from the same.

Types: tdFillNa

Value

An object of tdDerive class.

Examples

Notes:
# 1. To run any transformation, user needs to use td_transform_valib()
#    function.
# 2. To do so set option 'val.install.location' to the database name
#    where Vantage analytic library functions are installed.
# 3. Datasets used in these examples can be loaded using Vantage Analytic
#    Library installer.

# Get the current context/connection
con <- td_get_context()$connection

# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")


# Create object(s) of class "tbl_teradata".
sales <- tbl(con, "sales")
sales

# Example: Includes multiple derive transformations
#          Derive transformation 1 is done with 3 variables, x, y, z, to
#          calculate the total sales for the first quarter for each account.
#          with td_transform_valib() function, when null
#          replacement is being done.

# Create tdFillNa object.
fn_1 <- tdFillNa(style="literal", value=0)

# Create tdDerive object.
dr_1 <- tdDerive(formula="x+y+z", columns=c("Jan", "Feb", "Mar"),
                 out.column="q1_sales", fillna=fn_1)

# Derive transformation 2 is done with 2 variables, x, y, to calculate
# the sale growth from the month of Jan to Feb.
fn_2 <- tdFillNa(style='median')
dr_2 <- tdDerive(formula="((y-x)/x)*100", columns=c("Jan", "Feb"),
                 out.column="feb_growth", fillna=fn_2, datatype='bigint')

# Perform the derive transformation using td_transform_valib() function.
obj <- td_transform_valib(data=sales, derive=c(dr_1, dr_2),
                          key.columns="accounts")
obj$result