Teradata Package for R Function Reference | 17.20 - ScaleFit - Teradata Package for R - Look here for syntax, methods and examples for the functions included in the Teradata Package for R.

Teradata® Package for R Function Reference

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Teradata Package for R
Release Number
17.20
Published
March 2024
Language
English (United States)
Last Update
2024-05-03
dita:id
TeradataR_FxRef_Enterprise_1720
Product Category
Teradata Vantage

ScaleFit

Description

td_scale_fit_sqle() function outputs statistics to input to td_scale_transform_sqle() function, which scales specified input tbl_teradata columns.

Usage

  td_scale_fit_sqle (
      data = NULL,
      target.columns = NULL,
      scale.method = NULL,
      miss.value = "KEEP",
      global.scale = FALSE,
      multiplier = '1',
      intercept = '0',
      ...
  )

Arguments

data

Required Argument.
Specifies the input tbl_teradata.
Types: tbl_teradata

target.columns

Required Argument.
Specifies the input tbl_teradata column(s) for which to output statistics.
The columns must contain numeric data in the range (-1e308, 1e308).
Types: character OR vector of Strings (character)

scale.method

Required Argument.
Specifies the scale method to be used for scaling. If one value is
provided, it applies to all target columns. If more than one value is specified, scale method values applies to target columns values in the order specified by the user.
td_scale_transform_sqle() function uses the location and scale values in the following formula to scale target column value X to scaled value X':
X' = intercept + multiplier * ((X - location)/scale) "intercept" and "multiplier" arguments determine intercept and multiplier.

In the table, Xmin, Xmax, and XMean are the minimum, maximum, and mean values of the target column.

---------------------------------------------------------------------------------
scale_method Description location scale
---------------------------------------------------------------------------------
MAXABS Maximum absolute value. 0 Maximum |X|
MEAN Mean. XMean 1
MIDRANGE Midrange. (Xmax+Xmin)/2 (Xmax-Xmin)/2
RANGE Range. Xmin Xmax-Xmin
RESCALE Rescale using specified lower bound, upper bound. See syntax after this table See table after RESCALE syntax See table after RESCALE syntax.
STD Standard deviation. XMean √(∑((Xi - Xmean)2)/ N) where N is count of valid values.
SUM Sum. 0 ΣX
USTD Unbiased standard deviation. XMean √(∑((Xi - Xmean)2)/ (N - 1)) where N is count of valid values.
----------------------------------------------------------------------------------------

RESCALE ( lb=lower_bound | ub=upper_bound | lb=lower_bound, ub=upper_bound )

----------------------------------------------------------------------------
location scale
----------------------------------------------------------------------------
Lower bound only Xmin - lower_bound 1
Upper bound only Xmax - upper_bound 1
Lower and upper bounds Xmin - (lower_bound/(upper_bound- lower_bound)) (Xmax - Xmin)/ (upper_bound- lower_bound)
----------------------------------------------------------------------------------

Permitted Values:

  • MAXABS

  • MEAN

  • MIDRANGE

  • RANGE

  • RESCALE

  • STD

  • SUM

  • USTD

Types: character OR vector of Strings (character)

miss.value

Optional Argument.
Specifies how to process NULL values in input.
Permitted Values:

  • KEEP: Keep NULL values.

  • ZERO: Replace each NULL value with zero.

  • LOCATION: Replace each NULL value with its location value.

Default Value: "KEEP"
Types: character

global.scale

Optional Argument.
Specifies whether all input columns are scaled to the same location
and scale. When set to FALSE, each input column is scaled separately.
Default Value: FALSE
Types: logical

multiplier

Optional Argument.
Specifies one or more multiplying factors(multiplier) to apply to the input data.
If only one multiplier is specified, it applies to all target columns.
If a list of multipliers is specified, each multiplier applies to the corresponding target column.
Default Value: "1"
Types: character OR vector of String (character)

intercept

Optional Argument.
Specifies one or more addition factors(intercept) incrementing the scaled results.
If only one intercept specified, it applies to all target columns.
If a list of intercepts is specified, each intercept applies to the corresponding target column.
The syntax of intercept is:
[-]number | min | mean | max where min, mean, and max are the global minimum, maximum, mean values in the corresponding columns. The function scales the values of min, mean, and max.
For example, if intercept is "- min" and multiplier is

according to this formula, where scaledmin is the scaled value:
X = -scaledmin + 1 * (X - location)/scale.
Default Value: "0"
Types: character OR vector of String (character)

...

Specifies the generic keyword arguments SQLE functions accept.
Below are the generic keyword arguments:

persist:
Optional Argument.
Specifies whether to persist the results of the function in a table or not.
When set to TRUE, results are persisted in a table; otherwise, results are garbage collected at the end of the session.
Default Value: FALSE
Types: logical

volatile:
Optional Argument.
Specifies whether to put the results of the function in a volatile table or not.
When set to TRUE, results are stored in a volatile table, otherwise not.
Default Value: FALSE
Types: logical

Function allows the user to partition, hash, order or local order the input data. These generic arguments are available for each argument that accepts tbl_teradata as input and can be accessed as:

  • "<input.data.arg.name>.partition.column" accepts character OR vector of Strings (character) (Strings)

  • "<input.data.arg.name>.hash.column" accepts character OR vector of Strings (character) (Strings)

  • "<input.data.arg.name>.order.column" accepts character OR vector of Strings (character) (Strings)

  • "local.order.<input.data.arg.name>" accepts logical

Note:
These generic arguments are supported by tdplyr if the underlying SQL Engine function supports, else an exception is raised.

Value

Function returns an object of class "td_scale_fit_sqle" which is a named list containing object of class "tbl_teradata".
Named list member(s) can be referenced directly with the "$" operator using the name(s):

  1. output

  2. output.data

Examples

  
    
    # Get the current context/connection.
    con <- td_get_context()$connection
    
    # Load the example data.
    loadExampleData("scalemap_example", "scale_housing")
    
    # Create tbl_teradata.
    scaling_house <- tbl(con, "scale_housing")
    
    # Check the list of available analytic functions.
    display_analytic_functions()
    
    # Example 1: Create statistics to scale "lotsize" with respect to
    #            mean value of the column.
    fit_obj <- td_scale_fit_sqle(data=scaling_house,
                                 target.columns="lotsize",
                                 scale.method="MEAN",
                                 miss.value="KEEP",
                                 global.scale=FALSE,
                                 multiplier="1",
                                 intercept="0")
    
    # Print the result.
    print(fit_obj$result)
    print(fit_obj$output.data)