TargetEncodingTransform
Description
The td_target_encoding_transform_sqle() function takes the input data
and the fit data generated by the td_target_encoding_fit_sqle() function,
and uses them to encode the categorical values.
Notes:
This function requires the UTF8 client character set.
This function does not support Pass-Through Characters (PTCs).
This function does not support KanjiSJIS or Graphic data types.
Usage considerations for td_target_encoding_transform_sqle():
Errors are generated in these cases:
* When the fit data does not meet the criteria.
* When a category from the input data is not found in the fit data and the "default.values" argument was not specified in the td_target_encoding_fit_sqle() function call.
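The second error case can be pictured with a small base R sketch. It is a conceptual illustration only, not how the SQL Engine computes or stores encodings: the fit_table data frame, its numbers, and the encode_column() helper are hypothetical and are not part of tdplyr.

```r
# Conceptual sketch in base R; NOT the SQL Engine implementation.
# Hypothetical fit table: one encoded value per category of a single column.
# The numbers are illustrative only.
fit_table <- data.frame(
  category = c("female", "male"),
  encoded  = c(0.74, 0.19),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up the encoded value for each category.
# Without a default value, an unseen category raises an error.
# With a default value, the unseen category receives that value.
encode_column <- function(values, fit, default_value = NULL) {
  idx <- match(values, fit$category)
  unseen <- is.na(idx)
  if (any(unseen) && is.null(default_value)) {
    stop("Category not found in fit data: ",
         paste(unique(values[unseen]), collapse = ", "))
  }
  encoded <- fit$encoded[idx]
  if (any(unseen)) {
    encoded[unseen] <- default_value
  }
  encoded
}

input <- c("male", "female", "unknown")

# A default value is supplied, so "unknown" is encoded as -1.
with_default <- encode_column(input, fit_table, default_value = -1)

# No default value is supplied, so the unseen category triggers an error.
no_default <- tryCatch(encode_column(input, fit_table),
                       error = function(e) conditionMessage(e))
```

In this sketch, supplying a default value plays the role that "default.values" plays at fit time: it decides whether an unseen category is tolerated or reported as an error.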
Usage
td_target_encoding_transform_sqle (
data = NULL,
object = NULL,
accumulate = NULL,
...
)
Arguments
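data |
Required Argument. Specifies the input data containing the categorical values to encode. |
object |
Required Argument. Specifies the fit data generated by the td_target_encoding_fit_sqle() function. |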
accumulate |
Optional Argument. Specifies the name(s) of the input column(s) to copy to the output.
Types: character OR vector of Strings (character) |
... |
Specifies the generic keyword arguments SQLE functions accept. Below
are the generic keyword arguments:
volatile:
Function allows the user to partition, hash, order or local order the input data. These generic arguments are available for each argument that accepts tbl_teradata as input and can be accessed as:
Note: |
Value
Function returns an object of class "td_target_encoding_transform_sqle"
which is a named list containing object of class "tbl_teradata".
Named list member(s) can be referenced directly with the "$" operator
using the name(s): result
Examples
# Get the current context/connection.
con <- td_get_context()$connection
# Load the example data.
loadExampleData("tdplyr_example", "titanic")
# Create tbl_teradata object.
data_input <- tbl(con, "titanic")
# Check the list of available analytic functions.
display_analytic_functions()
# Find the distinct values and counts for column 'sex' and 'embarked'.
categorical_summ <- td_categorical_summary_sqle(data=data_input,
                                                target.columns=c("sex", "embarked"))
# Get the distinct counts for 'sex' and 'embarked'. The category data should
# contain only two columns, named 'ColumnName' and 'CategoryCount'.
category_data <- categorical_summ$result
# Generate the required hyperparameters when "encoder.method" is 'CBM_BETA'.
TargetEncodingFit_out <- td_target_encoding_fit_sqle(data=data_input,
                                                     category.data=category_data,
                                                     encoder.method='CBM_BETA',
                                                     target.columns=c('sex', 'embarked'),
                                                     response.column='survived',
                                                     default.values=c(-1, -2))
# Example 1: Encode the columns 'sex' and 'embarked'.
TargetEncodingTransform_out <- td_target_encoding_transform_sqle(data=data_input,
                                                                 object=TargetEncodingFit_out,
                                                                 accumulate="passenger")
# Print the result.
print(TargetEncodingTransform_out$result)
# Alternatively, use the S3 transform() function to run the transform on the
# output of the td_target_encoding_fit_sqle() function.
TargetEncodingTransform_out <- transform(TargetEncodingFit_out,
                                         data=data_input,
                                         accumulate="passenger")
# Print the result.
print(TargetEncodingTransform_out$result)