ColumnTransformer
Description
The td_column_transformer_sqle()
function transforms the input data columns in a single operation. Provide only the FIT tbl_teradata objects generated by the td_fit_sqle analytic functions, and the function runs all transformations
that user require in a single operation.
The function performs the following transformations:
Scale Transform
Bincode Transform
Function Transform
NonLinearCombine Transform
OutlierFilter Transform
PolynomialFeatures Transform
RowNormalize Transform
OrdinalEncoding Transform
OneHotEncoding Transform
SimpleImpute Transform
User must create the FIT tbl_teradata before using the function and must be provided in the same order as in the training data sequence to transform the dataset. The FIT tbl_teradata can have maximum of 128 columns.
Usage
td_column_transformer_sqle (
input.data = NULL,
bincode.fit.data = NULL,
function.fit.data = NULL,
nonlinearcombine.fit.data = NULL,
onehotencoding.fit.data = NULL,
ordinalencoding.fit.data = NULL,
outlierfilter.fit.data = NULL,
polynomialfeatures.fit.data = NULL,
rownormalize.fit.data = NULL,
scale.fit.data = NULL,
simpleimpute.fit.data = NULL,
fillrowid.column.name = NULL,
...
)
Arguments
input.data |
Required Argument. |
bincode.fit.data |
Optional Argument. |
function.fit.data |
Optional Argument. |
nonlinearcombine.fit.data |
Optional Argument.
which are out of sequence.
"nonlinearcombine_fit_data1", "nonlinearcombine_fit_data2", so on.. |
onehotencoding.fit.data |
Optional Argument. |
ordinalencoding.fit.data |
Optional Argument. |
outlierfilter.fit.data |
Optional Argument. |
polynomialfeatures.fit.data |
Optional Argument. |
rownormalize.fit.data |
Optional Argument. |
scale.fit.data |
Optional Argument. |
simpleimpute.fit.data |
Optional Argument. |
fillrowid.column.name |
Optional Argument. |
... |
Specifies the generic keyword arguments SQLE functions accept. Below volatile: Function allows the user to partition, hash, order or local order the input data. These generic arguments are available for each argument that accepts tbl_teradata as input and can be accessed as:
Note: |
Value
Function returns an object of class "td_column_transformer_sqle"
which is a named list containing object of class "tbl_teradata".
Named list member(s) can be referenced directly with the "$" operator
using the name(s):result
Examples
# Get the current context/connection.
con <- td_get_context()$connection
# Load the example data.
loadExampleData("tdplyr_example", "titanic")
# Create tbl_teradata object.
titanic <- tbl(con, "titanic")
# Check the list of available analytic functions.
display_analytic_functions()
# Example 1: Perform Bincode transformation and OneHotEncoding
# transformation using td_bincode_fit_sqle() and
# td_column_transformer_sqle() function.
bin_code <- td_bincode_fit_sqle(data=titanic,
target.columns='age',
method.type='Equal-Width',
nbins=2,
label.prefix='label_prefix')
one_hot_encoding <- td_one_hot_encoding_fit_sqle(
data=titanic,
is.input.dense=TRUE,
target.column="sex",
categorical.values=c("male", "female"),
other.column="other")
ColumnTransformer_out <- td_column_transformer_sqle(
fillrowid.column.name="output_value",
input.data=titanic,
bincode.fit.data=bin_code$output,
onehotencoding.fit.data=one_hot_encoding$result)
# Print the result.
print(ColumnTransformer_out$result)
# Example 2: Perform multiple NonLinearCombineFit transformation
# using td_bincode_fit_sqle() function and
# td_column_transformer_sqle() function.
# Create multiple td_non_linear_combine_fit_sqle objects and
# pass those to td_column_transformer_sqle.
NLC_o1 <- td_non_linear_combine_fit_sqle(
data = titanic,
target.columns = c("sibsp", "parch"),
formula = "Y=(X0+X1+1)",
result.column = "total_cost")
NLC_o2 <- td_non_linear_combine_fit_sqle(
data = titanic,
target.columns = c( "fare"),
formula = "Y=(X0+1)",
result.column = "total_fare_cost2")
NLC_o3 <- td_non_linear_combine_fit_sqle(
data = titanic,
target.columns = c( "fare"),
formula = "Y=(X0+3)",
result.column = "total_fare_cost3")
ColumnTransformer_out2 <-
td_column_transformer_sqle(
input.data=titanic,
nonlinearcombine.fit.data=NLC_o1$result,
nonlinearcombine.fit.data.order.column=c('parch'),
nonlinearcombine.fit.data1=NLC_o2$result,
nonlinearcombine.fit.data1.order.column=c('fare'),
nonlinearcombine.fit.data2=NLC_o3$result)
# Print the result.
print(ColumnTransformer_out2$result)
# Example 3: Perform Bincode Fit transformation along with Oridnal Encoding
# Fit and One Hot Encoding Fit transformation using
# td_bincode_fit_sqle() and td_column_transformer_sqle() function.
# Create td_bincode_fit_sqle object along with td_ordinal_encoding_fit_sqle
# and td_one_hot_encoding_fit_sqle objects, and pass
# those to td_column_transformer_sqle.
bin_code <- td_bincode_fit_sqle(
data=titanic,
target.columns='age',
method.type='Equal-Width',
nbins=2,
label.prefix='label_prefix')
one_hot_encoding <- td_one_hot_encoding_fit_sqle(
data=titanic,
is.input.dense=TRUE,
target.column="embarked",
categorical.values=c("c", "s"),
other.column="other")
ord_enc_fit <- td_ordinal_encoding_fit_sqle(target.column='sex',
data=titanic)
ColumnTransformer_out3 <- td_column_transformer_sqle(
fillrowid.column.name="output_value",
input.data=titanic,
bincode.fit.data=bin_code$output,
onehotencoding.fit.data=one_hot_encoding$result,
ordinalencoding.fit.data=ord_enc_fit$result)
# Print the result.
print(ColumnTransformer_out3$result)