OneHotEncodingFit
Description
The td_one_hot_encoding_fit_sqle() function outputs a tbl_teradata of
attributes and categorical values. This output is passed as input to the
td_one_hot_encoding_transform_sqle() function, which encodes the values
as one-hot numeric vectors.
Notes:
This function requires the UTF8 client character set for UNICODE data.
This function does not support Pass Through Characters (PTCs).
This function does not support KanjiSJIS or Graphic data types.
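As a conceptual illustration of the encoding that the fit output prepares for, the sketch below one-hot encodes a character vector in plain base R. It does not call tdplyr or touch the database. The helper name one_hot_sketch and the sample vector are hypothetical, and the handling of unlisted values via a catch-all "other" column is an assumption based on the "other.column" argument.

```r
# Conceptual illustration only: plain base R, no Teradata connection needed.
# 'one_hot_sketch' is a hypothetical helper, not part of tdplyr.
one_hot_sketch <- function(x, categories, other = "other") {
  # Build one 0/1 indicator column per listed category.
  encoded <- vapply(categories, function(cat) as.integer(x == cat),
                    integer(length(x)))
  encoded <- matrix(encoded, nrow = length(x),
                    dimnames = list(NULL, categories))
  # Assumption: values outside the list are flagged in a catch-all column.
  other_flag <- as.integer(!(x %in% categories))
  out <- cbind(encoded, other_flag)
  colnames(out)[ncol(out)] <- other
  out
}

# Hypothetical sample values.
sex <- c("male", "female", "female", "unknown")
encoded_sex <- one_hot_sketch(sex, categories = c("male", "female"))
print(encoded_sex)
```

Each row of the result has exactly one column set to 1, which is the defining property of a one-hot vector.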
Usage
td_one_hot_encoding_fit_sqle (
data = NULL,
category.data = NULL,
target.column = NULL,
attribute.column = NULL,
value.column = NULL,
is.input.dense = NULL,
approach = "LIST",
categorical.values = NULL,
target.column.names = NULL,
categories.column = NULL,
other.column = "other",
category.counts = NULL,
target.attributes = NULL,
other.attributes = NULL,
...
)
Arguments
data |
Required Argument. |
category.data |
Optional Argument. |
target.column |
Required when "is.input.dense" is set to TRUE, disallowed otherwise.
Types: character OR vector of Strings (character) |
attribute.column |
Required when "is.input.dense" is set to FALSE, disallowed otherwise. |
value.column |
Required when "is.input.dense" is set to FALSE, disallowed otherwise. |
is.input.dense |
Required Argument. |
approach |
Optional Argument. |
categorical.values |
Required when "approach" is set to 'LIST' and a single value
is present in "target.column", optional otherwise.
Types: character OR vector of Strings (character) |
target.column.names |
Required when "category.data" is used, optional otherwise. |
categories.column |
Required when "category.data" is used, optional otherwise. |
other.column |
Optional when "is.input.dense" is set to TRUE, disallowed otherwise. |
category.counts |
Required when "category.data" is used or "approach" is
set to 'auto', optional otherwise. |
target.attributes |
Required when "is.input.dense" is set to FALSE, disallowed otherwise. |
other.attributes |
Optional when "is.input.dense" is set to FALSE, disallowed otherwise.
Types: character OR vector of Strings (character) |
... |
Specifies the generic keyword arguments SQLE functions accept:
persist:
volatile:
The function also allows the user to partition, hash, order, or local order the input data. These generic arguments are available for each argument that accepts a tbl_teradata as input and can be accessed as:
Note: |
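Several arguments above depend on whether "is.input.dense" is TRUE or FALSE. The base R sketch below shows one plausible reading of the two layouts, assuming that the sparse layout stores attribute names and their values in two columns, as the "attribute.column" and "value.column" argument names suggest. The data frame and its column names (id, attribute, value) are hypothetical and do not refer to any table shipped with tdplyr.

```r
# Hypothetical toy data; column names are illustrative only.
dense <- data.frame(id = c(1, 2),
                    sex = c("male", "female"),
                    embarked = c("S", "C"),
                    stringsAsFactors = FALSE)

# Dense layout: one row per observation, one column per attribute.
# Sparse layout (assumed): one row per (observation, attribute) pair.
attrs <- c("sex", "embarked")
sparse <- do.call(rbind, lapply(attrs, function(a) {
  data.frame(id = dense$id,
             attribute = a,
             value = dense[[a]],
             stringsAsFactors = FALSE)
}))

# Group the rows of each observation together.
sparse <- sparse[order(sparse$id), ]
rownames(sparse) <- NULL
print(sparse)
```

Under this reading, "target.column" names columns such as sex in the dense layout, while "target.attributes" names entries of the attribute column in the sparse layout.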
Value
Function returns an object of class "td_one_hot_encoding_fit_sqle",
which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result
Examples
# Get the current context/connection.
con <- td_get_context()$connection
# Load the example data.
loadExampleData("tdplyr_example", "titanic", "cat_table")
# Create tbl_teradata object.
titanic_data <- tbl(con, "titanic")
cat_data <- tbl(con, "cat_table")
# Check the list of available analytic functions.
display_analytic_functions()
# Example 1: Generate fit object to encode 'male' and 'female' values of column 'sex'.
fit_obj1 <- td_one_hot_encoding_fit_sqle(data=titanic_data,
                                         is.input.dense=TRUE,
                                         target.column="sex",
                                         categorical.values=c("male", "female"),
                                         other.column="other")
# Print the result.
print(fit_obj1$result)
# Example 2: Generate fit object to encode columns 'sex' and 'embarked' in dataset.
fit_obj2 <- td_one_hot_encoding_fit_sqle(data=titanic_data,
                                         is.input.dense=TRUE,
                                         approach="auto",
                                         target.column=c("sex", "embarked"),
                                         category.counts=c(2, 3),
                                         other.column="other")
# Print the result.
print(fit_obj2$result)
# Example 3: Generate fit object when "category.data" is used.
fit_obj3 <- td_one_hot_encoding_fit_sqle(data=titanic_data,
                                         category.data=cat_data,
                                         target.column.names="column_name",
                                         categories.column="category",
                                         is.input.dense=TRUE,
                                         target.column=c("sex", "embarked", "name"),
                                         category.counts=c(2, 4, 6),
                                         other.column="other")
# Print the result.
print(fit_obj3$result)
# Example 4: Generate fit object when "approach" is set to 'LIST'.
fit_obj4 <- td_one_hot_encoding_fit_sqle(data=titanic_data,
                                         is.input.dense=TRUE,
                                         approach="list",
                                         categorical.values=c("male", "female"),
                                         target.column=c("sex"),
                                         other.column="other")
# Print the result.
print(fit_obj4$result)
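Example 2 passes category.counts=c(2, 3) for the columns 'sex' and 'embarked'. Assuming that "category.counts" is meant to hold the number of distinct categories in each target column (an interpretation suggested by that example, not confirmed here), the base R sketch below shows how such counts could be derived from a local sample. The data frame sample_rows is hypothetical and only stands in for a few rows of the table.

```r
# Hypothetical local sample standing in for a few rows of the 'titanic' table.
sample_rows <- data.frame(
  sex = c("male", "female", "female", "male", "male"),
  embarked = c("S", "C", "Q", "S", NA),
  stringsAsFactors = FALSE
)

# Count distinct non-missing values in each column.
distinct_counts <- vapply(sample_rows,
                          function(col) length(unique(col[!is.na(col)])),
                          integer(1))
print(distinct_counts)
```

The resulting counts follow the column order of the data frame, so they would need to be listed in the same order as the entries of "target.column".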