OrdinalEncodingFit
Description
The td_ordinal_encoding_fit_sqle() function identifies distinct categorical
values from the input data or from a user-defined list, and returns those
values along with the ordinal value assigned to each category.
Notes:
Function requires the UTF8 client character set for UNICODE data.
Function does not support Pass Through Characters (PTCs).
Function does not support KanjiSJIS or Graphic data types.
The maximum number of unique categories in a particular column is 4000.
The maximum category length is 128 characters.
NULL categories are not encoded.
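To make the behaviour described above concrete, here is a minimal base-R sketch that runs locally, without a Vantage connection. It is not the SQLE implementation: ordinal_fit_sketch() is a hypothetical helper, the alphabetical ordering of categories is an assumption, and start.value is assumed to be the ordinal given to the first category. The two limits and the handling of NULL (NA in R) come from the notes above.

```r
# Local illustration only; ordinal_fit_sketch() is a hypothetical helper,
# not part of tdplyr.
ordinal_fit_sketch <- function(values, start.value = 0L) {
  # NULL (NA in R) categories are not encoded, so drop them first.
  # Assumption: distinct categories are ordered alphabetically.
  cats <- sort(unique(values[!is.na(values)]))
  # Limits taken from the notes: at most 4000 categories, 128 characters each.
  stopifnot(length(cats) <= 4000, all(nchar(cats) <= 128))
  data.frame(
    category = cats,
    # Assumption: start.value is the ordinal of the first category.
    ordinal_value = seq.int(from = start.value, length.out = length(cats)),
    stringsAsFactors = FALSE
  )
}

fit <- ordinal_fit_sketch(c("male", "female", NA, "female"))
print(fit)
```

The result is one row per distinct non-missing category, which is the shape of the mapping the function is described as generating.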
Usage
td_ordinal_encoding_fit_sqle (
data = NULL,
category.data = NULL,
target.column = NULL,
approach = "AUTO",
categories = NULL,
ordinal.values = NULL,
target.column.names = NULL,
categories.column = NULL,
ordinal.values.column = NULL,
start.value = 0,
default.value = NULL,
...
)
Arguments
data
    Required Argument.
category.data
    Optional Argument.
target.column
    Required Argument.
approach
    Optional Argument.
categories
    Optional Argument.
    Types: character OR vector of Strings (character)
ordinal.values
    Optional Argument. If the user specifies only the ordinal values, each
    ordinal value is associated with a categorical value. For example, if
    there are three categories and the ordinal values are 3, 4, 5, then the
    ordinal values are assigned to the respective categories.
    Types: integer OR vector of integers
target.column.names
    Required when "category.data" is used, optional otherwise.
categories.column
    Required when "category.data" is used, optional otherwise.
ordinal.values.column
    Required when "category.data" is used, optional otherwise.
start.value
    Optional Argument.
default.value
    Optional Argument.
...
    Specifies the generic keyword arguments that SQLE functions accept, such
    as "volatile". The function also allows the user to partition, hash,
    order, or local order the input data. These generic arguments are
    available for each argument that accepts a tbl_teradata as input.
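The pairing described for "ordinal.values" can be pictured with a short local sketch in base R. The category names below are hypothetical; only the ordinal values 3, 4, 5 come from the argument description, and the position-by-position pairing is how "respective categories" is read here.

```r
# Illustration only: pair each category with the ordinal value at the
# same position. Category names are hypothetical.
categories <- c("low", "medium", "high")
ordinal.values <- c(3L, 4L, 5L)

# The two vectors must line up one-to-one for the pairing to make sense.
stopifnot(length(categories) == length(ordinal.values))

mapping <- data.frame(
  category = categories,
  ordinal_value = ordinal.values,
  stringsAsFactors = FALSE
)
print(mapping)
```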
Value
Function returns an object of class "td_ordinal_encoding_fit_sqle",
which is a named list containing objects of class "tbl_teradata".
Named list members can be referenced directly with the "$" operator
using the names:
    result
    output.data
Examples
# Get the current context/connection.
con <- td_get_context()$connection
# Load the example data.
loadExampleData("tdplyr_example", "titanic", "cat_table")
# Create tbl_teradata object.
titanic <- tbl(con, "titanic")
cat_data <- tbl(con, "cat_table")
# Check the list of available analytic functions.
display_analytic_functions()
# Example 1: Identify distinct categorical values from the input.
ordinal_encodingfit_res_1 <- td_ordinal_encoding_fit_sqle(
    target.column='sex',
    data=titanic)
# Print the result.
print(ordinal_encodingfit_res_1$result)
# Example 2: Identify distinct categorical values from the input and
#            return the distinct categorical values along with the ordinal
#            value for each category.
ordinal_encodingfit_res_2 <- td_ordinal_encoding_fit_sqle(
    target.column='sex',
    approach='LIST',
    categories=c('category0', 'category1'),
    ordinal.values=c(1, 2),
    start.value=0,
    default.value=-1,
    data=titanic)
# Print the result.
print(ordinal_encodingfit_res_2$result)
# Example 3: Assign ordinal values to the "target.column" columns using
#            the dataset passed as "category.data".
ordinal_encodingfit_res_3 <- td_ordinal_encoding_fit_sqle(
    target.column=c('name', 'sex', 'ticket', 'cabin', 'embarked'),
    category.data=cat_data,
    approach='LIST',
    target.column.names="column_name",
    categories.column="category",
    ordinal.values.column="ordinal_value",
    default.value=c(-1, -10, -15, 20, 0),
    data=titanic)
# Print the result.
print(ordinal_encodingfit_res_3$result)
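
Example 3 reads its categories from a table rather than from inline vectors. As a rough picture of what such a table might hold, the local data frame below uses the three column names passed in Example 3 ("column_name", "category", "ordinal_value"). The rows are made up for illustration and are not the contents of "cat_table".

```r
# Hypothetical rows; only the three column names come from Example 3.
cat_sketch <- data.frame(
  column_name   = c("sex", "sex", "embarked", "embarked", "embarked"),
  category      = c("female", "male", "C", "Q", "S"),
  ordinal_value = c(1L, 2L, 1L, 2L, 3L),
  stringsAsFactors = FALSE
)

# One category-to-ordinal lookup per target column.
per_column <- split(cat_sketch[c("category", "ordinal_value")],
                    cat_sketch$column_name)
print(per_column)
```

Each row ties one category of one target column to its ordinal value, which is why the function needs all three column-name arguments when "category.data" is used.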