Description
tdBinning() lets the user perform bin coding, which replaces a continuous
numeric column with a categorical one holding ordinal values (that is,
numeric categorical values whose order is meaningful). Binning uses the
same techniques as Histogram analysis and lets you choose between:
equal-width bins
equal-width bins with a user-specified minimum and maximum range
bins with a user-specified width
evenly distributed bins
bins with user-specified boundaries
If the minimum and maximum are specified, all values less than the minimum
are put into bin 0, while all values greater than the maximum are put into
bin N+1, where N is the number of bins. The same is true when the boundary
option is specified.
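The bin 0 and bin N+1 convention can be illustrated locally with a minimal base R sketch. The helper `bin_with_bounds()` is hypothetical and not part of tdplyr or the Vantage Analytic Library. It assumes equal-width bins between the two bounds and, as a further assumption, treats a value exactly equal to the maximum as belonging to the last regular bin; the source does not state how values that fall exactly on a boundary are handled in the database.

```r
# Hypothetical helper (not part of tdplyr): assigns equal-width bin numbers
# locally, following the bin 0 / bin N+1 convention described above.
bin_with_bounds <- function(x, lbound, ubound, nbins) {
  # nbins + 1 equally spaced cut points from lbound to ubound.
  breaks <- seq(lbound, ubound, length.out = nbins + 1)
  # findInterval() returns 0 for values below the first cut point and
  # nbins + 1 for values above the last one. rightmost.closed = TRUE keeps
  # x == ubound in bin nbins (an assumption made for this sketch).
  findInterval(x, breaks, rightmost.closed = TRUE)
}

values <- c(-5, 0, 2.5, 9.9, 10, 12)
result <- bin_with_bounds(values, lbound = 0, ubound = 10, nbins = 5)
print(result)
# 0 1 2 5 5 6
```

Here -5 falls below the minimum and lands in bin 0, while 12 exceeds the maximum and lands in bin 6, that is, N+1 with N = 5.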
tdBinning() supports numeric and date type columns. If date values are
entered, the keyword DATE must precede the date value, and the value must
not be enclosed in single quotes.
Note:
Output of this function is passed to the "bins" argument of
td_transform_valib().
Usage
tdBinning(columns, datatype=NULL, style="bins",
          value=10, lbound=NULL, ubound=NULL,
          fillna=NULL, ...)
Arguments
columns
Required Argument.
datatype
Optional Argument.
Types: character
style
Optional Argument.
Default Value: 'bins'
value
Optional Argument.
Default Value: 10
lbound
Optional Argument.
Types: integer, numeric, character
ubound
Optional Argument.
Types: integer, numeric, character
fillna
Optional Argument.
Types: tdFillNa
...
Optional Argument. Required if style is 'boundaries'.
Types: integer, numeric, character
Value
An object of tdBinning class.
Examples
# Notes:
#   1. To run any transformation, use the td_transform_valib() function.
#   2. To do so, set the option 'val.install.location' to the name of the
#      database where the Vantage Analytic Library functions are installed.
#   3. The datasets used in these examples can be loaded using the Vantage
#      Analytic Library installer.
# Get the current context/connection
con <- td_get_context()$connection
# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")
# Create object(s) of class "tbl_teradata".
ibm_stock <- tbl(con, "ibm_stock")
ibm_stock
# Example 1: Binning is carried out with the 'bins' style, i.e. equal-width
#            binning, with 5 bins. Null replacement is also combined with
#            binning. The "key.columns" argument must be used with the
#            td_transform_valib() function when null replacement is being
#            done.
# Perform the binning transformation using
# td_transform_valib() function from Vantage Analytic
# Library.
# Create tdFillNa object.
fn <- tdFillNa(style="literal", value=0)
# Create tdBinning object.
bins <- tdBinning(style="bins", value=5, columns="stockprice", fillna=fn)
# Perform the binning transformation using td_transform_valib() function.
obj <- td_transform_valib(data=ibm_stock, bins=bins, key.columns="id")
obj$result
# Example 2: Binning is carried out with multiple styles.
# Perform the binning transformation using
# td_transform_valib() function from Vantage Analytic
# Library.
# 'binswithboundaries' style:
#     Equal-width bins with a user-specified minimum and maximum range on the
#     'period' column. The resulting output column keeps the same column
#     name. The number of bins created is 5.
# Create tdBinning object.
bins_1 <- tdBinning(style="binswithboundaries", value=5,
                    lbound="DATE 1962-01-01",
                    ubound="DATE 1962-06-01",
                    columns="period")
# 'boundaries' style:
#     Bins created with user-specified boundaries on the 'period' column.
#     The resulting column is named 'period2'. Three boundaries are
#     specified with the arguments "b1", "b2" and "b3".
#     When using this style, keyword argument names must start with
#     'b' and must be in sequence: b1, b2, ..., bN.
# Create tdBinning object.
bins_2 <- tdBinning(style="boundaries", b1="DATE 1962-01-01",
                    b2="DATE 1962-06-01", b3="DATE 1962-12-31",
                    columns=list("period"="period2"))
# Perform the binning transformation using td_transform_valib() function.
obj <- td_transform_valib(data=ibm_stock, bins=c(bins_1, bins_2))
obj$result
# Example 3: Binning is carried out with two styles, 'quantiles' and
#            'width'.
# Perform the binning transformation using
# td_transform_valib() function from Vantage Analytic
# Library.
# 'quantiles' style:
#     Evenly distributed bins on the 'stockprice' column. The resulting
#     output returns the column with the name 'stockprice_q'. The number of
#     quantiles considered here is 4.
# Create tdBinning object.
bins_1 <- tdBinning(style="quantiles", value=4,
                    columns=list("stockprice"="stockprice_q"))
# 'width' style:
#     Bins with a user-specified width on the 'stockprice' column.
#     The resulting output returns the column with the name 'stockprice_w'.
#     The width considered for binning is 5.
# Create tdBinning object.
bins_2 <- tdBinning(style="width", value=5,
                    columns=list("stockprice"="stockprice_w"))
# Perform the binning transformation using td_transform_valib() function.
obj <- td_transform_valib(data=ibm_stock, bins=c(bins_1, bins_2))
obj$result
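To build intuition for how the 'quantiles' and 'width' styles differ, the following base R sketch approximates both on a small local vector. It is illustrative only and does not call tdplyr or Vantage. The sample `prices` vector is made up, and the choice to start the fixed-width bins at the minimum value is an assumption, since the source does not say where the database anchors them.

```r
# Illustrative only: local base R approximations, not the Vantage
# implementation. 'prices' is made-up sample data.
prices <- c(10, 12, 15, 18, 21, 25, 30, 42)

# 'quantiles'-like: 4 bins that each hold roughly the same number of values.
q_breaks <- quantile(prices, probs = seq(0, 1, length.out = 5))
q_bin <- cut(prices, breaks = q_breaks, include.lowest = TRUE, labels = FALSE)

# 'width'-like: bins of fixed width 5, counted from the minimum value
# (the anchoring at the minimum is an assumption for this sketch).
w_bin <- floor((prices - min(prices)) / 5) + 1

print(q_bin)
# 1 1 2 2 3 3 4 4
print(w_bin)
# 1 1 2 2 3 4 5 7
```

The quantile-style bins stay evenly populated even though the values are skewed, whereas the fixed-width bins follow the value scale and can leave some bins empty (bin 6 here).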