Teradata Package for R Function Reference | 17.00 - td_decision_tree_predict_valib

Product: Teradata Package for R
Release Number: 17.00
Published: July 2021
Language: English (United States)
Last Update: 2023-08-08
dita:id: B700-4007
NMT: no
Product Category: Teradata Vantage

Gain Ratio Decision Tree Predict

Description

The function predicts the values of the dependent variable in test data, using the model created by td_decision_tree_valib(). The function also generates two profile outputs containing the details about the decisions made during the prediction. Apart from the score and profile outputs, the function optionally generates:

  1. Confidence factors

  2. Targeted Binary Confidence
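
The two optional outputs can be illustrated with a small base R sketch. It assumes, purely for illustration, that a leaf's confidence factor is the share of training observations in the leaf that carry the leaf's predicted class, and that the targeted binary confidence re-expresses this share relative to one chosen target value. The Vantage Analytic Library may compute these values differently, and the `leaf_counts` data and variable names below are hypothetical.

```r
# Hypothetical class counts per leaf node from training data (illustration only).
leaf_counts <- data.frame(
  leaf = c(1, 2, 3),
  F    = c(9, 2, 5),
  M    = c(1, 8, 15)
)

# Predicted class per leaf: the majority class (assumption).
classes <- c("F", "M")
counts <- as.matrix(leaf_counts[, classes])
predicted <- classes[max.col(counts, ties.method = "first")]

# Confidence factor (assumed): share of the leaf's observations
# that belong to the predicted class.
confidence <- apply(counts, 1, max) / rowSums(counts)

# Targeted binary confidence (assumed) for target value "F":
# the confidence itself when the leaf predicts "F", otherwise its complement.
target <- "F"
targeted_confidence <- ifelse(predicted == target, confidence, 1 - confidence)

print(data.frame(leaf = leaf_counts$leaf, predicted, confidence, targeted_confidence))
```

Under these assumptions, a leaf that predicts the non-target class with high confidence receives a low targeted confidence, which is why the targeted form applies only to a dependent variable with two values.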

Usage

td_decision_tree_predict_valib(model, data, ...)

Arguments

model

Required Argument.
Specifies an object of class tbl_teradata generated by the td_decision_tree_valib() function, containing the decision tree model in PMML format that is used to score the data.
Types: tbl_teradata

data

Required Argument.
Specifies the input data containing the columns to analyse, representing the dependent and independent variables in the analysis.
Types: tbl_teradata

...

Specifies other arguments supported by the function as described in the 'Other Arguments' section.

Value

The function returns an object of class "td_decision_tree_predict_valib", which is a named list containing objects of class "tbl_teradata".
Named list members can be referenced directly with the "$" operator using names:

  1. result

  2. profile.result.1

  3. profile.result.2

Other Arguments

include.confidence

Optional Argument.
Specifies whether the result output tbl_teradata contains a column indicating how likely it is, for a particular leaf node on the tree, that the prediction is correct. If not specified or set to FALSE, the confidence column is not created.
Note: This argument cannot be specified along with "targeted.value" argument.
Default Value: FALSE
Types: logical

index.columns

Optional Argument.
Specifies one or more columns to use as the primary index of the result output tbl_teradata. By default, the primary index columns of the result output tbl_teradata are the primary index columns of the input tbl_teradata "data". The columns specified in this argument must form a unique key for the result output tbl_teradata. Otherwise, there is more than one score for a given observation.
Types: character OR list of Strings (character)
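
Because the index columns must uniquely identify each row of the result, it can help to check candidate columns before scoring. The sketch below does this on a small local data.frame in base R. The helper `is_unique_key` and the sample data are hypothetical, and for a tbl_teradata the equivalent check would need to run in the database rather than locally.

```r
# Return TRUE when the given columns uniquely identify every row
# of a local data.frame.
is_unique_key <- function(df, columns) {
  stopifnot(is.data.frame(df), all(columns %in% names(df)))
  anyDuplicated(df[, columns, drop = FALSE]) == 0
}

# Hypothetical sample data (illustration only).
customers <- data.frame(
  cust_id    = c(101, 102, 103, 104),
  state_code = c("CA", "CA", "NY", "NY"),
  city_name  = c("San Diego", "Los Angeles", "New York", "New York")
)

# A single identifier column with no repeats is a valid candidate.
print(is_unique_key(customers, "cust_id"))

# The last two rows share state and city, so this pair is not a unique key.
print(is_unique_key(customers, c("state_code", "city_name")))
```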

response.column

Optional Argument.
Specifies the name of the predicted value column. If this argument is not specified, the name of the dependent column in "data" tbl_teradata is used.
Types: character

accumulate

Optional Argument.
Specifies one or more columns from the "data" tbl_teradata that can be passed to the result output tbl_teradata.
Types: character OR list of Strings (character)

targeted.value

Optional Argument.
Specifies whether the result output tbl_teradata contains a column indicating how likely it is, for a particular leaf node and targeted value of a predicted result with only two values, that the prediction is correct.
Note: This argument cannot be specified along with "include.confidence" argument.
Permitted values: One of the values in the dependent column used in the argument "response.column".
Types: character
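
The notes on "include.confidence" and "targeted.value" state that the two arguments cannot be combined. A minimal sketch of a pre-call check in base R follows. The helper `check_confidence_args` is hypothetical and not part of the Teradata Package for R, and the function itself may report the conflict differently.

```r
# Hypothetical helper: stop early when both confidence options are requested.
check_confidence_args <- function(include.confidence = FALSE, targeted.value = NULL) {
  if (isTRUE(include.confidence) && !is.null(targeted.value)) {
    stop("Specify either 'include.confidence' or 'targeted.value', not both.")
  }
  invisible(TRUE)
}

# Each option on its own passes the check.
check_confidence_args(include.confidence = TRUE)
check_confidence_args(targeted.value = "F")

# Requesting both raises an error, captured here so the script continues.
conflict <- tryCatch(
  check_confidence_args(include.confidence = TRUE, targeted.value = "F"),
  error = function(e) conditionMessage(e)
)
print(conflict)
```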

Examples


# Notes:
#   1. To execute Vantage Analytic Library functions, set option 'val.install.location' to
#      the database name where Vantage analytic library functions are installed.
#   2. Datasets used in these examples can be loaded using Vantage Analytic Library installer.

# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")

# Get remote data source connection.
con <- td_get_context()$connection

# Create an object of class "tbl_teradata".
df <- tbl(con, "customer_analysis")
print(df)

# Run td_decision_tree_valib() on columns "age", "income" and "nbr_children", with
# dependent variable "gender", to build the model used for scoring.
dt_obj <- td_decision_tree_valib(data=df,
                                 columns=c("age", "income", "nbr_children"),
                                 response.column="gender",
                                 algorithm="gainratio",
                                 binning=FALSE,
                                 max.depth=5,
                                 num.splits=2,
                                 pruning="gainratio")


# Example 1: Score the data and include a confidence factor for each prediction.
obj <- td_decision_tree_predict_valib(data=df,
                                      model=dt_obj$result,
                                      include.confidence=TRUE,
                                      accumulate=c("city_name", "state_code"))
# Print the results.
print(obj$result)
print(obj$profile.result.1)
print(obj$profile.result.2)

# Example 2: Score the data and include the targeted binary confidence for the
#            value "F", using S3 predict().
obj <- predict(object=dt_obj, 
               data=df,
               targeted.value="F",
               accumulate=c("city_name", "state_code"))
               
# Print the results.
print(obj$result)
print(obj$profile.result.1)
print(obj$profile.result.2)