Teradata® Package for R Function Reference

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for R
Release Number: 17.20
Published: March 2024
Last Edition: 2024-05-03
Product Category: Teradata Vantage

DecisionForestPredict

Description

The td_decision_forest_predict_mle_sqle() function uses the model generated by the td_decision_forest_mle() function to generate predictions on a response variable for a test set of data.
The model can be stored in either a tbl_teradata or a td_decision_forest_mle object.

Usage

  td_decision_forest_predict_mle_sqle(
      object = NULL,
      newdata = NULL,
      id.column = NULL,
      detailed = FALSE,
      terms = NULL,
      output.prob = FALSE,
      output.responses = NULL,
      ...
  )

Arguments

object

Required Argument.
Specifies the model to use: either the tbl_teradata that contains the model data generated by the td_decision_forest_mle() function, or the object returned by td_decision_forest_mle().
Types: tbl_teradata or object of td_decision_forest_mle

newdata

Required Argument.
Specifies the tbl_teradata containing the attribute names and the values.
Types: tbl_teradata

id.column

Required Argument.
Specifies a column containing a unique identifier for each test point in the test set.
Types: character

detailed

Optional Argument.
Specifies whether to output detailed information about the forest trees; that is, the decision tree and the specific tree information, including task index and tree index for each tree.
Default Value: FALSE
Types: logical

terms

Optional Argument.
Specifies the names of the newdata columns to copy to the output tbl_teradata.
Types: character OR vector of Strings (character)

output.prob

Optional Argument.
Specifies whether to output probabilities.
Default Value: FALSE
Types: logical

output.responses

Optional Argument.
Specifies responses for which to output probabilities.
Types: character OR vector of Strings (character)

...

Specifies the generic keyword arguments SQLE functions accept.
Below are the generic keyword arguments:
persist:
Optional Argument.
Specifies whether to persist the results of the function in a table or not.
When set to TRUE, results are persisted in a table; otherwise, results are garbage collected at the end of the session.
Default Value: FALSE
Types: logical

volatile:
Optional Argument.
Specifies whether to put the results of the function in a volatile table or not.
When set to TRUE, results are stored in a volatile table, otherwise not.
Default Value: FALSE
Types: logical

The function allows the user to partition, hash, order, or local order the input data. These generic arguments are available for each argument that accepts a tbl_teradata as input and can be accessed as:

  • "<input.data.arg.name>.partition.column" accepts character OR vector of Strings (character)

  • "<input.data.arg.name>.hash.column" accepts character OR vector of Strings (character)

  • "<input.data.arg.name>.order.column" accepts character OR vector of Strings (character)

  • "local.order.<input.data.arg.name>" accepts logical

Note:
These generic arguments are supported by tdplyr only if the underlying SQL Engine function supports them; otherwise, an exception is raised.
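
The naming pattern above can be made concrete with a minimal sketch. The helper generic_arg_names() is hypothetical (it is not part of tdplyr) and only expands the "<input.data.arg.name>" placeholder for a given input argument such as "newdata"; it needs no database connection.

```r
# Hypothetical helper (not part of tdplyr): expands the generic argument
# naming pattern for one input argument, e.g. "newdata" or "object".
generic_arg_names <- function(input.arg) {
  c(
    partition   = paste0(input.arg, ".partition.column"),
    hash        = paste0(input.arg, ".hash.column"),
    order       = paste0(input.arg, ".order.column"),
    local.order = paste0("local.order.", input.arg)
  )
}

# Names of the generic arguments that correspond to the "newdata" input.
newdata.args <- generic_arg_names("newdata")
print(newdata.args)
```

Under this pattern, a call could pass, for instance, newdata.order.column = "sn" alongside the regular arguments. Whether td_decision_forest_predict_mle_sqle() accepts a particular generic argument depends on the underlying SQL Engine function and is not stated here.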

Value

The function returns an object of class "td_decision_forest_predict_mle_sqle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result

Examples

  
    
    # Get the current context/connection.
    con <- td_get_context()$connection

    # Load example data.
    loadExampleData("decisionforestpredict_example", "housing_test", "housing_train")

    # Create object(s) of class "tbl_teradata".
    housing_test <- tbl(con, "housing_test")
    housing_train <- tbl(con, "housing_train")

    # Example 1 -
    # First train on the data, i.e., create a decision forest model.
    formula <- (homestyle ~ driveway + recroom + fullbase + gashw + airco + prefarea + price
                + lotsize + bedrooms + bathrms + stories + garagepl)

    decision_forest_model <- td_decision_forest_mle(data=housing_train,
                                                    formula = formula,
                                                    tree.type="classification",
                                                    ntree=50,
                                                    tree.size=100,
                                                    nodesize=1,
                                                    variance=0,
                                                    max.depth=12,
                                                    maxnum.categorical=20,
                                                    mtry=3,
                                                    mtry.seed=100,
                                                    seed=100
                                                    )

    # Run predict on the output of td_decision_forest_mle() function.
    td_decision_forest_predict_out <- td_decision_forest_predict_mle_sqle(
                                       object = decision_forest_model,
                                       newdata = housing_test,
                                       id.column = "sn",
                                       detailed = FALSE,
                                       terms = c("homestyle")
                                       )

    # Print the result.
    print(td_decision_forest_predict_out$result)
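
As a further sketch that is not part of the original examples, the same model could be asked for probabilities through the output.prob and output.responses arguments. The snippet reuses decision_forest_model and housing_test from Example 1 and therefore needs an active Vantage connection. The labels passed to output.responses ("Classic", "Eclectic", "bungalow") are assumed values of the homestyle column; substitute the labels present in your training data.

```r
# Sketch only: reuses decision_forest_model and housing_test from Example 1.
# The labels given to output.responses are assumed values of "homestyle".
decision_forest_prob_out <- td_decision_forest_predict_mle_sqle(
    object = decision_forest_model,
    newdata = housing_test,
    id.column = "sn",
    terms = c("homestyle"),
    output.prob = TRUE,
    output.responses = c("Classic", "Eclectic", "bungalow")
)

# Print the result.
print(decision_forest_prob_out$result)
```

Copying homestyle through terms keeps the actual class next to each prediction, which makes the output easier to inspect by eye.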