Teradata R Package Function Reference 17.00 - DecisionForestPredict

Teradata® R Package Function Reference

prodname: Teradata R Package
vrm_release: 17.00
created_date: September 2020
category: Programming Reference
featnum: B700-4007-090K

Description

The DecisionForestPredict function uses the model generated by the DecisionForest (td_decision_forest_mle) function to predict the response variable for a test data set. The model can be supplied as either a tbl_teradata or an object of class "td_decision_forest_mle".

Usage

  td_decision_forest_predict_sqle (
      object = NULL,
      newdata = NULL,
      id.column = NULL,
      detailed = FALSE,
      terms = NULL,
      newdata.order.column = NULL,
      object.order.column = NULL
  )
  ## S3 method for class 'td_decision_forest_mle'
  predict(
      object = NULL,
      newdata = NULL,
      id.column = NULL,
      detailed = FALSE,
      terms = NULL,
      newdata.order.column = NULL,
      object.order.column = NULL
  )

Arguments

object

Required Argument.
Specifies the model generated by td_decision_forest_mle.
This argument accepts either a tbl_teradata or an object of class "td_decision_forest_mle".

object.order.column

Optional Argument.
Specifies the Order By columns for "object".
When ordering by multiple columns, provide the column names as a character vector.
Types: character OR vector of Strings (character)

newdata

Required Argument.
Specifies the tbl_teradata containing the input test data.

newdata.order.column

Optional Argument.
Specifies the Order By columns for "newdata".
When ordering by multiple columns, provide the column names as a character vector.
Types: character OR vector of Strings (character)

id.column

Required Argument.
Specifies a column containing a unique identifier for each test point in the test set.
Types: character

detailed

Optional Argument.
Specifies whether to output detailed information about the forest trees; that is, the decision tree and the specific tree information, including task index and tree index for each tree.
Default Value: FALSE
Types: logical

terms

Optional Argument.
Specifies the names of the input columns to copy to the output tbl_teradata.
Types: character OR vector of Strings (character)

Value

The function returns an object of class "td_decision_forest_predict_sqle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.
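
The access pattern described above can be sketched in base R without a database connection. The snippet below is only an illustration: mock_out is a hypothetical stand-in built by hand, and its columns (sn, prediction) and values are made up. A real call returns a tbl_teradata in the result slot, not a local data frame.

```r
# Hypothetical stand-in for the value returned by
# td_decision_forest_predict_sqle(). A real result holds a tbl_teradata;
# a local data.frame is used here so the snippet runs without a connection.
mock_out <- structure(
  list(result = data.frame(sn = c(13, 16),
                           prediction = c("classic", "eclectic"))),
  class = "td_decision_forest_predict_sqle"
)

# Reference the named list member with the "$" operator.
predictions <- mock_out$result
print(predictions)
```

With a real return value, the same `$result` expression yields the tbl_teradata that holds the predictions.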

Examples

    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("decisionforestpredict_example", "housing_test", "housing_train")
    
    # Create object(s) of class "tbl_teradata".
    housing_test <- tbl(con, "housing_test")
    housing_train <- tbl(con, "housing_train")
    
    # Example 1 - Train a decision forest model, then predict on the test data.
    # First train a decision forest model on the training data.
    formula <- (homestyle ~ driveway + recroom + fullbase + gashw + airco + prefarea + price
                + lotsize + bedrooms + bathrms + stories + garagepl)
    decision_forest_model <- td_decision_forest_mle(data=housing_train,
                                  formula = formula,
                                  tree.type="classification",
                                  ntree=50,
                                  tree.size=100,
                                  nodesize=1,
                                  variance=0,
                                  max.depth=12,
                                  maxnum.categorical=20,
                                  mtry=3,
                                  mtry.seed=100,
                                  seed=100
                                  )
    
    # Run predict on the output of td_decision_forest_mle() function.
    td_decision_forest_predict_out <- td_decision_forest_predict_sqle(
                                       object = decision_forest_model,
                                       newdata = housing_test,
                                       id.column = "sn",
                                       detailed = FALSE,
                                       terms = c("homestyle")
                                       )
     
    # Alternatively use the predict S3 method to find predictions.
    predict_out <- predict(decision_forest_model,
                       newdata = housing_test,
                       id.column = "sn",
                       detailed = FALSE,
                       terms = c("homestyle")
                       )
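
The predict() call above works through ordinary S3 dispatch: R looks at the class of the first argument and calls the matching predict.<class> method. As a rough sketch of that mechanism only, the base R snippet below uses a hypothetical class "mock_forest_model" and a toy method that returns one fixed label. It is not the package's implementation and does not contact Teradata.

```r
# Hypothetical stand-in class; the real model class is "td_decision_forest_mle".
mock_model <- structure(list(label = "classic"), class = "mock_forest_model")

# S3 method: predict() dispatches here for objects of class "mock_forest_model".
# This toy method assigns the stored label to every row of newdata.
predict.mock_forest_model <- function(object, newdata = NULL,
                                      id.column = NULL, ...) {
  data.frame(id = newdata[[id.column]], prediction = object$label)
}

# Made-up test rows with an identifier column named "sn".
test_rows <- data.frame(sn = c(1, 2, 3))

# The generic predict() selects the method from the class of mock_model.
dispatched <- predict(mock_model, newdata = test_rows, id.column = "sn")
print(dispatched)
```

In the same way, calling predict() on a "td_decision_forest_mle" object selects the method listed under Usage, which takes the same arguments as td_decision_forest_predict_sqle.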