Teradata R Package Function Reference - 16.20 - DecisionForestPredict - Teradata R Package

Teradata® R Package Function Reference

Product: Teradata R Package
Release: 16.20
Created: February 2020
Category: Programming Reference
Feature number: B700-4007-098K

Description

The Decision Forest Predict function (td_decision_forest_predict_sqle) uses the model generated by the Decision Forest function (td_decision_forest_mle) to predict the response variable for a set of test data. The model can be stored in a table in the Advanced SQL Engine and then accessed with the tbl() function.
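
As a rough sketch of the stored-model workflow described above, the helper below looks up a model table with tbl() and passes the resulting tibble as `object`. The helper name `predict_from_stored_model` and the table name in the usage comment are hypothetical. The sketch assumes the Teradata R package and dplyr are loaded, and that the table already holds a model produced by td_decision_forest_mle.

```r
# Hypothetical helper (not part of the Teradata R package): predict with a
# decision forest model that was previously stored in a table.
predict_from_stored_model <- function(con, model_table, newdata, id.column,
                                      detailed = FALSE, terms = NULL) {
  # tbl() creates a remote tibble that refers to the stored model table.
  model_tbl <- dplyr::tbl(con, model_table)

  # Pass the tibble as `object` in place of the td_decision_forest_mle output.
  td_decision_forest_predict_sqle(object = model_tbl,
                                  newdata = newdata,
                                  id.column = id.column,
                                  detailed = detailed,
                                  terms = terms)
}

# Usage (needs a live connection; "housing_model" is an assumed table name):
# con <- td_get_context()$connection
# out <- predict_from_stored_model(con, "housing_model",
#                                  newdata = tbl(con, "housing_test"),
#                                  id.column = "sn")
```

Wrapping the lookup in a function keeps the table name in one place, so the same stored model can be reused across sessions without retraining.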

Usage

  td_decision_forest_predict_sqle (
      object = NULL,
      newdata = NULL,
      id.column = NULL,
      detailed = FALSE,
      terms = NULL
  )
  
  ## S3 method for class 'td_decision_forest_mle'
  predict(
      object = NULL,
      newdata = NULL,
      id.column = NULL,
      detailed = FALSE,
      terms = NULL
  )

Arguments

object

Required Argument.
Specifies the object containing the decision forest model, that is, the output of the td_decision_forest_mle function. For td_decision_forest_predict_sqle, this can also be a tibble referring to the table in which the decision forest model is stored.

newdata

Required Argument.
Specifies the tbl_teradata containing the input test data.

id.column

Required Argument.
Specifies a column containing a unique identifier for each test point in the test set.

detailed

Optional Argument.
Specifies whether to output detailed information about the forest trees; that is, the decision tree and the specific tree information, including task index and tree index for each tree.
Default Value: FALSE

terms

Optional Argument.
Specifies the names of the input columns to copy to the output table.

Value

The function returns an object of class "td_decision_forest_predict_sqle", which is a named list containing a Teradata tbl object.
The list member can be referenced directly with the "$" operator using the name: result.
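
To illustrate the "$" access described above, here is a minimal sketch of a helper that pulls a few rows of the `result` member into a local data.frame. The helper name `preview_predictions` is hypothetical. For a remote tbl, the sketch assumes that head() and as.data.frame() behave as they usually do for dplyr remote tables, that is, they limit and then collect the rows.

```r
# Hypothetical helper: return the first `n` rows of the predictions as a
# local data.frame. `predict_out` is the named list returned by
# td_decision_forest_predict_sqle() or by the predict() S3 method.
preview_predictions <- function(predict_out, n = 5) {
  # The returned list is documented to have a single member named "result".
  stopifnot("result" %in% names(predict_out))

  # Limit the rows first, then convert, so that (assuming the usual remote
  # tbl behavior) only `n` rows are fetched from the database.
  as.data.frame(utils::head(predict_out$result, n))
}

# Usage (after running the prediction in the Examples section):
# preview_predictions(td_decision_forest_predict_out, n = 10)
```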

Examples

    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("decisionforestpredict_example", "housing_test", "housing_train")
    
    # Create remote tibble objects.
    housing_test <- tbl(con, "housing_test")
    housing_train <- tbl(con, "housing_train")
    
    # Example 1 -
    # First train a decision forest model on the training data.
    formula <- (homestyle ~ driveway + recroom + fullbase + gashw + airco + prefarea + price + lotsize + bedrooms + bathrms + stories + garagepl)
    decision_forest_model <- td_decision_forest_mle(data=housing_train,
                                  formula = formula,
                                  tree.type="classification",
                                  ntree=50,
                                  tree.size=100,
                                  nodesize=1,
                                  variance=0,
                                  max.depth=12,
                                  maxnum.categorical=20,
                                  mtry=3,
                                  mtry.seed=100,
                                  seed=100
                                  )
    
    # Run predict on the output of td_decision_forest_mle
    td_decision_forest_predict_out <- td_decision_forest_predict_sqle(object = decision_forest_model,
                                       newdata = housing_test,
                                       id.column = "sn",
                                       detailed = FALSE,
                                       terms = c("homestyle")
                                       )
    
    # Alternatively, use the predict S3 method to generate predictions.
    predict_out <- predict(decision_forest_model,
                           newdata = housing_test,
                           id.column = "sn",
                           detailed = FALSE,
                           terms = c("homestyle")
                           )
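
Because the example copies the actual label (homestyle) to the output through `terms`, the predictions can be compared with the known values. The following is only a sketch: the helper name `prediction_accuracy` is hypothetical, it works on a local data.frame, and the output column name `prediction` is an assumption that should be checked against the actual result table.

```r
# Hypothetical helper: fraction of rows where the predicted label matches
# the actual label. `preds` is a local data.frame, for example
# as.data.frame(td_decision_forest_predict_out$result).
prediction_accuracy <- function(preds, actual = "homestyle",
                                predicted = "prediction") {
  # Both columns must be present; "prediction" is an assumed column name.
  stopifnot(all(c(actual, predicted) %in% names(preds)))

  # Compare as character so factor and character columns are handled alike.
  mean(as.character(preds[[actual]]) == as.character(preds[[predicted]]))
}

# Usage (needs the prediction output from the example above):
# preds <- as.data.frame(td_decision_forest_predict_out$result)
# prediction_accuracy(preds)
```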