Teradata Package for R Function Reference | 17.00 - DecisionTreePredict - Teradata Package for R

DecisionTreePredict

Description

The DecisionTreePredict function applies a decision tree model to input data and outputs a predicted label for each data point.

Usage

  td_decision_tree_predict_sqle (
      object = NULL,
      newdata = NULL,
      attr.table.groupby.columns = NULL,
      attr.table.pid.columns = NULL,
      attr.table.val.column = NULL,
      accumulate = NULL,
      output.response.probdist = FALSE,
      output.responses = NULL,
      newdata.partition.column = NULL,
      newdata.order.column = NULL,
      object.order.column = NULL
  )
## S3 method for class 'td_decision_tree_mle'
predict(
      object = NULL,
      newdata = NULL,
      attr.table.groupby.columns = NULL,
      attr.table.pid.columns = NULL,
      attr.table.val.column = NULL,
      accumulate = NULL,
      output.response.probdist = FALSE,
      output.responses = NULL,
      newdata.partition.column = NULL,
      newdata.order.column = NULL,
      object.order.column = NULL)
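
The second form in the usage above relies on R's S3 dispatch: calling `predict()` on an object whose class is "td_decision_tree_mle" routes the call to the method registered for that class. The following is a minimal base R sketch of that mechanism only. The class `toy_tree`, its `threshold` field, and the labels are hypothetical stand-ins and are not part of tdplyr; no Vantage connection is needed.

```r
# Hypothetical stand-in model: a classed list, not a tdplyr object.
toy_model <- structure(list(threshold = 2.5), class = "toy_tree")

# S3 method: predict() dispatches here for objects of class "toy_tree".
predict.toy_tree <- function(object, newdata, ...) {
  ifelse(newdata$attrvalue > object$threshold, "high", "low")
}

# Toy input; the column names mirror those used in the Examples section.
newdata <- data.frame(pid = 1:3, attrvalue = c(1.0, 3.0, 2.5))

# The generic predict() finds predict.toy_tree through the class attribute.
labels <- predict(toy_model, newdata = newdata)
print(labels)
```

In the same way, `predict(decision_tree_out, ...)` in the Examples section is expected to reach the DecisionTreePredict method because `decision_tree_out` carries the "td_decision_tree_mle" class.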

Arguments

`object`	Required Argument. Specifies the model tbl_teradata generated by the DecisionTree (`td_decision_tree_mle`) function. This argument accepts either a tbl_teradata or an object of class "td_decision_tree_mle".
`object.order.column`	Optional Argument. Specifies the Order By columns for "object". Provide a vector when ordering by multiple columns. Types: character OR vector of Strings (character)
`newdata`	Required Argument. Specifies the tbl_teradata that contains the attribute names and values.
`newdata.partition.column`	Required Argument. Specifies the Partition By columns for "newdata". Provide a vector when partitioning by multiple columns. Types: character OR vector of Strings (character)
`newdata.order.column`	Optional Argument. Specifies the Order By columns for "newdata". Provide a vector when ordering by multiple columns. Types: character OR vector of Strings (character)
`attr.table.groupby.columns`	Required Argument. Specifies the names of the columns on which "newdata" is partitioned. Each partition contains one attribute of the input data. Types: character OR vector of Strings (character)
`attr.table.pid.columns`	Required Argument. Specifies the names of the columns that define the data point identifiers. Types: character OR vector of Strings (character)
`attr.table.val.column`	Required Argument. Specifies the name of the column that contains the input values. Types: character
`accumulate`	Optional Argument. Specifies the names of "newdata" columns to copy to the output tbl_teradata. Types: character OR vector of Strings (character)
`output.response.probdist`	Optional Argument. Specifies whether to output probabilities. Note: This argument can be set to TRUE only when tdplyr is connected to Vantage 1.0 Maintenance Update 2 or later. Default Value: FALSE Types: logical
`output.responses`	Optional Argument. Required if "output.response.probdist" is TRUE. Specifies all responses in "newdata". Types: character OR vector of Strings (character)
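
The "attr.table.*" arguments describe input in which each row holds one attribute value for one data point. The base R sketch below illustrates that layout under stated assumptions: the column names `pid`, `attribute`, and `attrvalue` are taken from the Examples section, while the wide table and its values are made up for illustration. It runs locally and does not touch Vantage.

```r
# Made-up wide table: one row per data point, one column per attribute.
wide <- data.frame(
  pid = c(1L, 2L),
  sepal_length = c(5.1, 6.3),
  petal_length = c(1.4, 4.9)
)

# Reshape to one row per (data point, attribute) pair.
attr_cols <- setdiff(names(wide), "pid")
long <- do.call(rbind, lapply(attr_cols, function(a) {
  data.frame(pid = wide$pid, attribute = a, attrvalue = wide[[a]],
             stringsAsFactors = FALSE)
}))

# Sort so that all attributes of a data point sit together.
long <- long[order(long$pid, long$attribute), ]
rownames(long) <- NULL
print(long)
```

With a table shaped like `long`, `pid` would play the role of "attr.table.pid.columns", `attribute` the role of "attr.table.groupby.columns", and `attrvalue` the role of "attr.table.val.column".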

Value

The function returns an object of class "td_decision_tree_predict_sqle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.
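
As a rough illustration of the access pattern described above, the sketch below builds a stand-in for the returned object and reads its `result` member. Everything in it is a mock: a plain data.frame takes the place of the tbl_teradata, and the column name `pred_label` and the values are hypothetical, not the function's documented output schema.

```r
# Mock return value: a named list carrying the class name from this page.
out <- structure(
  list(result = data.frame(pid = c(5L, 10L),
                           pred_label = c("1", "2"),  # hypothetical column
                           stringsAsFactors = FALSE)),
  class = "td_decision_tree_predict_sqle"
)

# List the member names, then pull out the prediction table with "$".
member_names <- names(out)
predictions <- out$result
print(member_names)
print(predictions)
```

With a live connection, the same `$result` access would apply to `td_decision_tree_predict_out` from the Examples section.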

Examples

  
    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("decisiontreepredict_example", "iris_attribute_test")
    loadExampleData("decision_tree_example", "iris_attribute_train", "iris_response_train",
                    "iris_altinput")
    
    # Create object(s) of class "tbl_teradata".
    iris_attribute_train <- tbl(con, "iris_attribute_train")
    iris_response_train <- tbl(con, "iris_response_train")
    iris_attribute_test <- tbl(con, "iris_attribute_test")
    
    # Example
    # First train a model on the training data.
    decision_tree_out <- td_decision_tree_mle(attribute.name.columns = c("attribute"),
                               attribute.value.column = "attrvalue",
                               id.columns = c("pid"),
                               attribute.table = iris_attribute_train,
                               response.table = iris_response_train,
                               response.column = "response",
                               num.splits = 3,
                               approx.splits = FALSE,
                               nodesize = 10,
                               max.depth = 10,
                               split.measure = "gini"
                               )
    
    # Run predict on the output of td_decision_tree_mle() function.
    td_decision_tree_predict_out <- td_decision_tree_predict_sqle(object = decision_tree_out,
                                       newdata = iris_attribute_test,
                                       newdata.partition.column = c("pid"),
                                       newdata.order.column = c("attribute"),
                                       attr.table.groupby.columns = c("attribute"),
                                       attr.table.pid.columns = c("pid"),
                                       attr.table.val.column = "attrvalue",
                                       accumulate = c("attrvalue")
                                       )
    
    # Alternatively use S3 predict function to run predict on the output of
    # td_decision_tree_mle() function.
    predict_out <- predict(decision_tree_out,
                      newdata = iris_attribute_test,
                      newdata.partition.column = c("pid"),
                      newdata.order.column = c("attribute"),
                      attr.table.groupby.columns = c("attribute"),
                      attr.table.pid.columns = c("pid"),
                      attr.table.val.column = "attrvalue",
                      accumulate = c("attrvalue")
                      )