TDDecisionForestPredict
Description
The td_decision_forest_predict_sqle() function uses the model output by the
td_decision_forest_sqle() function to analyze the input data and make
predictions. This function outputs the probability that each observation is
in the predicted class. Processing time depends on the number of trees in
the model. When the model contains more trees than can fit in memory, the
trees are cached in local spool space.
Usage
td_decision_forest_predict_sqle (
newdata = NULL,
object = NULL,
id.column = NULL,
detailed = FALSE,
output.prob = FALSE,
output.responses = NULL,
accumulate = NULL,
...
)
Arguments
newdata | Required Argument.
object | Required Argument.
id.column | Required Argument.
detailed | Optional Argument.
output.prob | Optional Argument. Default Value: FALSE
output.responses | Optional Argument. Types: character OR vector of str(s)
accumulate | Optional Argument.
... | Specifies the generic keyword arguments SQLE functions accept. Below are the generic keyword arguments:
volatile:
Function allows the user to partition, hash, order or local order the input data. These generic arguments are available for each argument that accepts tbl_teradata as input and can be accessed as:
Note:
Value
Function returns an object of class "td_decision_forest_predict_sqle",
which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result
Examples
# Get the current context/connection.
con <- td_get_context()$connection
# Load the example data.
loadExampleData("pmmlpredict_example", "boston")
loadExampleData("tdplyr_example", "iris_input")
# Create tbl_teradata object.
boston <- tbl(con, "boston")
iris_input <- tbl(con, "iris_input")
# Check the list of available analytic functions.
display_analytic_functions()
# Example 1 : This example takes boston data as input and generates a Regression
# model using td_decision_forest_sqle(). It then uses
# td_decision_forest_predict_sqle() to predict medv with the Regression model
# generated by td_decision_forest_sqle().
# Create 2 samples of input data - sample 1 will have 80% of total rows and
# sample 2 will have 20% of total rows.
boston_sample <- td_sample(df = boston, n = c(0.8, 0.2))
# Create train dataset from sample 1 by filtering on "sampleid" and drop
# "sampleid" column as it is not required for training model.
boston_train <- boston_sample %>% filter(sampleid == 1) %>% select(-sampleid)
# Create test dataset from sample 2 by filtering on "sampleid" and
# drop "sampleid" column as it is not required for scoring.
boston_test <- boston_sample %>% filter(sampleid == 2) %>% select(-sampleid)
# Training the model.
DecisionForest_out <- td_decision_forest_sqle(
data = boston_train,
input.columns = c('crim', 'zn', 'indus', 'chas',
'nox', 'rm', 'age', 'dis',
'rad', 'tax','ptratio',
'black', 'lstat'),
response.column = 'medv',
max.depth = 12,
num.trees = 4,
min.node.size = 1,
mtry = 3,
mtry.seed = 1,
seed = 1,
tree.type = 'REGRESSION')
# td_decision_forest_predict_sqle() predicts the result using "newdata" and the
# Regression model generated by td_decision_forest_sqle().
TDDecisionForestPredict_out <- td_decision_forest_predict_sqle(
newdata=boston_test,
object=DecisionForest_out,
id.column="id")
# Print the result.
print(TDDecisionForestPredict_out$result)
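# A possible follow-up (a sketch, not part of the original example): carry the
# actual "medv" values into the prediction output and compute a root mean
# squared error locally.
# Assumptions: "accumulate" copies the named input column to the output, the
# predicted value is returned in a column named "prediction", and the result
# is small enough to pull into R memory with as.data.frame().
TDDecisionForestPredict_acc <- td_decision_forest_predict_sqle(
  newdata=boston_test,
  object=DecisionForest_out,
  id.column="id",
  accumulate="medv")
# Pull the scored rows into a local data.frame.
boston_pred_local <- as.data.frame(TDDecisionForestPredict_acc$result)
# as.numeric() guards against the prediction arriving as character or factor.
boston_pred_num <- as.numeric(as.character(boston_pred_local$prediction))
rmse <- sqrt(mean((boston_pred_num - boston_pred_local$medv)^2))
print(rmse)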
# Example 2 : This example takes iris_input data and generates a Classification
# model using td_decision_forest_sqle(). It then uses
# td_decision_forest_predict_sqle() to predict species with the Classification
# model generated by td_decision_forest_sqle(). The "output.responses" argument
# provides the classes for which to output the probabilities.
# Create 2 samples of input data - sample 1 will have 80% of total rows and
# sample 2 will have 20% of total rows.
iris_sample <- td_sample(df = iris_input, n = c(0.8, 0.2))
# Create train dataset from sample 1 by filtering on "sampleid" and drop
# "sampleid" column as it is not required for training model.
iris_train <- iris_sample %>% filter(sampleid == 1) %>% select(-sampleid)
# Create test dataset from sample 2 by filtering on "sampleid" and
# drop "sampleid" column as it is not required for scoring.
iris_test <- iris_sample %>% filter(sampleid == 2) %>% select(-sampleid)
# Training the model.
DecisionForest_out_2 <- td_decision_forest_sqle(
data=iris_train,
input.columns=c('sepal_length',
'sepal_width',
'petal_length',
'petal_width'),
response.column="species",
tree.type="CLASSIFICATION")
# td_decision_forest_predict_sqle() predicts the result using "newdata" and the
# Classification model generated by td_decision_forest_sqle().
TDDecisionForestPredict_out_2 <- td_decision_forest_predict_sqle(
newdata=iris_test,
object=DecisionForest_out_2,
id.column="id",
output.prob=TRUE,
output.responses=c('1', '2', '3'))
# Print the result.
print(TDDecisionForestPredict_out_2$result)
# Alternatively, use the S3 predict function to run prediction on the output of
# the td_decision_forest_sqle() function.
TDDecisionForestPredict_out_2 <- predict(DecisionForest_out_2,
newdata=iris_test,
id.column="id",
output.prob=TRUE,
output.responses=c('1', '2', '3'))
# Print the result.
print(TDDecisionForestPredict_out_2$result)
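# A possible follow-up (a sketch, not part of the original example): tabulate
# the predicted classes against the actual species to see where the model
# disagrees with the data.
# Assumptions: "accumulate" copies the named input column to the output, the
# predicted class is returned in a column named "prediction", and the result
# is small enough to pull into R memory with as.data.frame().
TDDecisionForestPredict_out_3 <- td_decision_forest_predict_sqle(
  newdata=iris_test,
  object=DecisionForest_out_2,
  id.column="id",
  accumulate="species")
# Pull the scored rows into a local data.frame.
iris_pred_local <- as.data.frame(TDDecisionForestPredict_out_3$result)
# Rows are actual classes, columns are predicted classes.
print(table(actual = iris_pred_local$species,
            predicted = iris_pred_local$prediction))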