Teradata Package for R Function Reference | 17.20 - GLMPredict

Teradata® Package for R Function Reference

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for R
Release Number: 17.20
Published: March 2024
ft:locale: en-US
ft:lastEdition: 2024-05-03
dita:id: TeradataR_FxRef_Enterprise_1720
lifecycle: latest
Product Category: Teradata Vantage

TDGLMPredict

Description

The td_glm_predict_sqle() function predicts target values (regression) or class labels (classification) for test data using a model generated by the td_glm_sqle() function.
Notes:

  • Before passing features to the function, standardize the input features using the td_scale_fit_sqle() and td_scale_transform_sqle() functions.

  • The function accepts only numeric features, so convert categorical features to numeric values before prediction.

  • The function skips rows with missing (NULL) values during prediction.

  • Use td_regression_evaluator_sqle(), td_classification_evaluator_sqle(), or td_roc_sqle() as a post-processing step to evaluate the prediction results.

  • The td_glm_predict_sqle() function accepts models generated by the td_glm_sqle() function in SQLE.
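The numeric-only and missing-value notes above can be checked locally before calling the function. The following is a minimal sketch, assuming a small sample of the input has already been pulled into a plain R data.frame; the helper name check_features() and the "ocean" column are hypothetical and are not part of tdplyr.

```r
# Hypothetical helper (not part of tdplyr): report feature columns that are
# not numeric and count the rows that contain at least one missing value.
check_features <- function(sample_df, feature_cols) {
  is_num <- vapply(sample_df[feature_cols], is.numeric, logical(1))
  list(
    non_numeric  = feature_cols[!is_num],
    rows_with_na = sum(!complete.cases(sample_df[, feature_cols, drop = FALSE]))
  )
}

# Made-up sample: "ocean" is a categorical column and row 2 has a missing value.
sample_df <- data.frame(
  id       = 1:4,
  MedInc   = c(8.3, NA, 7.2, 5.6),
  HouseAge = c(41, 21, 52, 52),
  ocean    = c("near", "inland", "near", "inland"),
  stringsAsFactors = FALSE
)

report <- check_features(sample_df, c("MedInc", "HouseAge", "ocean"))
print(report)
```

Columns listed in non_numeric would need to be encoded as numbers first, and rows counted in rows_with_na would be skipped during prediction.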

Usage

  td_glm_predict_sqle (
      object = NULL,
      newdata = NULL,
      id.column = NULL,
      accumulate = NULL,
      output.prob = FALSE,
      output.responses = NULL,
      ...
  )

Arguments

object

Required Argument.
Specifies the tbl_teradata containing the model data generated by the td_glm_sqle() function, or an instance of class td_glm_sqle.
Types: tbl_teradata or td_glm_sqle

newdata

Required Argument.
Specifies the tbl_teradata containing the input data.
Types: tbl_teradata

id.column

Required Argument.
Specifies the name of the column that uniquely identifies an observation in the test data.
Types: character

accumulate

Optional Argument.
Specifies the name(s) of input tbl_teradata column(s) to copy to the output. By default, the function copies no input tbl_teradata columns to the output.
Types: character OR vector of Strings (character)

output.prob

Optional Argument.
Specifies whether the function should output the probability for each response.
Note:
Only applicable if the model "family" is 'Binomial'.
Default Value: FALSE
Types: logical

output.responses

Optional Argument.
Specifies the class labels for which to output probabilities.
A label must be 0 or 1. If not specified, the function outputs the probability of the predicted response.
Note:
Only applicable if "output.prob" is TRUE.
Types: character OR vector of Strings (character)
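The constraints on "output.prob" and "output.responses" described above can be captured in a small pre-check. This is a sketch only: build_prob_args() is a hypothetical helper, not part of tdplyr, and the reference does not say how the real function reacts to an invalid combination, so the helper simply stops early.

```r
# Hypothetical helper (not part of tdplyr): validate the probability-related
# arguments as documented and return them as a list for later use.
build_prob_args <- function(output.prob = FALSE, output.responses = NULL) {
  stopifnot(is.logical(output.prob), length(output.prob) == 1)
  if (is.null(output.responses)) {
    return(list(output.prob = output.prob))
  }
  if (!output.prob) {
    stop("output.responses is only applicable when output.prob is TRUE")
  }
  labels <- as.character(output.responses)
  if (!all(labels %in% c("0", "1"))) {
    stop("each label in output.responses must be 0 or 1")
  }
  list(output.prob = TRUE, output.responses = labels)
}

# Request probabilities for both class labels.
args_both <- build_prob_args(output.prob = TRUE, output.responses = c("0", "1"))
print(args_both)

# With a live connection, a binomial model and a test table (both names
# below are placeholders), the list could be spliced into the call:
# pred <- do.call(td_glm_predict_sqle,
#                 c(list(object = clf_model, newdata = test_tbl,
#                        id.column = "id"), args_both))
```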

...

Specifies the generic keyword arguments SQLE functions accept. Below are the generic keyword arguments:

persist:
Optional Argument.
Specifies whether to persist the results of the function in a table or not. When set to TRUE, results are persisted in a table; otherwise, results are garbage collected at the end of the session.
Default Value: FALSE
Types: logical

volatile:
Optional Argument.
Specifies whether to put the results of the function in a volatile table or not. When set to TRUE, results are stored in a volatile table; otherwise, they are not.
Default Value: FALSE
Types: logical

The function allows the user to partition, hash, order, or local order the input data. These generic arguments are available for each argument that accepts a tbl_teradata as input and can be accessed as:

  • "<input.data.arg.name>.partition.column" accepts character or vector of character (Strings)

  • "<input.data.arg.name>.hash.column" accepts character or vector of character (Strings)

  • "<input.data.arg.name>.order.column" accepts character or vector of character (Strings)

  • "local.order.<input.data.arg.name>" accepts logical

Note:
tdplyr supports these generic arguments only if the underlying SQLE function supports them; otherwise, an exception is raised.
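To make the naming pattern concrete, the sketch below expands the "<input.data.arg.name>" placeholder for the "newdata" argument. The helper generic_arg_names() is hypothetical and only builds the argument names as strings; whether the SQLE function behind td_glm_predict_sqle() accepts any of them is not stated here.

```r
# Hypothetical helper (not part of tdplyr): derive the generic argument
# names for one input argument by following the documented pattern.
generic_arg_names <- function(input.arg) {
  c(
    partition   = paste0(input.arg, ".partition.column"),
    hash        = paste0(input.arg, ".hash.column"),
    order       = paste0(input.arg, ".order.column"),
    local_order = paste0("local.order.", input.arg)
  )
}

newdata_args <- generic_arg_names("newdata")
print(newdata_args)
```

The first three names take a character value or character vector, and the last one takes a logical value, as listed above.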

Value

The function returns an object of class "td_glm_predict_sqle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result
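The access pattern can be tried without a Vantage connection on a stand-in object. This is a sketch under stated assumptions: the mock below only mimics the shape described above (a classed named list with a "result" member), it holds a plain data.frame instead of a tbl_teradata, and its column names are made up.

```r
# Stand-in for the returned object: a named list that carries the class name.
# A real result holds a tbl_teradata; a data.frame is used here so that the
# snippet runs offline. The column names are hypothetical.
mock_out <- structure(
  list(result = data.frame(id = c(1, 2), prediction = c(2.1, 3.4))),
  class = "td_glm_predict_sqle"
)

member_names <- names(mock_out)   # names of the list members
predictions  <- mock_out$result   # same "$" access as on a real result

print(member_names)
print(predictions)
```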

Examples

  
    
    # Get the current context/connection.
    con <- td_get_context()$connection
    
    # Load the example data.
    loadExampleData("tdplyr_example", "cal_housing_ex_raw")
    
    # Create tbl_teradata object.
    df <- tbl(con, "cal_housing_ex_raw")
    
    # Check the list of available analytic functions.
    display_analytic_functions()
    
    # Example 1: This example takes raw housing data and does the following:
    #            Uses td_scale_fit_sqle() to compute the scaling statistics.
    #            Uses td_scale_transform_sqle() to standardize the data.
    #            Uses td_glm_sqle() to generate a model.
    #            Uses td_glm_predict_sqle() to predict target values.
    
    # Compute scaling statistics for "target.columns" using the 'STD' scale method.
    fit_obj <- td_scale_fit_sqle(
                       data=df,
                       target.columns=c('MedInc', 'HouseAge', 'AveRooms',
                                        'AveBedrms', 'Population', 'AveOccup',
                                        'Latitude', 'Longitude'),
                       scale.method="STD")
    
    # Scale values specified in the input data using the fit data generated by
    # the td_scale_fit_sqle() function above.
    obj <- td_scale_transform_sqle(
                         object=fit_obj$output,
                         data=df,
                         accumulate=c("id","MedHouseVal"))
    
    # Generate a regression model using a generalized linear model (GLM).
    answer <- td_glm_sqle(
               input.columns=c("MedInc", "HouseAge", "AveRooms",
                               "AveBedrms", "Population", "AveOccup",
                               "Latitude", "Longitude"),
               response.column="MedHouseVal",
               data=obj$result,
               nesterov=FALSE)
    
    # td_glm_predict_sqle() predicts 'MedHouseVal' using the regression model
    # generated by td_glm_sqle() and the data passed to "newdata".
    # Note that the tbl_teradata representing the model is passed to "object".
    TDGLMPredict_out <- td_glm_predict_sqle(
                                    object=answer$result,
                                    newdata=obj$result,
                                    accumulate="MedHouseVal",
                                    id.column="id")
    
    # Print the result.
    print(TDGLMPredict_out$result)
    
    # Example 2: td_glm_predict_sqle() predicts 'MedHouseVal' using the
    #            regression model generated by td_glm_sqle() and newdata.
    # Note that the model is passed to "object" as an instance of td_glm_sqle.
    TDGLMPredict_out1 <- td_glm_predict_sqle(
                                     object=answer,
                                     newdata=obj$result,
                                     accumulate="MedHouseVal",
                                     id.column="id")
    
    # Print the result.
    print(TDGLMPredict_out1$result)
    
    # Alternatively, use the S3 predict() function to run prediction on the
    # output of the td_glm_sqle() function.
    TDGLMPredict_out1 <- predict(answer,
                                 newdata=obj$result,
                                 accumulate="MedHouseVal",
                                 id.column="id")
    
    # Print the result.
    print(TDGLMPredict_out1$result)