Teradata Package for R Function Reference | 17.00 - td_log_reg_predict_valib - Teradata Package for R

Teradata® Package for R Function Reference

Product
Teradata Package for R
Release Number
17.00
Release Date
July 2021
Content Type
Programming Reference
Publication ID
B700-4007-090K
Language
English (United States)

Description

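The td_log_reg_predict_valib() function scores data with a logistic regression model. A model generated by the Logistic Regression function is passed to the Logistic Regression Scoring function, which creates a score output containing predicted values of the dependent variable.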

Usage

td_log_reg_predict_valib(model, data, ...)

Arguments

model

Required Argument.
Specifies the input containing the logistic model to use in scoring. This must be either the "model" tbl_teradata generated by td_log_reg_valib() or a tbl_teradata created on a table generated by the 'logistic' function from the Vantage Analytic Library.
Types: tbl_teradata

data

Required Argument.
Specifies the input data to score.
Types: tbl_teradata

...

Specifies other arguments supported by the function as described in the 'Other Arguments' section.

Value

The function returns an object of class "td_log_reg_predict_valib", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.

Other Arguments

estimate.column

Optional Argument.
Specifies the name of a column in the score output containing the estimated value of the dependent variable (column).
Notes:

  1. Either "estimate.column" or "prob.column" must be requested.

  2. If the estimate column is not unique in the score output, '_tm_' is automatically placed in front of the name.

Types: character

index.columns

Optional Argument.
Specifies the name(s) of the column(s) representing the primary index of the score output. By default, the primary index columns of the score output are the primary index columns of the input. The index columns must also form a unique key for the score output. Otherwise, the score output contains more than one score for a given observation.
Types: character OR vector of Strings (character)
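
The uniqueness requirement can be checked before scoring. The following is a minimal sketch in base R on a small local data frame. The data and the helper name is_unique_key are hypothetical and are not part of the package. On a real tbl_teradata, the equivalent check would have to run in the database.

```r
# Hypothetical local data standing in for the input table.
input <- data.frame(
  cust_id  = c(1, 2, 2, 3),
  acct_nbr = c(10, 20, 21, 30),
  income   = c(52000, 31000, 31000, 78000)
)

# Hypothetical helper: TRUE when the given columns identify each row uniquely.
is_unique_key <- function(df, cols) {
  anyDuplicated(df[, cols, drop = FALSE]) == 0
}

# "cust_id" alone repeats, so it is not a usable index here.
key_cust <- is_unique_key(input, "cust_id")
# The combination of both columns is unique in this sample.
key_both <- is_unique_key(input, c("cust_id", "acct_nbr"))
print(key_cust)
print(key_both)
```

In this sample, a column combination that passes the check would be a reasonable candidate for "index.columns".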

prob.column

Optional Argument.
Specifies the name of a column in the score output containing the probability that the dependent value is equal to the response value.
Notes:

  1. Either "estimate.column" or "prob.column" must be requested.

  2. If the probability column is not unique in the score output, '_tm_' is automatically placed in front of the name.

Types: character
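
For orientation, a logistic model turns a linear combination of the input columns into a probability through the logistic function. The sketch below uses base R only. The coefficient values and the single observation are made up for illustration and are not output of td_log_reg_valib(), and the sketch does not claim to reproduce the exact computation performed in Vantage.

```r
# Hypothetical coefficients (intercept first); not taken from a real model.
coefs <- c(intercept = -1.5, age = 0.02, income = 0.00001)

# One hypothetical observation.
obs <- c(age = 40, income = 50000)

# Linear predictor: intercept + sum(coefficient * value).
eta <- coefs[["intercept"]] + sum(coefs[names(obs)] * obs)

# The logistic function maps the linear predictor to a probability in (0, 1).
prob <- 1 / (1 + exp(-eta))
print(prob)
```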

accumulate

Optional Argument.
Specifies the name(s) of the column(s) from the input to retain in the output.
Types: character OR vector of Strings (character)

prob.threshold

Optional Argument.
Specifies the probability threshold value. When the probability of the dependent variable being 1 is greater than or equal to this value, the estimated value of the dependent variable is 1. If less than this value, the estimated value is 0.
Default Value: 0.5
Types: numeric
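
The threshold rule described above can be sketched in base R as follows. The probabilities and the helper name apply_threshold are hypothetical. The sketch only illustrates the comparison and is not the scoring function itself.

```r
# Hypothetical probabilities that the dependent variable is 1.
prob <- c(0.10, 0.49, 0.50, 0.87)

# Rule described above: the estimate is 1 when prob >= threshold, else 0.
apply_threshold <- function(p, threshold = 0.5) {
  as.integer(p >= threshold)
}

# Default threshold of 0.5; a probability of exactly 0.5 yields 1.
est_default <- apply_threshold(prob)
# A stricter threshold moves borderline observations to 0.
est_strict <- apply_threshold(prob, threshold = 0.9)
print(est_default)
print(est_strict)
```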

Examples

# Notes:
#   1. To execute Vantage Analytic Library functions, set the option 'val.install.location' to
#      the name of the database where the Vantage Analytic Library functions are installed.
#   2. Datasets used in these examples can be loaded using the Vantage Analytic Library installer.

# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")

# Get remote data source connection.
con <- td_get_context()$connection

# Create an object of class "tbl_teradata".
df <- tbl(con, "customer")
print(df)

# Example 1: Shows how logistic model scoring can be performed.
# Generate a logistic model.
log_reg_obj <- td_log_reg_valib(data=df,
                                columns=c("age", "years_with_bank", "income"),
                                response.column="nbr_children",
                                response.value=0)
# Print the model.
print(log_reg_obj$model)


# Score using the model generated above.
obj <- td_log_reg_predict_valib(data=df,
                                model=log_reg_obj$model,
                                prob.column="Probability")
# Print the results.
print(obj$result)

# Score using S3 predict function and the model generated above.
obj <- predict(object=log_reg_obj,
               data=df,
               prob.column="Probability")
# Print the results.
print(obj$result)