Teradata R Package Function Reference - 16.20 - GLMPredict - Teradata R Package

Teradata® R Package Function Reference

Product: Teradata R Package
Release: 16.20
Created: February 2020
Category: Programming Reference
Feature number: B700-4007-098K

Description

The GLMPredict function uses the model generated by the GLM function (td_glm_mle) to perform generalized linear model prediction on new input data.

Usage

td_glm_predict_sqle (
      modeldata = NULL,
      newdata = NULL,
      terms = NULL,
      family = NULL,
      linkfunction = "CANONICAL")
  
## S3 method for class 'td_glm_mle'
predict(
      modeldata = NULL, 
      newdata = NULL, 
      terms = NULL,
      family = NULL, 
      linkfunction = "CANONICAL")

Arguments

modeldata

Required Argument.
Specifies the object that contains the model, that is, the output of the function td_glm_mle. For td_glm_predict_sqle, this can also be the tibble containing the coefficients of the GLM model.
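
A coefficients tibble can plausibly stand in for the full model object because GLM scoring needs only the fitted coefficients, the family, and the link. The sketch below illustrates that idea locally with base R's glm() on the built-in mtcars data. It does not call Teradata, and every variable name in it is an illustrative assumption rather than part of the Teradata R package.

```r
# Local illustration only: base R, no Teradata connection involved.
# Fit a logistic GLM on the built-in mtcars data.
fit <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# Hypothetical "new data" to score.
new_rows <- data.frame(wt = c(2.5, 3.0, 3.5))

# Keep only the coefficients, as if they were all that was passed on.
beta <- coef(fit)

# Linear predictor: intercept + slope * wt.
eta <- beta[["(Intercept)"]] + beta[["wt"]] * new_rows$wt

# Apply the inverse logit link to get predicted probabilities.
manual_pred <- plogis(eta)

# Reference: predictions computed from the full model object.
model_pred <- unname(predict(fit, newdata = new_rows, type = "response"))

# The two sets of predictions should agree up to numerical tolerance.
print(all.equal(manual_pred, model_pred))
```

The comparison at the end is the point of the sketch: scoring from coefficients alone reproduces what the full model object would predict, provided the family and link match those used in training.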

newdata

Required Argument.
Specifies the table containing the new input data to score, passed as a remote tibble object (see Examples).

terms

Optional Argument.
Specifies the names of input table columns to copy to the output table.

family

Optional Argument.
Specifies the exponential family of the distribution. By default, the value is read from the model table. If you specify this argument, you must give it the same value that you used for the family argument of the function td_glm_mle when you generated the model table.
Permitted Values: LOGISTIC, BINOMIAL, POISSON, GAUSSIAN, GAMMA, INVERSE_GAUSSIAN, NEGATIVE_BINOMIAL

linkfunction

Optional Argument.
Specifies the link function. Each exponential family has a canonical (default) link function and a set of link functions that are allowed for it.
Note: Use the same value that you used for the linkfunction argument of the function td_glm_mle when you generated the model table.
Default Value: "CANONICAL"
Permitted Values: CANONICAL, IDENTITY, INVERSE, LOG, COMPLEMENTARY_LOG_LOG, SQUARE_ROOT, INVERSE_MU_SQUARED, LOGIT, PROBIT, CAUCHIT
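
To make the link function names concrete, the sketch below maps the permitted values to the link names understood by base R's stats::make.link() and applies the inverse link to a linear predictor. The name mapping and the canonical-link table follow standard GLM theory and are assumptions for illustration; they are not taken from the Teradata documentation, and the helper apply_inverse_link is hypothetical.

```r
# Assumed mapping from the permitted values above to base R link names.
link_map <- c(
  IDENTITY              = "identity",
  INVERSE               = "inverse",
  LOG                   = "log",
  COMPLEMENTARY_LOG_LOG = "cloglog",
  SQUARE_ROOT           = "sqrt",
  INVERSE_MU_SQUARED    = "1/mu^2",
  LOGIT                 = "logit",
  PROBIT                = "probit",
  CAUCHIT               = "cauchit"
)

# Canonical links from standard GLM theory (assumed to match "CANONICAL").
canonical_map <- c(
  BINOMIAL         = "LOGIT",
  POISSON          = "LOG",
  GAUSSIAN         = "IDENTITY",
  GAMMA            = "INVERSE",
  INVERSE_GAUSSIAN = "INVERSE_MU_SQUARED"
)

# Hypothetical helper: turn a linear predictor into a prediction.
apply_inverse_link <- function(eta, family, linkfunction = "CANONICAL") {
  if (linkfunction == "CANONICAL") {
    linkfunction <- canonical_map[[family]]
  }
  stats::make.link(link_map[[linkfunction]])$linkinv(eta)
}

apply_inverse_link(0, "BINOMIAL")                # inverse logit of 0
apply_inverse_link(0, "POISSON")                 # exp(0)
apply_inverse_link(2.5, "GAUSSIAN", "IDENTITY")  # identity leaves eta unchanged
```

Under these assumptions, passing "CANONICAL" for a GAUSSIAN model is equivalent to passing "IDENTITY", which is why the two values can be used interchangeably for that family.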

Value

The function returns an object of class "td_glm_predict_sqle", which is a named list containing a Teradata tbl object.
The named list member can be referenced directly with the "$" operator, using the name: result
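
As a sketch of that access pattern, the block below builds a stand-in object locally: a plain named list with the class attribute set, holding an ordinary data frame where the real object would hold a Teradata tbl. The stand-in, its column names, and its values are assumptions for illustration only.

```r
# Stand-in for the returned object (assumption: a real call would place a
# Teradata tbl in `result`, not a local data frame with made-up values).
fake_out <- structure(
  list(result = data.frame(id = 1:3, fitted_value = c(0.91, 0.12, 0.67))),
  class = "td_glm_predict_sqle"
)

class(fake_out)   # the S3 class of the returned object
names(fake_out)   # the single named member

# Access the predictions through the named member.
predictions <- fake_out$result
head(predictions, 2)
```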

Examples

    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("glm_example", "admissions_train", "housing_train")
    loadExampleData("glmpredict_example", "admissions_test", "housing_test")
    
    # Create remote tibble objects.
    admissions_test <- tbl(con, "admissions_test")
    admissions_train <- tbl(con, "admissions_train")
    housing_test <- tbl(con, "housing_test")
    housing_train <- tbl(con, "housing_train")
    
    # Example 1 -
    # First train a GLM model on the training data.
    td_glm_out <- td_glm_mle(formula = (admitted ~ stats + masters + gpa + programming),
                         family = "LOGISTIC",
                         linkfunction = "LOGIT",
                         data = admissions_train,
                         weights = "1",
                         threshold = 0.01,
                         maxit = 25,
                         step = FALSE,
                         intercept = TRUE
                         )
    
    # Run predict on the output of GLM
    td_glm_predict_out1 <- td_glm_predict_sqle(modeldata = td_glm_out,
                             newdata = admissions_test,
                             terms = c("id","masters","gpa","stats","programming","admitted"),
                             family = "LOGISTIC",
                             linkfunction = "LOGIT"
                             )
    
    # Example 2 -
    # First train a GLM model on the training data.
    td_glm_out_hs <- td_glm_mle(formula = (price ~ recroom + lotsize + stories + garagepl + gashw
                                       + bedrooms + driveway + airco + homestyle + bathrms + fullbase
                                       + prefarea),
                            family = "GAUSSIAN",
                            linkfunction = "IDENTITY",
                            data = housing_train,
                            weights = "1",
                            threshold = 0.01,
                            maxit = 25,
                            step = FALSE,
                            intercept = TRUE
                            )
    
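    # Run predict on the output of GLM by passing the coefficients.
    # For the GAUSSIAN family the canonical link is IDENTITY, so "CANONICAL"
    # here matches the link function used in training.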
    td_glm_predict_out2 <- td_glm_predict_sqle(modeldata = td_glm_out_hs$coefficients,
                             newdata = housing_test,
                             terms = c("sn", "price"),
                             family = "GAUSSIAN",
                             linkfunction = "CANONICAL"
                             )
    
    # Alternatively, use the S3 predict method on the model object to get predictions.
    td_glm_predict_out3 <- predict(td_glm_out_hs,
                             newdata = housing_test,
                             terms = c("sn", "price"),
                             family = "GAUSSIAN",
                             linkfunction = "CANONICAL"
                             )