Teradata® Package for R Function Reference

Product

Teradata Package for R

Release Number

17.00

Published

July 2021

Language

English (United States)

Last Update

2023-08-08

dita:id

B700-4007

NMT

Product Category

Teradata Vantage

PMMLPredict

Description

This function scores data in Vantage using a model that was created outside Vantage and exported to Vantage in PMML format.

Usage

  td_pmml_predict_sqle (
      modeldata = NULL,
      newdata = NULL,
      accumulate = NULL,
      model.output.fields = NULL,
      overwrite.cached.models = NULL,
      newdata.partition.column = "ANY",
      newdata.order.column = NULL,
      modeldata.order.column = NULL
  )

Arguments

`modeldata`	Required Argument. Specifies the model tbl_teradata to be used for scoring.
`modeldata.order.column`	Optional Argument. Specifies Order By columns for "modeldata". Values to this argument can be provided as a vector, if multiple columns are used for ordering. Types: character OR vector of Strings (character)
`newdata`	Required Argument. Specifies the input tbl_teradata that contains the data to be scored.
`newdata.partition.column`	Optional Argument. Specifies Partition By columns for "newdata". Values to this argument can be provided as a vector, if multiple columns are used for partitioning. Default Value: ANY Types: character OR vector of Strings (character)
`newdata.order.column`	Optional Argument. Specifies Order By columns for "newdata". Values to this argument can be provided as a vector, if multiple columns are used for ordering. Types: character OR vector of Strings (character)
`accumulate`	Required Argument. Specifies the names of "newdata" columns to copy to the output tbl_teradata. Types: character OR vector of Strings (character)
`model.output.fields`	Optional Argument. Specifies the fields of the JSON output to return as individual columns instead of the entire JSON report. Types: character OR vector of characters
`overwrite.cached.models`	Optional Argument. Specifies the model name that needs to be removed from the cache. Use * to remove all cached models. Types: character OR vector of characters

Value

The function returns an object of class "td_pmml_predict_sqle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.
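
As a minimal sketch of how the returned object can be inspected, the helper below pulls the `result` member out of the named list and previews its first rows. The helper name `preview_scores` and its `n` argument are hypothetical and not part of tdplyr. A plain list holding a data.frame stands in for the real return value so that the snippet runs without a Vantage connection.

```r
# Hypothetical helper (not part of tdplyr): preview the scored rows held in
# the "result" member of the list returned by the scoring function.
preview_scores <- function(out, n = 5) {
  # The Value section names "result" as the list member to reference.
  stopifnot("result" %in% names(out))
  head(out$result, n)
}

# Stand-in for a real return value, so the sketch runs offline.
mock_out <- list(result = data.frame(id = 1:10,
                                     prediction = rep(c("a", "b"), 5)))
preview <- preview_scores(mock_out, n = 3)
print(preview)
```

With a live connection, the same helper would presumably be called as `preview_scores(pmml_predict_out)`, where `pmml_predict_out` is the object returned in the Examples section.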

Examples

  
    # Get the current context/connection.
    con <- td_get_context()$connection
    
    # Create the following table on Vantage.
    crt_tbl <- "CREATE SET TABLE pmml_models(model_id VARCHAR(40), model BLOB) 
                PRIMARY INDEX (model_id);"
    DBI::dbExecute(con, sql(crt_tbl))
    
    # Run the following query through BTEQ or Teradata Studio to load the 
    # models. 'load_pmml_model.txt' and the PMML files can be found under 
    # 'inst/scripts' in the tdplyr installation directory. This file and the PMML 
    # models to be loaded should be in the same directory.  
    
    # .import vartext file load_pmml_model.txt
    # .repeat *
    # USING (c1 VARCHAR(40), c2 BLOB AS DEFERRED BY NAME) INSERT INTO pmml_models(:c1, :c2);
    
    # Load example data.
    loadExampleData("pmmlpredict_example", "iris_train", "iris_test")
    
    # Create object(s) of class "tbl_teradata".
    iris_train <- tbl(con, "iris_train")
    iris_test <- tbl(con, "iris_test")
    
    # Example 1 - 
    # This example runs a query with an XGBoost model with no prediction values.
    # It also uses the "overwrite.cached.models" argument.
    modeldata <- tbl(con, "pmml_models") 
    ml_name <- "iris_db_xgb_model"  
    pmml_predict_out <- td_pmml_predict(modeldata = modeldata, 
                                        newdata = iris_test, 
                                        accumulate = "id", 
                                        overwrite.cached.models = ml_name)
    
    # Example 2 - 
    # This example runs a query with a RandomForest model with prediction values.
    # It also uses the "model.output.fields" argument.
    modeldata <- tbl(con, "pmml_models") 
    ml_op_field <- c('probability_0', 'probability_1', 'probability_2')
    pmml_predict_out <- td_pmml_predict(modeldata = modeldata, 
                                        newdata = iris_test, 
                                        accumulate = "id", 
                                        model.output.fields = ml_op_field)
    
    # Example 3 - 
    # This example runs a query with an XGBoost model and sets
    # "overwrite.cached.models" to "*", which erases the entire cache.
    modeldata <- tbl(con, "pmml_models") 
    pmml_predict_out <- td_pmml_predict(modeldata = modeldata, 
                                        newdata = iris_test, 
                                        accumulate = "id", 
                                        overwrite.cached.models = "*")
    
    # Example 4 -
    # This example assumes that the user is connected to another database where
    # BYOM is not installed and runs a query with an XGBoost model with no
    # prediction values. It also uses the "overwrite.cached.models" argument.
    # Set the global option to point to the database (mldb in this case)
    # where BYOM is installed.
    options(byom.install.location="mldb")
    modeldata <- tbl(con, "pmml_models") 
    ml_name <- "iris_db_xgb_model"
    pmml_predict_out <- td_pmml_predict(modeldata = modeldata,
                                        newdata = iris_test,
                                        accumulate = "id",
                                        overwrite.cached.models = ml_name)
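
Example 4 changes a session-wide option, which stays in effect for later calls. One way to limit that, sketched here under assumptions, is to set `byom.install.location` only while one expression is evaluated and restore the previous value afterward. The wrapper name `with_byom_location` is hypothetical and not part of tdplyr; it relies only on base R `options()` and `on.exit()`.

```r
# Hypothetical wrapper (not part of tdplyr): evaluate an expression with
# "byom.install.location" set, then restore whatever value was there before.
with_byom_location <- function(location, expr) {
  # options() returns the previous values, which on.exit() puts back
  # even if evaluating the expression fails.
  old <- options(byom.install.location = location)
  on.exit(options(old), add = TRUE)
  # expr is a lazily evaluated argument, so it runs here, after the option is set.
  force(expr)
}

# Offline demonstration: read the option inside and outside the wrapper.
options(byom.install.location = NULL)
seen_inside <- with_byom_location("mldb", getOption("byom.install.location"))
seen_after <- getOption("byom.install.location")
```

With a connection, the second argument would be the scoring call from Example 4 in place of `getOption(...)`.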