PMMLPredict | Supported External Model Types | Teradata Package for Python

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2022-01-14
Product Category: Teradata Vantage

PMML (Predictive Model Markup Language) is a widely used, XML-based standard format for exchanging machine learning models. Many customers train their models in tools external to Vantage, such as scikit-learn. Vantage Analytics lets customers bring these models to Vantage (by inserting the model as a BLOB into a table) and apply them to data stored in Advanced SQL Engine for scoring. Through teradataml, users can score data with these external models by using the PMMLPredict() function.
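Because PMML is an XML-based format, a model file can be inspected with standard tooling before it is loaded into Vantage. The following is a minimal sketch using only the Python standard library. The PMML fragment is hand-written for illustration (its field names are assumptions, and it is not a complete, scoreable model), and the helper name summarize_pmml is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed PMML fragment for illustration only;
# it is not one of the example models shipped with teradataml.
PMML_TEXT = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="illustrative model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="sepal_length" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string"/>
  </DataDictionary>
  <RegressionModel functionName="classification">
    <MiningSchema>
      <MiningField name="sepal_length"/>
      <MiningField name="species" usageType="target"/>
    </MiningSchema>
  </RegressionModel>
</PMML>
"""


def summarize_pmml(text):
    """Return the PMML version, data field names, and model element names."""
    root = ET.fromstring(text)

    def local(tag):
        # ElementTree reports tags as "{namespace-uri}LocalName".
        return tag.split("}", 1)[-1]

    # Reuse whatever namespace the document declares on its root element.
    ns = {"p": root.tag[1:].split("}", 1)[0]}
    fields = [f.get("name")
              for f in root.findall("p:DataDictionary/p:DataField", ns)]

    # Top-level children that are not models in a PMML document.
    non_models = {"Header", "MiningBuildTask", "DataDictionary",
                  "TransformationDictionary", "Extension"}
    models = [local(c.tag) for c in root if local(c.tag) not in non_models]
    return {"version": root.get("version"), "fields": fields, "models": models}


print(summarize_pmml(PMML_TEXT))
```

A check like this can confirm that the columns a model expects are present in the table to be scored.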

The following examples show calls to the PMMLPredict() function.

Example Setup

  • Import necessary modules.
    >>> import os, teradataml
    >>> from teradataml.options.configure import configure
    >>> from teradataml import PMMLPredict, DataFrame, load_example_data, save_byom, retrieve_byom
  • Load example data.
    >>> load_example_data("byom", "iris_input")
  • Create teradataml DataFrame object.
    >>> iris_test = DataFrame("iris_input")
  • Set install location of the BYOM functions.
    >>> configure.byom_install_location = "mldb"
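The setup steps above assume an active connection to Vantage, which the listing does not show. The following is a minimal sketch of establishing one with create_context() before loading the example data. The helper name connect_for_byom is hypothetical, and the host name and credentials in the commented call are placeholders.

```python
def connect_for_byom(host, username, password, install_location="mldb"):
    """Connect to Vantage and set the database where BYOM functions are installed."""
    # Imported inside the function so this snippet can be loaded
    # on a machine where teradataml is not installed.
    from teradataml import create_context
    from teradataml.options.configure import configure

    # Open the connection that later teradataml calls will use.
    create_context(host=host, username=username, password=password)

    # Same setting as in the setup step above; "mldb" is the value used there.
    configure.byom_install_location = install_location


# Example call with placeholder values (requires a reachable Vantage system):
# connect_for_byom("vantage.example.com", "dbuser", "dbpassword")
```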

Example 1: Run a query with a GLM model and overwrite cached models

  • Locate the example model file shipped with teradataml.
    >>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_db_glm_model.pmml")
  • Save the model to the byom_models table in Vantage.
    >>> save_byom("iris_db_glm_model", model_file, "byom_models")
  • Retrieve the model from the table.
    >>> modeldata = retrieve_byom("iris_db_glm_model", table_name="byom_models")
  • Pass the output of retrieve_byom() as input to the PMMLPredict() function to score the data.
    >>> result = PMMLPredict(modeldata = modeldata,
    ...                      newdata = iris_test,
    ...                      accumulate = ['id', 'sepal_length', 'petal_length'],
    ...                      overwrite_cached_models = '*')

Example 2: Run a query with an XGBoost model and overwrite cached models

  • Locate the example model file shipped with teradataml.
    >>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_db_xgb_model.pmml")
  • Save the model to the byom_models table in Vantage.
    >>> save_byom("iris_db_xgb_model", model_file, "byom_models")
  • Retrieve the model from the table.
    >>> modeldata = retrieve_byom("iris_db_xgb_model", table_name="byom_models")
  • Pass the output of retrieve_byom() as input to the PMMLPredict() function to score the data.
    >>> result = PMMLPredict(modeldata = modeldata,
    ...                      newdata = iris_test,
    ...                      accumulate = ['id', 'sepal_length', 'petal_length'],
    ...                      overwrite_cached_models = '*')
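
Examples 1 and 2 differ only in the model identifier and the model file name. The following is a minimal sketch of a helper that wraps the shared steps, using only the calls shown in the examples. The helper names example_model_path and score_with_pmml and their parameters are hypothetical, and an active Vantage connection is assumed.

```python
import os


def example_model_path(package_dir, file_name):
    """Build the path <package_dir>/data/models/<file_name>."""
    return os.path.join(package_dir, "data", "models", file_name)


def score_with_pmml(model_id, file_name, newdata, accumulate,
                    table_name="byom_models"):
    """Save a PMML model to Vantage, retrieve it, and score newdata with it."""
    # Imported inside the function so this snippet can be loaded
    # on a machine where teradataml is not installed.
    import teradataml
    from teradataml import PMMLPredict, save_byom, retrieve_byom

    # Locate the example model file shipped with the teradataml package.
    model_file = example_model_path(
        os.path.dirname(teradataml.__file__), file_name)

    # Store the model in the table, then read it back as model data.
    save_byom(model_id, model_file, table_name)
    modeldata = retrieve_byom(model_id, table_name=table_name)

    # Score the data; '*' overwrites all cached models, as in the examples.
    return PMMLPredict(modeldata=modeldata,
                       newdata=newdata,
                       accumulate=accumulate,
                       overwrite_cached_models='*')


# Example calls (require a Vantage connection and the iris_test DataFrame):
# cols = ['id', 'sepal_length', 'petal_length']
# glm = score_with_pmml("iris_db_glm_model", "iris_db_glm_model.pmml", iris_test, cols)
# xgb = score_with_pmml("iris_db_xgb_model", "iris_db_xgb_model.pmml", iris_test, cols)
```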