PMMLPredict

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Release Date: November 2021
Content Type: User Guide
Publication ID: B700-4006-070K
Language: English (United States)

PMML (Predictive Model Markup Language) is a widely used standard serialization format for exchanging machine learning models. Many customers train their models in tools external to Vantage, such as scikit-learn. Vantage Analytics lets customers bring those models to Vantage by inserting the model as a BLOB into a table, and then apply them to data stored in Advanced SQL Engine for scoring. In teradataml, the PMMLPredict() function scores data with these external models.
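Because a PMML model is an XML document, a file can be sanity-checked locally before it is stored in Vantage. The following is a minimal sketch using only the Python standard library. The helper name pmml_version is hypothetical and is not part of teradataml. The sketch only confirms that the file is well-formed XML with a PMML root element; it does not validate the model against the PMML schema.

```python
import xml.etree.ElementTree as ET


def pmml_version(path):
    """Return the version declared on the PMML root element.

    Returns None when the file is not well-formed XML or when its root
    element is not <PMML>. Hypothetical helper, not a teradataml API.
    """
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        return None  # not well-formed XML
    # Namespaced tags look like "{http://www.dmg.org/PMML-4_4}PMML";
    # keep only the part after the closing brace.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name != "PMML":
        return None
    return root.get("version")
```

A return value of None suggests the file should be regenerated or inspected before it is saved as a model.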

The following examples show calls to the PMMLPredict() function.

Example Setup

  • Import necessary modules.
    >>> import os, teradataml
    >>> from teradataml.options.configure import configure
    >>> from teradataml import PMMLPredict, DataFrame, load_example_data, save_byom, retrieve_byom
  • Load example data.
    >>> load_example_data("byom", "iris_input")
  • Create teradataml DataFrame object.
    >>> iris_test = DataFrame("iris_input")
  • Set install location of the BYOM functions.
    >>> configure.byom_install_location = "mldb"
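The setup steps above assume that a connection to Vantage is already open. A minimal sketch of opening one follows, assuming the credentials are kept in environment variables. The variable names TD_HOST, TD_USER, and TD_PASSWORD and both helper functions are hypothetical; only create_context() is a teradataml call.

```python
import os


def connection_args(env):
    """Collect create_context() keyword arguments from a mapping.

    The TD_* names are assumptions made for this sketch.
    """
    required = ("TD_HOST", "TD_USER", "TD_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return {
        "host": env["TD_HOST"],
        "username": env["TD_USER"],
        "password": env["TD_PASSWORD"],
    }


def connect(env=os.environ):
    """Open a Vantage connection; call this before load_example_data()."""
    # Imported here so connection_args() works without teradataml installed.
    from teradataml import create_context
    return create_context(**connection_args(env))
```

Calling connect() once at the start of the session is enough for the steps in the examples.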

Example 1: Run a query with GLM model and overwrite cached models

  • Get the path to the PMML model file shipped with teradataml.
    >>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_db_glm_model.pmml")
  • Save the model.
    >>> save_byom("iris_db_glm_model", model_file, "byom_models")
  • Retrieve the model.
    >>> modeldata = retrieve_byom("iris_db_glm_model", table_name="byom_models")
  • Pass the output of the retrieve_byom() function as input to the PMMLPredict() function to score the data.
    >>> result = PMMLPredict(modeldata = modeldata,
                             newdata = iris_test,
                             accumulate = ['id', 'sepal_length', 'petal_length'],
                             overwrite_cached_models = '*')

Example 2: Run a query with XGBoost model and overwrite cached models

  • Get the path to the PMML model file shipped with teradataml.
    >>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_db_xgb_model.pmml")
  • Save the model.
    >>> save_byom("iris_db_xgb_model", model_file, "byom_models")
  • Retrieve the model.
    >>> modeldata = retrieve_byom("iris_db_xgb_model", table_name="byom_models")
  • Pass the output of the retrieve_byom() function as input to the PMMLPredict() function to score the data.
    >>> result = PMMLPredict(modeldata = modeldata,
                             newdata = iris_test,
                             accumulate = ['id', 'sepal_length', 'petal_length'],
                             overwrite_cached_models = '*')
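
Both examples follow the same save, retrieve, and score sequence, so the steps can be wrapped in a function. The following is a sketch only: the names bundled_model_path and score_with_pmml are hypothetical, and the teradataml calls inside are the same ones used in the examples, with the same arguments. It assumes an open Vantage connection and that configure.byom_install_location has been set as in Example Setup.

```python
import os


def bundled_model_path(package_dir, model_id):
    """Build <package_dir>/data/models/<model_id>.pmml, as in the examples."""
    return os.path.join(package_dir, "data", "models", model_id + ".pmml")


def score_with_pmml(model_id, model_file, newdata, accumulate,
                    table_name="byom_models"):
    """Save a PMML file to Vantage, read it back, and score newdata.

    Hypothetical wrapper around the three calls shown in the examples.
    Saving a model id that already exists in the table is not handled here.
    """
    # Imported here so bundled_model_path() works without teradataml installed.
    from teradataml import PMMLPredict, retrieve_byom, save_byom

    save_byom(model_id, model_file, table_name)
    modeldata = retrieve_byom(model_id, table_name=table_name)
    return PMMLPredict(modeldata=modeldata,
                       newdata=newdata,
                       accumulate=accumulate,
                       overwrite_cached_models="*")
```

With this helper, Example 2 reduces to building the path with bundled_model_path(os.path.dirname(teradataml.__file__), "iris_db_xgb_model") and passing it, together with iris_test and the accumulate list, to score_with_pmml(). The scored rows are expected to be available as the result attribute of the returned object, which is the usual convention for teradataml analytic functions.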