PMMLPredict | Supported External Model Types | Teradata Package for Python

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-02-17

PMML (Predictive Model Markup Language) is a widely used, XML-based standard format for exchanging machine learning models. Many customers train their models in tools external to Vantage, such as scikit-learn. Vantage Analytics lets customers bring these models to Vantage (by inserting the model as a BLOB into a table) and apply them to data stored in Analytics Database for scoring. In teradataml, the PMMLPredict() function scores data with these external models.
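To make the "model as a BLOB" idea concrete, the following minimal sketch treats a PMML document as plain bytes and reads a few facts back out of it with the Python standard library. The PMML text is a hand-written toy fragment (an assumption for illustration, not a model exported by any real tool and not a complete scoring model), and describe_pmml is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# A deliberately tiny, hand-written PMML fragment (illustrative only; it has
# no model element, so it could not be used for scoring).
PMML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="toy example"/>
  <DataDictionary numberOfFields="2">
    <DataField name="sepal_length" optype="continuous" dataType="double"/>
    <DataField name="species" optype="categorical" dataType="string"/>
  </DataDictionary>
</PMML>
"""


def describe_pmml(pmml_bytes):
    """Return the PMML version and the declared field names from raw bytes."""
    root = ET.fromstring(pmml_bytes)
    # The root tag looks like "{namespace}PMML"; reuse that namespace in queries.
    ns = {"pmml": root.tag[1:].split("}")[0]}
    fields = [
        field.get("name")
        for field in root.findall("pmml:DataDictionary/pmml:DataField", ns)
    ]
    return root.get("version"), fields


# These bytes stand in for what would be stored in the BLOB column of a model table.
model_blob = PMML_TEXT.encode("utf-8")
version, fields = describe_pmml(model_blob)
print(version, fields)
```

Because the model travels as ordinary bytes, the tool that produced it does not have to be installed where the model is later scored.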

The following examples show PMMLPredict() function calls.

Example Setup

  • Import the necessary modules.
    >>> import os, teradataml
    >>> from teradataml.options.configure import configure
    >>> from teradataml import DataFrame, PMMLPredict, load_example_data, save_byom, retrieve_byom
  • Load example data.
    >>> load_example_data("byom", "iris_input")
  • Create a teradataml DataFrame object.
    >>> iris_test = DataFrame("iris_input")
  • Set the install location of the BYOM functions.
    >>> configure.byom_install_location = "mldb"
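The setup steps above assume that a connection to Vantage is already open. As a minimal sketch of one way to open it, the helper below collects arguments for the teradataml create_context() function from environment variables. The variable names VANTAGE_HOST, VANTAGE_USER and VANTAGE_PASSWORD and both helper functions are hypothetical, chosen only for this illustration.

```python
import os


def connection_kwargs(env=os.environ):
    """Collect create_context() arguments from environment variables.

    The variable names used here are hypothetical; substitute whatever
    your site uses to provide credentials.
    """
    names = ("VANTAGE_HOST", "VANTAGE_USER", "VANTAGE_PASSWORD")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing connection settings: " + ", ".join(missing))
    return {
        "host": env["VANTAGE_HOST"],
        "username": env["VANTAGE_USER"],
        "password": env["VANTAGE_PASSWORD"],
    }


def connect(env=os.environ):
    """Open a teradataml connection using the settings found in env."""
    # Imported lazily so this module can be loaded without teradataml installed.
    from teradataml import create_context

    return create_context(**connection_kwargs(env))
```

Calling connect() once before load_example_data() would establish the session that the remaining setup steps rely on.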

Example 1: Run a query with a GLM model and overwrite cached models

  • Locate the GLM model file that ships with teradataml.
    >>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_db_glm_model.pmml")
  • Save the model to the byom_models table in Vantage.
    >>> save_byom("iris_db_glm_model", model_file, "byom_models")
  • Retrieve the model from the byom_models table.
    >>> modeldata = retrieve_byom("iris_db_glm_model", table_name="byom_models")
  • Pass the output of retrieve_byom() as input to the PMMLPredict() function to score the data.
    >>> result = PMMLPredict(modeldata = modeldata,
                             newdata = iris_test,
                             accumulate = ['id', 'sepal_length', 'petal_length'],
                             overwrite_cached_models = '*')

Example 2: Run a query with an XGBoost model and overwrite cached models

  • Locate the XGBoost model file that ships with teradataml.
    >>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_db_xgb_model.pmml")
  • Save the model to the byom_models table in Vantage.
    >>> save_byom("iris_db_xgb_model", model_file, "byom_models")
  • Retrieve the model from the byom_models table.
    >>> modeldata = retrieve_byom("iris_db_xgb_model", table_name="byom_models")
  • Pass the output of retrieve_byom() as input to the PMMLPredict() function to score the data.
    >>> result = PMMLPredict(modeldata = modeldata,
                             newdata = iris_test,
                             accumulate = ['id', 'sepal_length', 'petal_length'],
                             overwrite_cached_models = '*')
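The two examples differ only in the model identifier and file name, so the shared steps can be sketched as one helper. This is a minimal sketch, not part of teradataml: bundled_model_path and score_bundled_model are hypothetical names, and the helper assumes the connection and setup steps have already been run. It uses only the calls shown in the examples (save_byom, retrieve_byom and PMMLPredict) with the same arguments.

```python
import os


def bundled_model_path(package_dir, file_name):
    """Build the path to a PMML file under <package_dir>/data/models."""
    path = os.path.join(package_dir, "data", "models", file_name)
    if not os.path.isfile(path):
        # Fail early, before any call to Vantage is made.
        raise FileNotFoundError(path)
    return path


def score_bundled_model(model_id, file_name, newdata, accumulate,
                        table_name="byom_models"):
    """Save, retrieve and score one bundled PMML model, as in the examples."""
    # Imported lazily so this module can be loaded without teradataml installed.
    import teradataml
    from teradataml import PMMLPredict, retrieve_byom, save_byom

    package_dir = os.path.dirname(teradataml.__file__)
    model_file = bundled_model_path(package_dir, file_name)
    save_byom(model_id, model_file, table_name)
    modeldata = retrieve_byom(model_id, table_name=table_name)
    return PMMLPredict(modeldata=modeldata,
                       newdata=newdata,
                       accumulate=accumulate,
                       overwrite_cached_models="*")
```

With an active connection, a call such as score_bundled_model("iris_db_xgb_model", "iris_db_xgb_model.pmml", iris_test, ['id', 'sepal_length', 'petal_length']) would follow the same path as Example 2, and the scored rows should then be available through the result attribute of the returned object.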