H2OPredict | Supported External Model Types | Teradata Package for Python - H2OPredict

Teradata® Package for Python User Guide

Product

Teradata Package for Python

Release Number

17.10

Published

May 2022

Language

English (United States)

Last Update

2022-08-18

dita:id

B700-4006

Product Category

Teradata Vantage

H2OPredict scores each row of the input table using a model that was previously trained in H2O and then loaded into the database. The model is exported in the MOJO interchange format, and the user loads it as a BLOB into a table in the Teradata database.

The following examples show how to call the H2OPredict() function.

Example Setup

Import necessary modules.

>>> import os, teradataml

>>> from teradataml.options.configure import configure

>>> from teradataml import H2OPredict, DataFrame, load_example_data, save_byom, retrieve_byom

Load example data.

>>> load_example_data("byom", "iris_input")

Create teradataml DataFrame object.
```
>>> iris_test = DataFrame("iris_input")
```

Set install location of the BYOM functions.

>>> # Set install location of BYOM functions.
>>> configure.byom_install_location = "mldb"

Example 1: Run a query with GLM model and overwrite cached models

The query also passes the model_type, enable_options, and model_output_fields arguments.

Build the path to the model file that ships with teradataml.

>>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_mojo_glm_h2o_model")
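
The path construction above can be wrapped with an existence check so that a wrong model name fails early, before anything is sent to Vantage. The following is a minimal sketch in plain Python; `bundled_model_path` is a hypothetical helper, not part of teradataml, and `package_dir` is assumed to be the value of `os.path.dirname(teradataml.__file__)`.

```python
import os


def bundled_model_path(package_dir, model_name):
    """Build <package_dir>/data/models/<model_name> and verify it exists.

    Hypothetical helper for illustration; package_dir would typically be
    os.path.dirname(teradataml.__file__).
    """
    path = os.path.join(package_dir, "data", "models", model_name)
    # Fail early with a clear message if the file is not where we expect it.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No model file at {path}")
    return path
```

The returned path can then be passed as the model file argument of save_byom.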

Save the model.

>>> save_byom("iris_mojo_glm_h2o_model", model_file, "byom_models")

Retrieve the model.

>>> modeldata = retrieve_byom("iris_mojo_glm_h2o_model", table_name="byom_models")

Pass the output of the retrieve_byom API as an input to the H2OPredict function to score data.

>>> result = H2OPredict(newdata=iris_test,
                        newdata_partition_column='id',
                        newdata_order_column='id',
                        modeldata=modeldata,
                        modeldata_order_column='model_id',
                        model_output_fields=['label', 'classProbabilities'],
                        accumulate=['id', 'sepal_length', 'petal_length'],
                        overwrite_cached_models='*',
                        enable_options='stageProbabilities',
                        model_type='OpenSource')

Print the results.
```
print(result.result)
```
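
The keyword arguments in the H2OPredict call above are reused unchanged in the later examples, so it can be convenient to collect them in one place. The sketch below builds them as a plain dictionary. `h2o_predict_kwargs` is a hypothetical helper, not part of teradataml, and its values simply mirror the ones used in this example.

```python
def h2o_predict_kwargs(newdata, modeldata, model_type="OpenSource"):
    """Collect the H2OPredict arguments used in this example (hypothetical helper)."""
    return {
        "newdata": newdata,
        "newdata_partition_column": "id",
        "newdata_order_column": "id",
        "modeldata": modeldata,
        "modeldata_order_column": "model_id",
        "model_output_fields": ["label", "classProbabilities"],
        "accumulate": ["id", "sepal_length", "petal_length"],
        "overwrite_cached_models": "*",
        "enable_options": "stageProbabilities",
        "model_type": model_type,
    }


# With a live Vantage connection this would be used as (assumption, not run here):
#     result = H2OPredict(**h2o_predict_kwargs(iris_test, modeldata))
```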

Example 2: Run a query with XGBoost model and overwrite cached models

The query also passes the model_type, enable_options, and model_output_fields arguments.

Build the path to the model file that ships with teradataml.

>>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_mojo_xgb_h2o_model")

Save the model.

>>> save_byom("iris_mojo_xgb_h2o_model", model_file, "byom_models")

Retrieve the model.

>>> modeldata = retrieve_byom("iris_mojo_xgb_h2o_model", table_name="byom_models")

Pass the output of the retrieve_byom API as an input to the H2OPredict function to score data.

>>> result = H2OPredict(newdata=iris_test,
                        newdata_partition_column='id',
                        newdata_order_column='id',
                        modeldata=modeldata,
                        modeldata_order_column='model_id',
                        model_output_fields=['label', 'classProbabilities'],
                        accumulate=['id', 'sepal_length', 'petal_length'],
                        overwrite_cached_models='*',
                        enable_options='stageProbabilities',
                        model_type='OpenSource')

Print the results.
```
print(result.result)
```
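
Examples 1 and 2 follow the same save, retrieve, and score sequence and differ only in the model id. A minimal sketch of that sequence as a single function follows. The function name, and the choice to pass `save`, `retrieve`, and `predict` in as callables, are assumptions made so the sketch runs without a Vantage connection; in practice these would be save_byom, retrieve_byom, and H2OPredict.

```python
def score_with_model(model_id, model_file, table_name, newdata,
                     save, retrieve, predict, **predict_kwargs):
    """Save a model, retrieve it, and score newdata with it (hypothetical helper).

    save, retrieve and predict are injected callables standing in for
    save_byom, retrieve_byom and H2OPredict.
    """
    # Step 1: store the model file under model_id in table_name.
    save(model_id, model_file, table_name)
    # Step 2: fetch the stored model back as the modeldata input.
    modeldata = retrieve(model_id, table_name=table_name)
    # Step 3: score the rows of newdata with the retrieved model.
    return predict(newdata=newdata, modeldata=modeldata, **predict_kwargs)
```

Injecting the three callables keeps the sequence easy to exercise with stand-in functions before pointing it at a real system.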

Example 3: Run a query with a licensed model

This example runs a query with the licensed model 'licensed_model1', which is stored in the table 'byom_licensed_models'. The associated license key is stored in the column 'license_key' of the table 'license' in the schema 'mldb'.

Retrieve the model.

>>> modeldata = retrieve_byom('licensed_model1',
                              table_name='byom_licensed_models',
                              license='license_key',
                              is_license_column=True,
                              license_table_name='license',
                              license_schema_name='mldb')

Pass the output of the retrieve_byom API as an input to the H2OPredict function to score data.

>>> result = H2OPredict(newdata=iris_test,
                        newdata_partition_column='id',
                        newdata_order_column='id',
                        modeldata=modeldata,
                        modeldata_order_column='model_id',
                        model_output_fields=['label', 'classProbabilities'],
                        accumulate=['id', 'sepal_length', 'petal_length'],
                        overwrite_cached_models='*',
                        enable_options='stageProbabilities',
                        model_type='OpenSource')

Print the results.
```
print(result.result)
```