Teradata® Package for Python Function Reference | 17.10 - PMMLPredict
- Product: Teradata Package for Python
- Release Number: 17.10
- Published: April 2022
- Language: English (United States)
- Last Update: 2022-08-19
- Lifecycle: previous
- Product Category: Teradata Vantage
- teradataml.analytics.byom.PMMLPredict.__init__ = __init__(self, modeldata=None, newdata=None, accumulate=None, model_output_fields=None, overwrite_cached_models=None, newdata_partition_column='ANY', newdata_order_column=None, modeldata_order_column=None)
- DESCRIPTION:
This function scores data in Vantage with a model that was
created outside Vantage and exported to Vantage in PMML format.
PARAMETERS:
modeldata:
Required Argument.
Specifies the model teradataml DataFrame to be used for scoring.
modeldata_order_column:
Optional Argument.
Specifies the Order By columns for "modeldata".
Provide a list when ordering by multiple columns.
Types: str OR list of Strings (str)
newdata:
Required Argument.
Specifies the input teradataml DataFrame that contains the data to be scored.
newdata_partition_column:
Optional Argument.
Specifies the Partition By columns for "newdata".
Provide a list when partitioning by multiple columns.
Default Value: ANY
Types: str OR list of Strings (str)
newdata_order_column:
Optional Argument.
Specifies the Order By columns for "newdata".
Provide a list when ordering by multiple columns.
Types: str OR list of Strings (str)
accumulate:
Required Argument.
Specifies the names of the input columns from "newdata" DataFrame
to copy to the output DataFrame.
Types: str OR list of Strings (str)
model_output_fields:
Optional Argument.
Specifies the fields of the JSON output to return as individual
columns instead of the entire JSON report.
Types: str OR list of Strings (str)
overwrite_cached_models:
Optional Argument.
Specifies the name of the model to remove from the cache.
Use '*' to remove all cached models.
Types: str OR list of Strings (str)
RETURNS:
Instance of PMMLPredict.
Output teradataml DataFrames can be accessed using attribute
references, such as PMMLPredictObj.<attribute_name>.
Output teradataml DataFrame attribute name is:
result
RAISES:
TeradataMlException, TypeError, ValueError
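The "str OR list of Strings (str)" convention used by several arguments above can be illustrated with a small client-side pre-check. This is only a sketch: as_str_list is a hypothetical helper, not part of teradataml, and it does not reproduce the library's own validation. It only shows how a single column name and a list of names can be treated uniformly before calling PMMLPredict.

```python
def as_str_list(value, arg_name, required=False):
    """Normalize a 'str OR list of str' argument to a list of strings.

    Hypothetical helper for illustration; teradataml performs its own checks.
    """
    if value is None:
        if required:
            raise ValueError(f"'{arg_name}' is a required argument")
        return None
    # A single column name is treated as a one-element list.
    if isinstance(value, str):
        value = [value]
    if not isinstance(value, (list, tuple)) or not all(isinstance(v, str) for v in value):
        raise TypeError(f"'{arg_name}' must be a str or a list of str")
    if not value or any(not v.strip() for v in value):
        raise ValueError(f"'{arg_name}' must not contain empty column names")
    return list(value)
```

For example, as_str_list('id', 'accumulate', required=True) and as_str_list(['id'], 'accumulate', required=True) both yield ['id'].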
EXAMPLES:
# Create the following table on Vantage.
# CREATE SET TABLE pmml_models(
# model_id VARCHAR(40),
# model BLOB)
# PRIMARY INDEX (model_id);
# To load a PMML model into a Vantage table, run the following commands through Teradata
# Studio or Basic Teradata Query (BTEQ). 'load_pmml_models.txt' and the model (.pmml)
# files can be found under 'data/scripts' in the teradataml installation directory.
# This text file and the PMML models to be loaded must be in the same directory.
# .import vartext file load_pmml_models.txt
# .repeat *
# USING (c1 VARCHAR(40), c2 BLOB AS DEFERRED BY NAME)
# INSERT INTO pmml_models(:c1, :c2);
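When loading your own models rather than the shipped examples, the import file can be generated instead of written by hand. The sketch below assumes the file holds one line per model in the form model_id|file_name, with '|' as the vartext delimiter and the file stem used as the model_id; check the shipped 'load_pmml_models.txt' to confirm the layout before relying on it. write_load_file is a hypothetical helper, not part of teradataml.

```python
from pathlib import Path


def write_load_file(model_dir, out_name="load_pmml_models.txt", delimiter="|"):
    """Write one 'model_id<delimiter>file_name' line per .pmml file in model_dir.

    Assumed layout: first field maps to model_id (VARCHAR(40)), second field
    is the file name read by 'BLOB AS DEFERRED BY NAME'.
    """
    model_dir = Path(model_dir)
    lines = []
    for pmml in sorted(model_dir.glob("*.pmml")):
        model_id = pmml.stem
        if len(model_id) > 40:
            # model_id is declared as VARCHAR(40) in the table definition.
            raise ValueError(f"model_id too long for VARCHAR(40): {model_id}")
        lines.append(f"{model_id}{delimiter}{pmml.name}")
    out_path = model_dir / out_name
    # The file is written next to the models, as the loading notes require.
    out_path.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return out_path
```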
from teradataml import PMMLPredict, DataFrame, load_example_data, create_context
# Load example data.
load_example_data("byom", "iris_test")
# Create teradataml DataFrame objects.
iris_test = DataFrame.from_table("iris_test")
# Example 1: Score with a GLM model and set "overwrite_cached_models"
# to '*', which erases the entire model cache.
modeldata = DataFrame.from_query("select * from pmml_models where model_id='iris_db_glm_model'")
result = PMMLPredict(
    modeldata=modeldata,
    newdata=iris_test,
    accumulate=['id', 'sepal_length', 'petal_length'],
    overwrite_cached_models='*',
)
# Print the results.
print(result.result)
# Example 2: Score with an XGBoost model and set "overwrite_cached_models"
# to '*', which erases the entire model cache.
modeldata = DataFrame.from_query("select * from pmml_models where model_id='iris_db_xgb_model'")
result = PMMLPredict(
    modeldata=modeldata,
    newdata=iris_test,
    accumulate=['id', 'sepal_length', 'petal_length'],
    overwrite_cached_models='*',
)
# Print the results.
print(result.result)
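Both examples build the model query by writing the model_id directly into the SQL string. When the model_id comes from a variable, a small helper keeps the query consistent and doubles any embedded single quotes. This is a minimal sketch; model_query is a hypothetical helper, not a teradataml function, and the table name default simply mirrors the pmml_models table used above.

```python
def model_query(model_id, table="pmml_models"):
    """Return the SELECT statement that fetches one model row by model_id."""
    if not isinstance(model_id, str) or not model_id:
        raise ValueError("model_id must be a non-empty string")
    # Double single quotes so the value stays inside the SQL string literal.
    escaped = model_id.replace("'", "''")
    return f"select * from {table} where model_id='{escaped}'"
```

The returned string can be passed to DataFrame.from_query in the same way as the literal queries in Example 1 and Example 2.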