BYOMPredictor.predict Method | teradataml Azure Extension Library | Teradata Vantage

Teradata Vantage™ - API Integration Guide for Cloud Machine Learning

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Vantage
Release Number: 1.4
Published: September 2023
Language: English (United States)
Last Update: 2023-09-28

Use the BYOMPredictor.predict method to score data in Vantage with a model that has been created outside Vantage and exported to Vantage using the deploy method.

This method supports prediction using models in the following formats:
  • PMML
  • ONNX
  • MOJO (H2O)
Required Arguments:
  • input: Specifies the teradataml DataFrame containing the input test data.
  • input_cols: Specifies the name(s) of input teradataml DataFrame column(s) to copy to the output.
Optional Arguments:
  • **byom_kwargs: Specifies additional keyword arguments accepted by PMMLPredict, H2OPredict, or ONNXPredict, including the following:
    • model_output_fields: Specifies the fields of the JSON output to return as individual columns instead of the entire json_report.
    • overwrite_cached_models: Specifies the model name to remove from the cache. Default value is 'false'.
      This argument allows the following values: true, t, yes, y, 1, false, f, no, n, 0, *, current_cached_model.
      Use the value '*' to remove all cached models.
    • modeldata_order_column: Specifies the Order By columns for modeldata. Provide a list when ordering by multiple columns.
      This argument is for PMML and H2O only.
    • newdata_partition_column: Specifies the Partition By columns for newdata. Provide a list when partitioning by multiple columns.
      This argument is for PMML and H2O only.
    • newdata_order_column: Specifies the Order By columns for newdata. Provide a list when ordering by multiple columns.
      This argument is for PMML and H2O only.
    • model_type: Specifies the model type for H2O model prediction.
      Permitted values include 'DAI' and 'OpenSource' (the default).
      This argument is for H2O only.
    • enable_options: Specifies the options to enable for H2O model prediction.
      Permitted values include 'contributions', 'stageProbabilities', and 'leafNodeAssignments'.
    • show_model_input_fields_map: Specifies whether to show the model_input_fields_map in the output: the default mapping derived from the input model, or the expanded mapping when model_input_fields_map is provided.
      Default value is 'False'. When set to 'True', the additional information is returned in the output DataFrame.
      This argument is for ONNX only.
    • model_input_fields_map: Specifies the mapping of input columns to tensor input names.
      This argument is for ONNX only.
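Several of the keyword arguments above apply only to certain model formats, so it can help to check a set of options before calling predict. The following is a minimal sketch of such a check. The helper check_byom_kwargs and both lookup tables are hypothetical and are not part of tdapiclient or teradataml; they only restate the "for ... only" notes and the permitted values listed above.

```python
# Hypothetical helper; not part of tdapiclient or teradataml.

# Model formats each keyword is documented for, taken from the notes above.
# Keywords without an explicit "for ... only" note are left unrestricted.
FORMAT_ONLY = {
    "modeldata_order_column": {"PMML", "H2O"},
    "newdata_partition_column": {"PMML", "H2O"},
    "newdata_order_column": {"PMML", "H2O"},
    "model_type": {"H2O"},
    "show_model_input_fields_map": {"ONNX"},
    "model_input_fields_map": {"ONNX"},
}

# Values listed above for overwrite_cached_models.
OVERWRITE_VALUES = {
    "true", "t", "yes", "y", "1",
    "false", "f", "no", "n", "0",
    "*", "current_cached_model",
}


def check_byom_kwargs(model_format, **byom_kwargs):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for name in byom_kwargs:
        allowed = FORMAT_ONLY.get(name)
        if allowed is not None and model_format not in allowed:
            formats = " and ".join(sorted(allowed))
            problems.append(f"{name} is for {formats} only, not {model_format}")
    overwrite = byom_kwargs.get("overwrite_cached_models")
    if overwrite is not None and str(overwrite).lower() not in OVERWRITE_VALUES:
        problems.append(f"overwrite_cached_models does not allow {overwrite!r}")
    return problems


# model_type is an H2O-only argument, so it is reported for an ONNX model.
print(check_byom_kwargs("ONNX", model_type="DAI", overwrite_cached_models="*"))
```

The check is advisory only: it reflects this page's notes, not the validation that the underlying predict functions perform.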

Examples

  • Import necessary packages.
    from tdapiclient import create_tdapi_context, TDApiClient
    from teradataml import DataFrame
  • Create the TDAPI context.
    context = create_tdapi_context("azure", "/td-tables")
  • Create TDApiClient object.
    tdapiclient = TDApiClient(context)
  • Create teradataml DataFrame.
    train = DataFrame(table_name='train_data')
  • Create a SKLearn model.
    ScriptRunConfig accepts the same parameters as the Azure Machine Learning ScriptRunConfig.
    skLearnObject = tdapiclient.ScriptRunConfig(arguments=[train])
  • Train the model in Azure Machine Learning.
    run = skLearnObject.fit(mount=True)
  • Register the model in Azure Machine Learning.
    model = run.register_model(model_name='example', model_path='outputs/example.pmml')
  • Deploy model to Vantage.
    model_predictor = skLearnObject.deploy(model, platform="vantage")
  • Score the model in Vantage and copy the "id" column of the test data to the output.
    test = DataFrame(table_name='test_data')
    model_predictor.predict(test, ['id'])
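Optional keyword arguments are passed in the same predict call. The sketch below shows one way to do that, under stated assumptions: the field name 'prediction' is hypothetical and depends on the model, and the stub class only stands in for the predictor returned by deploy so that the snippet runs without a Vantage connection.

```python
def score_with_options(predictor, test_df):
    """Score test_df, copy the 'id' column, and split out one JSON field."""
    return predictor.predict(
        test_df,
        ["id"],
        model_output_fields=["prediction"],  # hypothetical field name
        overwrite_cached_models="*",         # remove all cached models
    )


class _StubPredictor:
    """Stand-in for the object returned by deploy(); it only records the call."""

    def predict(self, input, input_cols, **byom_kwargs):
        return {"input": input, "input_cols": input_cols, "byom_kwargs": byom_kwargs}


# With a real deployment this would be score_with_options(model_predictor, test).
result = score_with_options(_StubPredictor(), "test_data")
print(result["byom_kwargs"])
```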