Use the BYOMPredictor.predict method to score data in Vantage with a model that has been created outside Vantage and exported to Vantage using the deploy method.
This method supports prediction using models in the following formats:
- PMML
- ONNX
- MOJO (H2O)
Required Arguments:
- input: Specifies the teradataml DataFrame containing the input test data.
- input_cols: Specifies the name(s) of input teradataml DataFrame column(s) to copy to the output.
Optional Arguments:
- **byom_kwargs: Specifies additional keyword arguments accepted by PMMLPredict, H2OPredict, or ONNXPredict, including the following:
- model_output_fields: Specifies the fields of the JSON output to return as individual columns instead of the entire json_report.
- overwrite_cached_models: Specifies the model name that needs to be removed from the cache. Default value is 'false'. This argument allows the following values: true, t, yes, y, 1, false, f, no, n, 0, *, current_cached_model. Use the value '*' to remove all cached models.
- modeldata_order_column: Specifies Order By columns for modeldata. Values to this argument can be provided as a list if multiple columns are used for ordering. This argument is for PMML and H2O only.
- newdata_partition_column: Specifies Partition By columns for newdata. Values to this argument can be provided as a list if multiple columns are used for partitioning. This argument is for PMML and H2O only.
- newdata_order_column: Specifies Order By columns for newdata. Values to this argument can be provided as a list if multiple columns are used for ordering. This argument is for PMML and H2O only.
- model_type: Specifies the model type for H2O model prediction.
Permitted values include 'DAI', 'OpenSource' (default value).
This argument is for H2O only.
- enable_options: Specifies the options to be enabled for H2O model prediction.
Permitted values include 'contributions', 'stageProbabilities', 'leafNodeAssignments'.
- show_model_input_fields_map: Specifies whether to show the default or the expanded model_input_fields_map. The default map is based on the input model; the expanded map is based on the "model_input_fields_map" argument.
Default value is 'False'. When set to 'True', additional information is shared through the output DataFrame.
This argument is for ONNX only.
- model_input_fields_map: Specifies the mapping of input columns to tensor input names. This argument is for ONNX only.
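Because several of the keyword arguments above apply to only some model formats, it can help to check them before calling predict. The following is a minimal sketch of such a check, not part of tdapiclient or teradataml. The helper name check_byom_kwargs and the format labels 'PMML', 'H2O', and 'ONNX' are hypothetical. It assumes that model_output_fields and overwrite_cached_models apply to every format, and that enable_options, like model_type, is H2O only.

```python
# Hypothetical helper: not part of tdapiclient or teradataml.
# Groups the optional keyword arguments by the model formats they apply to.

# Assumption: no format restriction is stated for these two arguments.
COMMON = {"model_output_fields", "overwrite_cached_models"}
# Documented as "for PMML and H2O only".
PMML_H2O_ONLY = {
    "modeldata_order_column",
    "newdata_partition_column",
    "newdata_order_column",
}
# model_type is documented as H2O only; enable_options is assumed to be.
H2O_ONLY = {"model_type", "enable_options"}
# Documented as "for ONNX only".
ONNX_ONLY = {"show_model_input_fields_map", "model_input_fields_map"}


def check_byom_kwargs(model_format, **byom_kwargs):
    """Return byom_kwargs unchanged, or raise ValueError if any
    argument does not apply to the given model format."""
    fmt = model_format.upper()
    if fmt not in {"PMML", "H2O", "ONNX"}:
        raise ValueError(f"Unknown model format: {model_format!r}")

    allowed = set(COMMON)
    if fmt in ("PMML", "H2O"):
        allowed |= PMML_H2O_ONLY
    if fmt == "H2O":
        allowed |= H2O_ONLY
    if fmt == "ONNX":
        allowed |= ONNX_ONLY

    unexpected = sorted(set(byom_kwargs) - allowed)
    if unexpected:
        raise ValueError(
            f"Arguments not applicable to {fmt} models: {', '.join(unexpected)}"
        )
    return byom_kwargs


# An H2O-only argument passes for H2O but is rejected for PMML.
h2o_kwargs = check_byom_kwargs("H2O", model_type="DAI", newdata_order_column=["id"])
try:
    check_byom_kwargs("PMML", model_type="DAI")
except ValueError as err:
    print(err)
```

The checked dictionary can then be unpacked into the call, for example predict(test, ['id'], **h2o_kwargs).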
Examples
- Import necessary packages.
from tdapiclient import create_tdapi_context, TDApiClient
from teradataml import DataFrame
- Create the TDAPI context.
context = create_tdapi_context("azure", "/td-tables")
- Create TDApiClient object.
tdapiclient = TDApiClient(context)
- Create teradataml DataFrame.
train = DataFrame(table_name='train_data')
- Create a SKLearn model.
ScriptRunConfig accepts the same parameters as the Azure Machine Learning ScriptRunConfig.
skLearnObject = tdapiclient.ScriptRunConfig(arguments=[train])
- Train the model in Azure Machine Learning.
run = skLearnObject.fit(mount=True)
- Register the model in Azure Machine Learning.
model = run.register_model(model_name='example', model_path='outputs/example.pmml')
- Deploy model to Vantage.
model_predictor = skLearnObject.deploy(model, platform="vantage")
- Score the model in Vantage and copy the "id" column of the test data to the output.
test = DataFrame(table_name='test_data')
model_predictor.predict(test, ['id'])
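Optional keyword arguments are passed after the two required arguments. The sketch below shows only the shape of such a call. So that it runs without a Vantage connection, it uses a hypothetical stand-in class, RecordingPredictor, in place of the object returned by deploy, and a plain string in place of the teradataml DataFrame. Neither stand-in is part of tdapiclient.

```python
class RecordingPredictor:
    """Hypothetical stand-in for the predictor returned by deploy().

    It performs no scoring; it only records how predict was called.
    """

    def predict(self, input, input_cols, **byom_kwargs):
        return {
            "input": input,
            "input_cols": input_cols,
            "byom_kwargs": byom_kwargs,
        }


model_predictor = RecordingPredictor()
test = "test_data"  # placeholder for DataFrame(table_name='test_data')

# Required arguments first, then optional keyword arguments.
call = model_predictor.predict(
    test,
    ["id"],
    overwrite_cached_models="*",
    newdata_order_column=["id"],
)
print(call["byom_kwargs"])
```

With the real predictor from the deploy step, the same arguments would be passed to model_predictor.predict in the same way.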