H2OPredict scores each row of the input table using a model that was previously trained in H2O and then loaded into the database. The model is exported in the MOJO interchange format, and the user loads it as a blob into a table in the Teradata database.
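To make "loaded as a blob" concrete, the following is a minimal sketch using only the Python standard library. It assumes that storing a model as a blob amounts to keeping the file's raw bytes unchanged. The helper name read_model_blob and the placeholder file are hypothetical and are not part of teradataml; in practice a real MOJO file exported from H2O would take the placeholder's place.

```python
import os
import tempfile


def read_model_blob(model_file):
    """Return the raw bytes of a model file, unchanged (hypothetical helper)."""
    with open(model_file, "rb") as handle:
        return handle.read()


# Stand-in file with placeholder bytes; this is NOT a real MOJO model.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_model")
    with open(demo_path, "wb") as handle:
        handle.write(b"\x00placeholder-model-bytes")
    blob = read_model_blob(demo_path)

# The blob is an opaque byte string; nothing is parsed or decoded on the client.
print(type(blob).__name__, len(blob))
```

Reading in binary mode matters here: text mode could alter or fail on the bytes, and the database needs the file exactly as H2O wrote it.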
The following examples show how to call the H2OPredict() function.
Example Setup
- Import necessary modules.
>>> import os, teradataml
>>> from teradataml.options.configure import configure
>>> from teradataml import H2OPredict, DataFrame, load_example_data, save_byom, retrieve_byom
- Load example data.
>>> load_example_data("byom", "iris_input")
- Create teradataml DataFrame object.
>>> iris_test = DataFrame("iris_input")
- Set install location of the BYOM functions.
>>> configure.byom_install_location = "mldb"
Example 1: Run a query with a GLM model and overwrite cached models
The query also passes the model_type, enable_options, and model_output_fields arguments.
- Locate the model file bundled with teradataml.
>>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_mojo_glm_h2o_model")
- Save the model.
>>> save_byom("iris_mojo_glm_h2o_model", model_file, "byom_models")
- Retrieve the model.
>>> modeldata = retrieve_byom("iris_mojo_glm_h2o_model", table_name="byom_models")
- Pass the output of the retrieve_byom API as input to the H2OPredict function to score the data.
>>> result = H2OPredict(newdata=iris_test,
...                     newdata_partition_column='id',
...                     newdata_order_column='id',
...                     modeldata=modeldata,
...                     modeldata_order_column='model_id',
...                     model_output_fields=['label', 'classProbabilities'],
...                     accumulate=['id', 'sepal_length', 'petal_length'],
...                     overwrite_cached_models='*',
...                     enable_options='stageProbabilities',
...                     model_type='OpenSource')
Example 2: Run a query with an XGBoost model and overwrite cached models
The query also passes the model_type, enable_options, and model_output_fields arguments.
- Locate the model file bundled with teradataml.
>>> model_file = os.path.join(os.path.dirname(teradataml.__file__), "data", "models", "iris_mojo_xgb_h2o_model")
- Save the model.
>>> save_byom("iris_mojo_xgb_h2o_model", model_file, "byom_models")
- Retrieve the model.
>>> modeldata = retrieve_byom("iris_mojo_xgb_h2o_model", table_name="byom_models")
- Pass the output of the retrieve_byom API as input to the H2OPredict function to score the data.
>>> result = H2OPredict(newdata=iris_test,
...                     newdata_partition_column='id',
...                     newdata_order_column='id',
...                     modeldata=modeldata,
...                     modeldata_order_column='model_id',
...                     model_output_fields=['label', 'classProbabilities'],
...                     accumulate=['id', 'sepal_length', 'petal_length'],
...                     overwrite_cached_models='*',
...                     enable_options='stageProbabilities',
...                     model_type='OpenSource')
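Both examples build the model file path the same way before saving the model. A minimal sketch of how that step could be factored out and checked up front is shown below. The helper bundled_model_path is hypothetical and not part of teradataml; it uses only the standard library and assumes the data/models layout seen in the examples. The demonstration runs against a temporary directory with an empty placeholder file rather than a real teradataml installation.

```python
import os
import tempfile


def bundled_model_path(package_dir, model_name):
    """Build <package_dir>/data/models/<model_name> and confirm the file exists.

    Hypothetical helper for illustration; not a teradataml API.
    """
    path = os.path.join(package_dir, "data", "models", model_name)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Model file not found: {path}")
    return path


# Demonstration against a temporary directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as fake_pkg:
    models_dir = os.path.join(fake_pkg, "data", "models")
    os.makedirs(models_dir)
    # Empty placeholder file standing in for a model file.
    open(os.path.join(models_dir, "demo_model"), "wb").close()

    found = os.path.basename(bundled_model_path(fake_pkg, "demo_model"))

    # A misspelled or missing model name fails here, before any database call.
    try:
        bundled_model_path(fake_pkg, "missing_model")
        missing_raises = False
    except FileNotFoundError:
        missing_raises = True

print(found, missing_raises)
```

In the examples above, the equivalent call would pass os.path.dirname(teradataml.__file__) as package_dir and a model name such as "iris_mojo_glm_h2o_model", with the returned path then given to save_byom.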