Use the deploy() method to deploy, in the database, a model generated by execute_script().
Required Arguments:
- model_column: Specifies the name of the column that contains the model.
Supported types for this column are CLOB and BLOB.
The column specified in this argument must be present in <script_obj>.result.
Optional Arguments:
- partition_columns: Specifies the columns on which the data is partitioned.
The columns specified in this argument must be present in <script_obj>.result.
- model_file_prefix: Specifies the prefix to use for the generated model file names.
If this argument is None, the prefix is auto-generated.
When the column specified in model_column contains multiple models:
  - If partition_columns is None, the model file prefix is appended with an underscore (_) and a number, starting from one (1), to form each model file name.
  - If partition_columns is not None, the model file prefix is appended with an underscore (_) and the unique values in partition_columns to form each model file name.
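The naming rule for model files can be sketched in plain Python. This is only an illustration: build_model_file_names is a hypothetical helper, not a teradataml function, and joining several partition values with underscores is an assumption inferred from the sample output in the examples on this page.

```python
def build_model_file_names(prefix, partition_values=None, n_models=0):
    """Hypothetical helper (not part of teradataml) that mimics the
    documented naming rule for deployed model files."""
    if partition_values is None:
        # No partition columns: append "_" and a running number starting at 1.
        return [f"{prefix}_{i}" for i in range(1, n_models + 1)]
    # Partition columns given: append "_" and the partition values.
    # Assumption: multiple values are joined with "_" as in the sample output.
    return [
        f"{prefix}_" + "_".join(str(value) for value in row)
        for row in partition_values
    ]


numbered = build_model_file_names("my_prefix_new_", n_models=4)
partitioned = build_model_file_names(
    "my_prefix_new_", partition_values=[(0, 10), (0, 11), (1, 10), (1, 11)]
)
print(numbered)
print(partitioned)
```

Because the separator underscore is always added, a prefix that already ends with an underscore, such as "my_prefix_new_", produces names with a double underscore.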
Examples Setup
- Import the modules and functions used in the examples.
>>> import os
>>> from collections import OrderedDict
>>> import teradataml
>>> from teradataml import DataFrame, Script, install_file, load_example_data
>>> from teradatasqlalchemy.types import INTEGER, CLOB
- Load example data.
>>> load_example_data("openml", "multi_model_classification")
- Create teradataml DataFrame from the data.
>>> df = DataFrame("multi_model_classification")
- Install Script file.
>>> file_location = os.path.join(os.path.dirname(teradataml.__file__), "data", "scripts", "deploy_script.py")
>>> install_file("deploy_script", file_location, replace=True)
- Create variables needed for Script execution.
>>> script_command = '/opt/teradata/languages/Python/bin/python3 ./ALICE/deploy_script.py'
>>> partition_columns = ["partition_column_1", "partition_column_2"]
>>> columns = ["col1", "col2", "col3", "col4", "label", "partition_column_1", "partition_column_2"]
>>> returns = OrderedDict([("partition_column_1", INTEGER()), ("partition_column_2", INTEGER()), ("model", CLOB())])
- Run the Script.
>>> obj = Script(data=df.select(columns), script_command=script_command, data_partition_column=partition_columns, returns=returns)
>>> opt = obj.execute_script()
>>> opt
partition_column_1  partition_column_2  model
                 0                  10  b'gAejc1.....drIr'
                 0                  11  b'gANjcw.....qWIu'
                 1                  10  b'abdwcd.....dWIz'
                 1                  11  b'gA4jc4.....agfu'
Example 1: Provide only partition_columns argument
In this example, only partition_columns is provided, so the model file prefix is auto-generated.
>>> obj.deploy(model_column="model", partition_columns=["partition_column_1", "partition_column_2"])
['model_file_1710436227163427__0_10', 'model_file_1710436227163427__1_10', 'model_file_1710436227163427__0_11', 'model_file_1710436227163427__1_11']
Example 2: Provide only model_file_prefix argument
In this example, only model_file_prefix is provided, so the file names are suffixed with 1, 2, 3, ... for multiple models.
>>> obj.deploy(model_column="model", model_file_prefix="my_prefix_new_")
['my_prefix_new__1', 'my_prefix_new__2', 'my_prefix_new__3', 'my_prefix_new__4']
Example 3: Neither partition_columns nor model_file_prefix argument is provided
>>> obj.deploy(model_column="model")
['model_file_1710438346528596__1', 'model_file_1710438346528596__2', 'model_file_1710438346528596__3', 'model_file_1710438346528596__4']
Example 4: Provide both partition_columns and model_file_prefix arguments
>>> obj.deploy(model_column="model", model_file_prefix="my_prefix_new_", partition_columns=["partition_column_1", "partition_column_2"])
['my_prefix_new__0_10', 'my_prefix_new__0_11', 'my_prefix_new__1_10', 'my_prefix_new__1_11']
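When partition_columns is used, the partition values can be read back from a returned file name. The sketch below assumes the name has the form prefix, underscore, then values joined by underscores, as in the sample output above. partition_values_from_name is a hypothetical helper, not a teradataml function. Example 1 lists its names in a different order than Example 4, so matching on the name may be safer than relying on list position.

```python
def partition_values_from_name(file_name, prefix):
    """Hypothetical helper (not part of teradataml): recover the partition
    values encoded in a model file name of the form <prefix>_<v1>_<v2>..."""
    expected_start = prefix + "_"
    if not file_name.startswith(expected_start):
        raise ValueError(f"{file_name!r} does not start with {expected_start!r}")
    # Values come back as strings; the caller converts them as needed.
    # Assumption: the partition values themselves contain no underscores.
    return file_name[len(expected_start):].split("_")


values = partition_values_from_name("my_prefix_new__0_10", "my_prefix_new_")
print(values)
```

If a partition value can itself contain an underscore, the split is ambiguous and this approach does not apply.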