save_model() | Teradata Package for Python 17.00

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Release Date: November 2021
Content Type: User Guide
Publication ID: B700-4006-070K
Language: English (United States)

The save_model() API saves model-related information to the Model Catalog and persists the model's output DataFrames to tables on Vantage. Without it, those tables would be purged at the end of the session unless the user called copy_to_sql() or the DataFrame method to_sql().

The user can therefore connect at any later time, from a fresh teradataml session or even a SQL session, and use the saved models in workflows.

For example, a user can create a GLM model using the ML Engine GLM function (teradataml.analytics.mle.GLM.GLM), and then save this model by calling the save_model() function.

The save_model() function persists not only the model's result DataFrames but also all the arguments passed to the model-generating function, along with the following details, so that a user can understand how the model was generated:
  • The target column from the training data input to the model, when available.
    The target column is currently saved only for functions that take a formula argument; where that argument is optional, it is saved only when a formula is supplied.
  • The prediction type for the function: CLASSIFICATION, REGRESSION, CLUSTERING, or OTHER.
  • The engine used to generate the model: ML Engine, or Advanced SQL Engine.
  • The client language used to generate the model: teradataml.
  • The name of the algorithm associated with the function.
  • The time, in seconds, required to run the model-generating function.
  • The name of the user who created the model.
  • The status and access level of the model (see the publish_model() section).
  • The location where the model is stored: Advanced SQL Engine.
  • The date and time when the model is saved.
  • Where possible, the names of the tables underlying the DataFrames passed as input to the model-generating function, along with their dimensions.
  • The user-provided model_project, entity_target, and performance_metrics.

Once the model is saved, all of this information is available in the output of describe_model().

A model can be saved only by its creator.

The required arguments are:
  • model specifies the teradataml analytic function model to be saved;
  • name specifies the unique name to identify the saved model;
  • description specifies a note describing the model to be saved.

The optional argument model_project specifies the project that the model is associated with, and the optional argument entity_target specifies the group or team that the model is associated with.

Another optional argument, performance_metrics, specifies the performance metrics for the model as a dictionary of the form { "<metric>" : { "measure" : <value> }, ... }. For example: { "AUC" : { "measure" : 0.5 }, ... }.
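The nested dictionary shape described above can be assembled from plain metric values. The following is a minimal sketch and not part of teradataml: the helper name to_performance_metrics and the metric names and values are hypothetical, chosen only to illustrate the nesting.

```python
def to_performance_metrics(scores):
    """Wrap plain metric values in the {"<metric>": {"measure": <value>}} form.

    `scores` maps a metric name to its value. This is a hypothetical
    convenience helper, not a teradataml function.
    """
    metrics = {}
    for metric, value in scores.items():
        if not isinstance(metric, str) or not metric:
            raise ValueError("metric names must be non-empty strings")
        metrics[metric] = {"measure": value}
    return metrics


# Illustrative values only; they are not results from a real model.
performance_metrics = to_performance_metrics({"AUC": 0.5, "accuracy": 0.75})
print(performance_metrics)
```

The resulting dictionary could then be passed as the performance_metrics argument of save_model().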

Example

  • Load the data.
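    >>> # Load the data to run the example. This assumes an active Vantage
    >>> # connection (for example, one created with create_context()) and that
    >>> # the teradataml names used below have already been imported.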
    >>> load_example_data("decisionforest", ["housing_train"])
  • Create a teradataml DataFrame.
    >>> # Create a teradataml DataFrame object.
    >>> housing_train = DataFrame.from_table("housing_train")
  • Create a classification tree that can be used as input to DecisionForestPredict.
    >>> # This example uses home sales data to create a
    >>> # classification tree that predicts home style, which can be used
    >>> # as input to DecisionForestPredict.
     
    >>> formula = "homestyle ~ driveway + recroom + fullbase + gashw + airco + prefarea + price + lotsize + bedrooms + bathrms + stories + garagepl"
    >>> rft_model = DecisionForest(data=housing_train,
                                   formula=formula,
                                   tree_type="classification",
                                   ntree=50,
                                   tree_size=100,
                                   nodesize=1,
                                   variance=0.0,
                                   max_depth=12,
                                   maxnum_categorical=20,
                                   mtry=3,
                                   mtry_seed=100,
                                   seed=100
                                   )
  • Save the generated model.
    >>> # Let's save this generated model.
    >>> save_model(model=rft_model, name="decision_forest_model", description="Decision Forest test")
    Persisting model information.
    Persisted table: "ALICE"."ml__td_decisionforest0_1589787736719763"
    Persisted table: "ALICE"."ml__td_decisionforest1_1589794611679956"
    Persisted table: "ALICE"."ml__td_sqlmr_out__1589787498246673"
    Successfully persisted model.

    The final output line confirms that the model was saved successfully.
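To attach the optional arguments to the same call, the keyword arguments can be collected first and passed in one step. The sketch below only builds the argument dictionary, so it runs without a Vantage connection. The helper save_model_kwargs, the stand-in model object, and the project, team, and metric values are all hypothetical; only the argument names come from the description above.

```python
def save_model_kwargs(model, name, description, model_project=None,
                      entity_target=None, performance_metrics=None):
    """Collect save_model() keyword arguments, omitting unset optional ones.

    Hypothetical helper: it uses only the argument names documented above
    and does not call teradataml.
    """
    if not name or not description:
        raise ValueError("name and description are required")
    kwargs = {"model": model, "name": name, "description": description}
    optional = {
        "model_project": model_project,
        "entity_target": entity_target,
        "performance_metrics": performance_metrics,
    }
    # Keep only the optional arguments that were actually supplied.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


# Stand-in for the rft_model created in the example; in a real session this
# would be the DecisionForest result object.
rft_model = object()

kwargs = save_model_kwargs(
    model=rft_model,
    name="decision_forest_model",
    description="Decision Forest test",
    model_project="housing",                        # hypothetical project name
    entity_target="analytics_team",                 # hypothetical team name
    performance_metrics={"AUC": {"measure": 0.5}},  # illustrative value
)
print(sorted(kwargs))

# In a connected teradataml session, the call would then be:
# save_model(**kwargs)
```

Because the saved model's name must be unique, a real call would need a name that is not already in use.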