Example 5.1: Time Based Early Stopping for Model Trainer Function

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Product Category: Teradata Vantage
Release Number: 20.00
Published: December 2024
Last edition: 2025-01-23

This example shows time-based early stopping for the model trainer function XGBoost.

  1. Define hyperparameter space and GridSearch with XGBoost.
    1. Define model training parameters.
      >>> model_params = {"input_columns": ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
      ...                 "response_column": 'species',
      ...                 "max_depth": (5, 10, 15),
      ...                 "lambda1": (1000.0, 0.001),
      ...                 "model_type": "Classification",
      ...                 "seed": 32,
      ...                 "shrinkage_factor": 0.1,
      ...                 "iter_num": (5, 50)}
    2. Define evaluation parameters.
      >>> eval_params = {"id_column": "id",
      ...                "accumulate": "species",
      ...                "model_type": 'Classification',
      ...                "object_order_column": ['task_index', 'tree_num', 'iter', 'class_num', 'tree_order']
      ...               }
    3. Import model trainer function and optimizer.
      >>> from teradataml import XGBoost, GridSearch
    4. Initialize the GridSearch optimizer with model trainer function and parameter space required for model training.
      >>> gs_obj = GridSearch(func=XGBoost, params=model_params)
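To see how large the search space defined in step 1 is, the following plain-Python sketch enumerates the parameter combinations. It assumes that tuple-valued entries in `model_params` are the values to search over and that all other entries are held fixed; the names `search_space`, `fixed`, and `combinations` are illustrative and not part of teradataml.

```python
from itertools import product

model_params = {
    "input_columns": ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
    "response_column": 'species',
    "max_depth": (5, 10, 15),
    "lambda1": (1000.0, 0.001),
    "model_type": "Classification",
    "seed": 32,
    "shrinkage_factor": 0.1,
    "iter_num": (5, 50),
}

# Assumption: tuples list candidate values; anything else is a fixed argument.
search_space = {k: v for k, v in model_params.items() if isinstance(v, tuple)}
fixed = {k: v for k, v in model_params.items() if not isinstance(v, tuple)}

# Cartesian product of the candidate values, merged with the fixed arguments.
combinations = [
    {**fixed, **dict(zip(search_space, values))}
    for values in product(*search_space.values())
]

print(len(combinations))  # 3 * 2 * 2 = 12
```

The count of 12 matches the `4/12` shown in the progress bar of the tuning step.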
  2. Execute hyperparameter tuning with max time.
    This step fits the hyperparameters in parallel with the max_time argument set to 30 seconds.
    >>> gs_obj.fit(data=data, max_time=30, verbose=2, **eval_params)
    Model_id:XGBOOST_2 - Run time:33.277s - Status:PASS - ACCURACY:0.933               
    Model_id:XGBOOST_3 - Run time:33.276s - Status:PASS - ACCURACY:0.933               
    Model_id:XGBOOST_0 - Run time:33.279s - Status:PASS - ACCURACY:0.967                
    Model_id:XGBOOST_1 - Run time:33.278s - Status:PASS - ACCURACY:0.933                
    Computing: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾| 33% - 4/12

    As shown in the output, four of the 12 models are trained, because the maximum number of models trained in parallel is set to 4.

    The remaining models are skipped because the maximum time allowed for hyperparameter tuning has been reached.
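The behavior described above can be pictured with a small, library-independent sketch. It is not teradataml's implementation: `run_with_time_budget`, `train_fn`, and the injectable `clock` are hypothetical names. The sketch assumes the time budget is checked only before a new batch of parallel models is started, which would be consistent with the four models above finishing at about 33 seconds against a 30 second limit.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_time_budget(configs, train_fn, max_time, parallel=4,
                         clock=time.monotonic):
    """Train configs in batches of `parallel`; skip the rest once max_time is spent."""
    statuses = {}
    start = clock()
    for i in range(0, len(configs), parallel):
        if clock() - start >= max_time:
            # Budget exhausted: everything not yet started is marked as skipped.
            for model_id, _ in configs[i:]:
                statuses[model_id] = "SKIP"
            break
        batch = configs[i:i + parallel]
        with ThreadPoolExecutor(max_workers=parallel) as pool:
            # A batch that has started always runs to completion.
            list(pool.map(lambda cfg: train_fn(cfg[1]), batch))
        for model_id, _ in batch:
            statuses[model_id] = "PASS"
    return statuses


# Demo with a fake clock so the example runs instantly:
# reading 1 = start, reading 2 = before batch 1, reading 3 = before batch 2.
configs = [(f"MODEL_{i}", {"index": i}) for i in range(12)]
ticks = iter([0.0, 0.0, 33.0])
statuses = run_with_time_budget(
    configs, train_fn=lambda params: None, max_time=30,
    clock=lambda: next(ticks),
)
print(statuses)
```

In this demo the first batch of four is marked PASS and the remaining eight are marked SKIP, mirroring the shape of the result in this example.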

  3. View hyperparameter tuning model metadata using the models and model_stats properties.
    1. View trained models using the models property.
      >>> gs_obj.models
            MODEL_ID     DATA_ID                                       PARAMETERS       STATUS    ACCURACY
      0     XGBOOST_2     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     PASS     0.933333
      1     XGBOOST_4     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      2     XGBOOST_5     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      3     XGBOOST_6     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      4     XGBOOST_7     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      5     XGBOOST_8     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      6     XGBOOST_9     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      7     XGBOOST_10    DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      8     XGBOOST_11    DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     SKIP     NaN
      9     XGBOOST_3     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     PASS     0.933333
      10    XGBOOST_0     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     PASS     0.966667
      11    XGBOOST_1     DF_0     {'input_columns': ['sepal_length', 'sepal_widt...     PASS     0.933333
      The status "SKIP" indicates that the model was not trained because the maximum time limit was reached.
    2. View additional performance metrics using the model_stats property.
      >>> gs_obj.model_stats
          MODEL_ID    ACCURACY  MICRO-PRECISION  MICRO-RECALL  MICRO-F1  MACRO-PRECISION  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
      0   XGBOOST_3   1.000     1.000            1.000         1.000     1.000            1.000         1.000     1.000               1.000            1.000
      1   XGBOOST_4   NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      2   XGBOOST_5   NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      3   XGBOOST_6   NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      4   XGBOOST_7   NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      5   XGBOOST_8   NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      6   XGBOOST_9   NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      7   XGBOOST_10  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      8   XGBOOST_11  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      9   XGBOOST_2   1.000     1.000            1.000         1.000     1.000            1.000         1.000     1.000               1.000            1.000
      10  XGBOOST_1   0.967     0.967            0.967         0.967     0.972            0.933         0.948     0.969               0.967            0.966
      11  XGBOOST_0   0.967     0.967            0.967         0.967     0.972            0.933         0.948     0.969               0.967            0.966
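As a quick way to summarize the metadata from step 3, the sketch below tallies the statuses and picks the highest-accuracy trained model. It works on rows copied by hand from the `gs_obj.models` output in this example rather than on the live object, so the variable names (`rows`, `status_counts`, `trained`) are illustrative only and no teradataml API is assumed.

```python
from collections import Counter

# (MODEL_ID, STATUS, ACCURACY) copied from the gs_obj.models output in step 3.
rows = [
    ("XGBOOST_2", "PASS", 0.933333),
    ("XGBOOST_4", "SKIP", None),
    ("XGBOOST_5", "SKIP", None),
    ("XGBOOST_6", "SKIP", None),
    ("XGBOOST_7", "SKIP", None),
    ("XGBOOST_8", "SKIP", None),
    ("XGBOOST_9", "SKIP", None),
    ("XGBOOST_10", "SKIP", None),
    ("XGBOOST_11", "SKIP", None),
    ("XGBOOST_3", "PASS", 0.933333),
    ("XGBOOST_0", "PASS", 0.966667),
    ("XGBOOST_1", "PASS", 0.933333),
]

# How many models were trained versus skipped by the time limit.
status_counts = Counter(status for _, status, _ in rows)

# Only trained models carry an accuracy value; skipped ones are excluded.
trained = [(model_id, acc) for model_id, status, acc in rows if status == "PASS"]
best_id, best_accuracy = max(trained, key=lambda item: item[1])

print(status_counts)
print(best_id, best_accuracy)
```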