Example 5.3: Metrics Based Early Stopping for Model Trainer Function - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2024
Language: English (United States)
Last Update: 2024-04-09
Product Category: Teradata Vantage

This example shows metrics-based early stopping for the model trainer function XGBoost.

  1. Define the hyperparameter space and RandomSearch with XGBoost.
    1. Define model training parameters.
      >>> model_params = {"input_columns":['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
      ...                 "response_column" :'species',
      ...                 "max_depth":(5,10,15),
      ...                 "lambda1" :(1000.0,0.001),
      ...                 "model_type" :"Classification",
      ...                 "seed":32,
      ...                 "shrinkage_factor":0.1,
      ...                 "iter_num":(5, 50)}
    2. Define evaluation parameters.
      >>> eval_params = {"id_column": "id",
      ...                "accumulate": "species",
      ...                "model_type":'Classification',
      ...                "object_order_column":['task_index', 'tree_num', 'iter','class_num', 'tree_order']
      ...                }
    3. Import model trainer function and optimizer.
      >>> from teradataml import XGBoost, RandomSearch
    4. Initialize the RandomSearch optimizer with the model trainer function and the parameter space required for model training.
      >>> rs_obj = RandomSearch(func=XGBoost, params=model_params, n_iter=10)
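
To make the size of the search space concrete, the following is a minimal sketch in plain Python, not the teradataml implementation. It assumes that tuple-valued entries in model_params are the hyperparameters being tuned and that all other entries are fixed arguments. The helpers expand_grid and sample_combinations are hypothetical names introduced only for this illustration, and sampling without replacement is also an assumption.

```python
import itertools
import random

# Parameter space copied from model_params above. Tuple values are treated as
# candidate values to search over (assumption); everything else is fixed.
model_params = {
    "input_columns": ["sepal_length", "sepal_width", "petal_length", "petal_width"],
    "response_column": "species",
    "max_depth": (5, 10, 15),
    "lambda1": (1000.0, 0.001),
    "model_type": "Classification",
    "seed": 32,
    "shrinkage_factor": 0.1,
    "iter_num": (5, 50),
}

def expand_grid(params):
    """Return every combination of the tuple-valued entries (hypothetical helper)."""
    tuned = {k: v for k, v in params.items() if isinstance(v, tuple)}
    fixed = {k: v for k, v in params.items() if not isinstance(v, tuple)}
    names = list(tuned)
    combos = []
    for values in itertools.product(*(tuned[name] for name in names)):
        combo = dict(fixed)
        combo.update(zip(names, values))
        combos.append(combo)
    return combos

def sample_combinations(params, n_iter, seed=0):
    """Pick n_iter distinct combinations at random (hypothetical helper)."""
    grid = expand_grid(params)
    rng = random.Random(seed)
    return rng.sample(grid, min(n_iter, len(grid)))

grid = expand_grid(model_params)
sampled = sample_combinations(model_params, n_iter=10)
print(len(grid), len(sampled))  # 12 10
```

Under these assumptions the space holds 3 x 2 x 2 = 12 combinations, and n_iter=10 selects 10 of them, which matches the ten model IDs (XGBOOST_0 to XGBOOST_9) that appear later in this example.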
  2. Run hyperparameter tuning with early stop.
    This step runs fit with early_stop and evaluation_metric.
    early_stop can also be used without specifying evaluation_metric, in which case the default metric applies:
    • In classification, the evaluation_metric is set to Accuracy.
    • In regression, the evaluation_metric is set to MAE.
    >>> rs_obj.fit(data=data, evaluation_metric = 'WEIGHTED-F1', early_stop=0.87, verbose=2, **eval_params)
    Model_id:XGBOOST_0 - Run time:28.722s - Status:PASS - WEIGHTED-F1:0.901            
    Model_id:XGBOOST_3 - Run time:28.783s - Status:PASS - WEIGHTED-F1:0.935             
    Model_id:XGBOOST_1 - Run time:28.804s - Status:PASS - WEIGHTED-F1:0.901             
    Model_id:XGBOOST_2 - Run time:28.824s - Status:PASS - WEIGHTED-F1:0.901             
    Computing: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾| 40% - 4/10
    early_stop works with other evaluation metrics as well, not only WEIGHTED-F1; the remaining models are skipped once a trained model reaches the threshold for the chosen metric.
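
The stopping behavior seen in the log above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's scheduling logic: it assumes models are trained in batches and that the threshold is checked after each batch. The batch size of 4 is inferred only from the "4/10" progress line, and run_with_early_stop and train_and_score are hypothetical names. The scores are the WEIGHTED-F1 values printed in the log.

```python
# WEIGHTED-F1 values reported in the fit log above.
observed = {
    "XGBOOST_0": 0.901,
    "XGBOOST_1": 0.901,
    "XGBOOST_2": 0.901,
    "XGBOOST_3": 0.935,
}

def train_and_score(model_id):
    # Stand-in for real training: look up the logged score. A KeyError here
    # would mean a model was trained that the log shows as never trained.
    return observed[model_id]

def run_with_early_stop(model_ids, train_and_score, early_stop, batch_size):
    """Train models batch by batch and skip the rest once early_stop is reached.

    Assumes a higher metric value is better (true for WEIGHTED-F1, not for MAE).
    batch_size is a hypothetical stand-in for the number of models trained together.
    """
    results = {}
    threshold_reached = False
    for start in range(0, len(model_ids), batch_size):
        batch = model_ids[start:start + batch_size]
        if threshold_reached:
            # A previous batch already met the threshold: do not train these.
            for model_id in batch:
                results[model_id] = ("SKIP", None)
            continue
        for model_id in batch:
            results[model_id] = ("PASS", train_and_score(model_id))
        threshold_reached = any(results[m][1] >= early_stop for m in batch)
    return results

model_ids = [f"XGBOOST_{i}" for i in range(10)]
results = run_with_early_stop(model_ids, train_and_score, early_stop=0.87, batch_size=4)
passed = sorted(m for m, (status, _) in results.items() if status == "PASS")
skipped = sorted(m for m, (status, _) in results.items() if status == "SKIP")
print(passed)        # ['XGBOOST_0', 'XGBOOST_1', 'XGBOOST_2', 'XGBOOST_3']
print(len(skipped))  # 6
```

Under the batching assumption, this would explain why four models finish even though the first one already exceeds the 0.87 threshold: models in the same batch complete together, and only later batches are skipped.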
  3. View the metadata of the models trained during hyperparameter tuning using the models and model_stats properties.
    1. View the trained models using the models property.
      >>> rs_obj.models
         MODEL_ID   DATA_ID  PARAMETERS                                         STATUS  WEIGHTED-F1
      0  XGBOOST_0  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  PASS    0.901483
      1  XGBOOST_4  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  SKIP    NaN
      2  XGBOOST_5  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  SKIP    NaN
      3  XGBOOST_6  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  SKIP    NaN
      4  XGBOOST_7  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  SKIP    NaN
      5  XGBOOST_8  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  SKIP    NaN
      6  XGBOOST_9  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  SKIP    NaN
      7  XGBOOST_3  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  PASS    0.935065
      8  XGBOOST_1  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  PASS    0.901483
      9  XGBOOST_2  DF_0     {'input_columns': ['sepal_length', 'sepal_widt...  PASS    0.901483
      The status "SKIP" indicates that the model was not trained because a previously trained model had already reached the desired threshold, or a better value, for the evaluation metric.
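
As a small illustration of how this output can be read, the sketch below picks the best trained model from the table. The rows are copied by hand into plain dictionaries so that the snippet runs without a Vantage connection; it does not call rs_obj, and the variable names are chosen only for this example.

```python
# Rows copied from the rs_obj.models output above (PARAMETERS column omitted).
models = [
    {"MODEL_ID": "XGBOOST_0", "STATUS": "PASS", "WEIGHTED-F1": 0.901483},
    {"MODEL_ID": "XGBOOST_4", "STATUS": "SKIP", "WEIGHTED-F1": None},
    {"MODEL_ID": "XGBOOST_5", "STATUS": "SKIP", "WEIGHTED-F1": None},
    {"MODEL_ID": "XGBOOST_6", "STATUS": "SKIP", "WEIGHTED-F1": None},
    {"MODEL_ID": "XGBOOST_7", "STATUS": "SKIP", "WEIGHTED-F1": None},
    {"MODEL_ID": "XGBOOST_8", "STATUS": "SKIP", "WEIGHTED-F1": None},
    {"MODEL_ID": "XGBOOST_9", "STATUS": "SKIP", "WEIGHTED-F1": None},
    {"MODEL_ID": "XGBOOST_3", "STATUS": "PASS", "WEIGHTED-F1": 0.935065},
    {"MODEL_ID": "XGBOOST_1", "STATUS": "PASS", "WEIGHTED-F1": 0.901483},
    {"MODEL_ID": "XGBOOST_2", "STATUS": "PASS", "WEIGHTED-F1": 0.901483},
]

# Skipped models have no metric value, so filter them out before comparing.
trained = [m for m in models if m["STATUS"] == "PASS"]
best = max(trained, key=lambda m: m["WEIGHTED-F1"])

print(len(trained))      # 4
print(best["MODEL_ID"])  # XGBOOST_3
```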
    2. View additional performance metrics using the model_stats property.
      >>> rs_obj.model_stats
         MODEL_ID   ACCURACY  MICRO-PRECISION  MICRO-RECALL  MICRO-F1  MACRO-PRECISION  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
      0  XGBOOST_0  0.900000  0.900000         0.900000      0.900000  0.879121         0.888889      0.879441  0.912088            0.900000         0.901483
      1  XGBOOST_4  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      2  XGBOOST_5  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      3  XGBOOST_6  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      4  XGBOOST_7  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      5  XGBOOST_8  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      6  XGBOOST_9  NaN       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN
      7  XGBOOST_3  0.933333  0.933333         0.933333      0.933333  0.916667         0.944444      0.922078  0.950000            0.933333         0.935065
      8  XGBOOST_1  0.900000  0.900000         0.900000      0.900000  0.879121         0.888889      0.879441  0.912088            0.900000         0.901483
      9  XGBOOST_2  0.900000  0.900000         0.900000      0.900000  0.879121         0.888889      0.879441  0.912088            0.900000         0.901483
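
The MACRO and WEIGHTED columns differ in how per-class scores are averaged. The sketch below illustrates this with the standard definitions of macro and weighted F1; this section does not state how teradataml computes these columns, so treat the correspondence as an assumption. The labels are hypothetical toy data, not the iris predictions, and f1_summary is a name introduced only for this example.

```python
from collections import Counter

def f1_summary(y_true, y_pred):
    """Return (macro_f1, weighted_f1) using the standard definitions."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    per_class_f1 = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        per_class_f1[label] = 2 * precision * recall / denom if denom else 0.0
    # Macro: plain mean over classes. Weighted: mean weighted by class support.
    macro = sum(per_class_f1.values()) / len(labels)
    weighted = sum(per_class_f1[l] * support[l] for l in labels) / len(y_true)
    return macro, weighted

# Hypothetical toy labels with unequal class sizes.
y_true = ["a", "a", "a", "a", "b", "b"]
y_pred = ["a", "a", "a", "b", "b", "b"]
macro_f1, weighted_f1 = f1_summary(y_true, y_pred)
print(round(macro_f1, 3), round(weighted_f1, 3))  # 0.829 0.838
```

Because the weighted average gives larger classes more influence, the two values diverge whenever class sizes or per-class scores are uneven, which is consistent with MACRO-F1 and WEIGHTED-F1 differing in the table above.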