This example shows metrics-based early stopping for the XGBoost model trainer function.
- Define the hyperparameter space and GridSearch with XGBoost.
- Define model training parameters.
>>> model_params = {"input_columns": ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
...                 "response_column": 'species',
...                 "max_depth": (5, 10, 15),
...                 "lambda1": (1000.0, 0.001),
...                 "model_type": "Classification",
...                 "seed": 32,
...                 "shrinkage_factor": 0.1,
...                 "iter_num": (5, 50)}
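Tuple-valued entries in model_params span the search grid, while scalar entries stay fixed for every candidate. A pure-Python sketch, independent of teradataml, shows how the three tuple axes above expand into 12 candidate models:

```python
from itertools import product

# The tuple-valued hyperparameters above: 3 x 2 x 2 combinations.
grid_axes = {
    "max_depth": (5, 10, 15),
    "lambda1": (1000.0, 0.001),
    "iter_num": (5, 50),
}

# Cartesian product of the axes -> one candidate model per combination.
candidates = [dict(zip(grid_axes, combo)) for combo in product(*grid_axes.values())]
print(len(candidates))  # 12
```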
- Define evaluation parameters.
>>> eval_params = {"id_column": "id",
...                "accumulate": "species",
...                "model_type": 'Classification',
...                "object_order_column": ['task_index', 'tree_num', 'iter', 'class_num', 'tree_order']}
- Import model trainer function and optimizer.
>>> from teradataml import XGBoost, GridSearch
- Initialize the GridSearch optimizer with model trainer function and parameter space required for model training.
>>> gs_obj = GridSearch(func=XGBoost, params=model_params)
- Execute hyperparameter tuning with early stopping. This step runs fit() with early_stop and an evaluation metric.
>>> gs_obj.fit(data=data, evaluation_metric = 'Micro-f1', early_stop=0.9, verbose=2, **eval_params)
Model_id:XGBOOST_3 - Run time:32.816s - Status:PASS - MICRO-F1:1.0
Model_id:XGBOOST_2 - Run time:32.816s - Status:PASS - MICRO-F1:1.0
Model_id:XGBOOST_1 - Run time:32.816s - Status:PASS - MICRO-F1:0.967
Model_id:XGBOOST_0 - Run time:32.818s - Status:PASS - MICRO-F1:0.967
Computing: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾| 33% - 4/12
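The behavior behind this output can be sketched in pure Python: candidates are trained in order, and once any model reaches the early_stop threshold, the remaining candidates are marked SKIP instead of being trained. This is a simplified sequential sketch with illustrative scores, not the actual teradataml scheduler (which may train several models per batch):

```python
def run_with_early_stop(candidates, train_and_score, threshold):
    """Train candidates in order; mark the rest SKIP once any
    model's evaluation metric reaches the early-stop threshold."""
    results = []
    stopped = False
    for cand in candidates:
        if stopped:
            results.append((cand, "SKIP", None))
            continue
        score = train_and_score(cand)
        results.append((cand, "PASS", score))
        if score >= threshold:
            stopped = True
    return results

# Toy scorer standing in for XGBoost training (made-up scores).
scores = iter([0.70, 0.85, 0.95])
results = run_with_early_stop(range(5), lambda _: next(scores), threshold=0.9)
print([status for _, status, _ in results])  # ['PASS', 'PASS', 'PASS', 'SKIP', 'SKIP']
```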
Various evaluation metrics can be used with early_stop to take advantage of the early stopping functionality.
- View hyperparameter tuning model metadata using the models and model_stats properties.
- View the trained models using the models property.
>>> gs_obj.models
      MODEL_ID DATA_ID                                         PARAMETERS STATUS  MICRO-F1
0    XGBOOST_3    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   PASS  1.000000
1    XGBOOST_4    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
2    XGBOOST_5    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
3    XGBOOST_6    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
4    XGBOOST_7    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
5    XGBOOST_8    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
6    XGBOOST_9    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
7   XGBOOST_10    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
8   XGBOOST_11    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   SKIP       NaN
9    XGBOOST_2    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   PASS  1.000000
10   XGBOOST_1    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   PASS  0.966667
11   XGBOOST_0    DF_0  {'input_columns': ['sepal_length', 'sepal_widt...   PASS  0.966667
The status "SKIP" indicates that the model was not trained, because a previously trained model had already reached the desired threshold (or a better value) of the evaluation metric.
- View additional performance metrics using the model_stats property.
>>> gs_obj.model_stats
      MODEL_ID  ACCURACY  MICRO-PRECISION  MICRO-RECALL  MICRO-F1  MACRO-PRECISION  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
0    XGBOOST_3     1.000            1.000         1.000     1.000            1.000         1.000     1.000               1.000            1.000        1.000
1    XGBOOST_4       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
2    XGBOOST_5       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
3    XGBOOST_6       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
4    XGBOOST_7       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
5    XGBOOST_8       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
6    XGBOOST_9       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
7   XGBOOST_10       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
8   XGBOOST_11       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
9    XGBOOST_2     1.000            1.000         1.000     1.000            1.000         1.000     1.000               1.000            1.000        1.000
10   XGBOOST_1     0.967            0.967         0.967     0.967            0.972         0.933     0.948               0.969            0.967        0.966
11   XGBOOST_0     0.967            0.967         0.967     0.967            0.972         0.933     0.948               0.969            0.967        0.966
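After tuning, the winning candidate is typically the PASS model with the highest evaluation metric; skipped models carry no metrics, so they must be excluded before ranking. This is a minimal stand-in sketch over plain-dict rows shaped like the models output above (your teradataml version may also expose the winner directly via properties such as best_model_id, which is an assumption to verify against its documentation):

```python
# Rows shaped like the models property output above (abbreviated).
rows = [
    {"MODEL_ID": "XGBOOST_3", "STATUS": "PASS", "MICRO-F1": 1.0},
    {"MODEL_ID": "XGBOOST_4", "STATUS": "SKIP", "MICRO-F1": None},
    {"MODEL_ID": "XGBOOST_2", "STATUS": "PASS", "MICRO-F1": 1.0},
    {"MODEL_ID": "XGBOOST_1", "STATUS": "PASS", "MICRO-F1": 0.966667},
    {"MODEL_ID": "XGBOOST_0", "STATUS": "PASS", "MICRO-F1": 0.966667},
]

# Restrict to trained (PASS) rows, then rank by the evaluation metric.
trained = [r for r in rows if r["STATUS"] == "PASS"]
best = max(trained, key=lambda r: r["MICRO-F1"])
print(best["MODEL_ID"])  # XGBOOST_3
```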