This example shows metrics-based early stopping for the model trainer function XGBoost.
- Define hyperparameter space and RandomSearch with XGBoost.
- Define model training parameters.
>>> model_params = {"input_columns":['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
...                 "response_column" :'species',
...                 "max_depth":(5,10,15),
...                 "lambda1" :(1000.0,0.001),
...                 "model_type" :"Classification",
...                 "seed":32,
...                 "shrinkage_factor":0.1,
...                 "iter_num":(5, 50)}
- Define evaluation parameters.
>>> eval_params = {"id_column": "id",
...                "accumulate": "species",
...                "model_type":'Classification',
...                "object_order_column":['task_index', 'tree_num', 'iter','class_num', 'tree_order']
...               }
- Import model trainer function and optimizer.
>>> from teradataml import XGBoost, RandomSearch
- Initialize the RandomSearch optimizer with the model trainer function and the parameter space required for model training.
>>> rs_obj = RandomSearch(func=XGBoost, params=model_params, n_iter=10)
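To make the size of the search space concrete, the following is a minimal plain-Python sketch of how a parameter dictionary like the one above could be expanded and sampled. It is not the teradataml implementation: it assumes that tuple-valued entries are the hyperparameters to tune, that all other entries are fixed, and that a random search draws n_iter distinct combinations. The helper `expand_space` and the seed passed to `random.Random` are hypothetical.

```python
import itertools
import random

# Parameter space copied from the example; tuples are assumed to mark tuned values.
model_params = {
    "input_columns": ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
    "response_column": 'species',
    "max_depth": (5, 10, 15),
    "lambda1": (1000.0, 0.001),
    "model_type": "Classification",
    "seed": 32,
    "shrinkage_factor": 0.1,
    "iter_num": (5, 50),
}


def expand_space(params):
    """Return every combination of the tuple-valued entries (hypothetical helper)."""
    tuned = {k: v for k, v in params.items() if isinstance(v, tuple)}
    fixed = {k: v for k, v in params.items() if not isinstance(v, tuple)}
    names = list(tuned)
    return [
        {**fixed, **dict(zip(names, values))}
        for values in itertools.product(*(tuned[name] for name in names))
    ]


grid = expand_space(model_params)          # 3 * 2 * 2 = 12 combinations
rng = random.Random(0)                     # arbitrary seed, for repeatability only
sampled = rng.sample(grid, k=min(10, len(grid)))  # n_iter=10 distinct candidates
```

Under these assumptions the space holds 12 combinations, of which 10 are sampled, which matches the ten model IDs (XGBOOST_0 to XGBOOST_9) in the output of this example.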
- Run hyperparameter tuning with early stop. This step runs fit with early_stop and evaluation_metric. early_stop can be used without specifying an evaluation_metric. By default:
- In classification, the evaluation_metric is set to Accuracy.
- In regression, the evaluation_metric is set to MAE.
>>> rs_obj.fit(data=data, evaluation_metric = 'WEIGHTED-F1', early_stop=0.87, verbose=2, **eval_params)
Model_id:XGBOOST_0 - Run time:28.722s - Status:PASS - WEIGHTED-F1:0.901
Model_id:XGBOOST_3 - Run time:28.783s - Status:PASS - WEIGHTED-F1:0.935
Model_id:XGBOOST_1 - Run time:28.804s - Status:PASS - WEIGHTED-F1:0.901
Model_id:XGBOOST_2 - Run time:28.824s - Status:PASS - WEIGHTED-F1:0.901
Computing: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾| 40% - 4/10
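The log stops at 4 of 10 models because the threshold of 0.87 was already met. The control flow can be sketched in plain Python as follows. This is an illustration only, not the teradataml implementation: the batch size of 4 is an assumption read off the log (four models finish at nearly the same time), and `train_and_score` and `search_with_early_stop` are hypothetical names.

```python
# Scores taken from the log; the other six candidates were never trained.
logged_scores = {
    "XGBOOST_0": 0.901,
    "XGBOOST_1": 0.901,
    "XGBOOST_2": 0.901,
    "XGBOOST_3": 0.935,
}
model_ids = [f"XGBOOST_{i}" for i in range(10)]


def train_and_score(model_id):
    """Hypothetical stand-in for training one model and computing WEIGHTED-F1."""
    return logged_scores[model_id]  # raises KeyError if a skipped model is trained


def search_with_early_stop(model_ids, threshold, batch_size):
    """Train candidates batch by batch; skip the rest once the threshold is met.

    Assumes a higher metric is better; an error metric such as MAE would need <=.
    """
    results = {}
    threshold_met = False
    for start in range(0, len(model_ids), batch_size):
        batch = model_ids[start:start + batch_size]
        if threshold_met:
            for model_id in batch:
                results[model_id] = ("SKIP", None)
            continue
        for model_id in batch:
            results[model_id] = ("PASS", train_and_score(model_id))
        threshold_met = any(results[m][1] >= threshold for m in batch)
    return results


results = search_with_early_stop(model_ids, threshold=0.87, batch_size=4)
skipped = [m for m, (status, _) in results.items() if status == "SKIP"]
```

In this sketch the first batch of four models is trained, at least one of them reaches 0.87, and the remaining six candidates are marked SKIP without being trained.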
Early stopping can be combined with various evaluation metrics, not only WEIGHTED-F1.
- View hyperparameter tuning trained model metadata using models and model_stats properties.
- View trained model using models property.
>>> rs_obj.models
    MODEL_ID  DATA_ID                                         PARAMETERS  STATUS  WEIGHTED-F1
0  XGBOOST_0     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    PASS     0.901483
1  XGBOOST_4     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    SKIP          NaN
2  XGBOOST_5     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    SKIP          NaN
3  XGBOOST_6     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    SKIP          NaN
4  XGBOOST_7     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    SKIP          NaN
5  XGBOOST_8     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    SKIP          NaN
6  XGBOOST_9     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    SKIP          NaN
7  XGBOOST_3     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    PASS     0.935065
8  XGBOOST_1     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    PASS     0.901483
9  XGBOOST_2     DF_0  {'input_columns': ['sepal_length', 'sepal_widt...    PASS     0.901483
The status "SKIP" indicates that the model was not trained, because a previously trained model had already reached the desired threshold or a better value of the evaluation metric.
- View additional performance metrics using model_stats property.
>>> rs_obj.model_stats
    MODEL_ID  ACCURACY  MICRO-PRECISION  MICRO-RECALL  MICRO-F1  MACRO-PRECISION  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
0  XGBOOST_0  0.900000         0.900000      0.900000  0.900000         0.879121      0.888889  0.879441            0.912088         0.900000     0.901483
1  XGBOOST_4       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
2  XGBOOST_5       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
3  XGBOOST_6       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
4  XGBOOST_7       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
5  XGBOOST_8       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
6  XGBOOST_9       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
7  XGBOOST_3  0.933333         0.933333      0.933333  0.933333         0.916667      0.944444  0.922078            0.950000         0.933333     0.935065
8  XGBOOST_1  0.900000         0.900000      0.900000  0.900000         0.879121      0.888889  0.879441            0.912088         0.900000     0.901483
9  XGBOOST_2  0.900000         0.900000      0.900000  0.900000         0.879121      0.888889  0.879441            0.912088         0.900000     0.901483
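To show how the micro, macro, and weighted columns relate, here is a small worked sketch in plain Python. The confusion matrix is an assumption: the per-class counts are not part of the output above, and these counts were chosen only because they are consistent with the XGBOOST_0 row. The class order is likewise hypothetical.

```python
# Hypothetical 3-class confusion matrix (rows: actual, columns: predicted).
# The counts are an assumption chosen to be consistent with XGBOOST_0's row.
confusion = [
    [10, 2, 0],
    [0, 5, 1],
    [0, 0, 12],
]
n = len(confusion)
support = [sum(row) for row in confusion]                                # actual count per class
predicted = [sum(confusion[r][c] for r in range(n)) for c in range(n)]   # predicted count per class
total = sum(support)

# Per-class precision, recall, and F1.
precision = [confusion[i][i] / predicted[i] for i in range(n)]
recall = [confusion[i][i] / support[i] for i in range(n)]
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]

accuracy = sum(confusion[i][i] for i in range(n)) / total
macro_f1 = sum(f1) / n                                           # plain average over classes
weighted_f1 = sum(s * f for s, f in zip(support, f1)) / total    # average weighted by support

print(round(accuracy, 6), round(macro_f1, 6), round(weighted_f1, 6))
```

With these assumed counts, accuracy is 0.9, MACRO-F1 rounds to 0.879441, and WEIGHTED-F1 rounds to 0.901483. For single-label multiclass data the micro-averaged precision, recall, and F1 all equal accuracy, which is why the four leftmost metric columns carry the same value in each trained row.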
- View trained model using models property.