This example shows time-based early stopping for the XGBoost model trainer function.
- Define hyperparameter space and RandomSearch with XGBoost.
- Define model training parameters.
>>> model_params = {
...     "input_columns": ['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
...     "response_column": 'species',
...     "max_depth": (5, 10, 15),
...     "lambda1": (1000.0, 0.001),
...     "model_type": "Classification",
...     "seed": 32,
...     "shrinkage_factor": 0.1,
...     "iter_num": (5, 50)
... }
- Define evaluation parameters.
>>> eval_params = {
...     "id_column": "id",
...     "accumulate": "species",
...     "model_type": 'Classification',
...     "object_order_column": ['task_index', 'tree_num', 'iter', 'class_num', 'tree_order']
... }
- Import model trainer function and optimizer.
>>> from teradataml import XGBoost, RandomSearch
- Initialize the RandomSearch optimizer with model trainer function and parameter space required for model training.
>>> rs_obj = RandomSearch(func=XGBoost, params=model_params, n_iter=5)
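The idea behind the parameter space can be sketched in plain Python: tuple-valued entries are the candidate values to search over, scalar entries stay fixed, and the optimizer draws n_iter random combinations. The `sample_candidates` helper below is hypothetical, for illustration only, and is not part of teradataml:

```python
import random

# Hypothetical sketch: draw n_iter random candidates from a search space
# where tuple values are discrete choices and scalars are fixed settings.
def sample_candidates(space, n_iter, seed=32):
    rng = random.Random(seed)
    return [
        {k: rng.choice(v) if isinstance(v, tuple) else v for k, v in space.items()}
        for _ in range(n_iter)
    ]

space = {"max_depth": (5, 10, 15), "shrinkage_factor": 0.1, "iter_num": (5, 50)}
for cand in sample_candidates(space, n_iter=3):
    print(cand)
```

Each printed candidate is one complete hyperparameter combination that would be handed to a training run.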
- Execute hyperparameter tuning with a maximum run time. This step fits the models in parallel, with the max_time argument set to 30 seconds. The max_time argument can also be used with sequential hyperparameter tuning.
>>> rs_obj.fit(data=data, max_time=30, verbose=2, **eval_params)
Model_id:XGBOOST_3 - Run time:28.292s - Status:PASS - ACCURACY:0.8
Model_id:XGBOOST_0 - Run time:28.291s - Status:PASS - ACCURACY:0.867
Model_id:XGBOOST_2 - Run time:28.289s - Status:PASS - ACCURACY:0.867
Model_id:XGBOOST_1 - Run time:28.291s - Status:PASS - ACCURACY:0.867
Computing: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾⫾| 80% - 4/5
As shown in the output, four models are trained, because the maximum number of models trained in parallel is set to 4.
Any additional models are skipped once the maximum time allowed for hyperparameter tuning is reached.
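The time-budget behavior above can be sketched in plain Python. `fit_with_budget` and `slow_train` are hypothetical helpers illustrating the idea, not the teradataml implementation: candidates are trained until the wall-clock budget is exhausted, and the rest are marked SKIP.

```python
import time

# Hypothetical sketch: train candidate models until the wall-clock budget
# (max_time, in seconds) is exhausted; remaining candidates are marked SKIP.
def fit_with_budget(candidates, train_fn, max_time):
    start = time.monotonic()
    results = []
    for cand in candidates:
        if time.monotonic() - start >= max_time:
            results.append((cand, "SKIP", None))   # budget exhausted, not trained
            continue
        score = train_fn(cand)                     # train and evaluate one candidate
        results.append((cand, "PASS", score))
    return results

# Each fake training run takes ~50 ms; with an 80 ms budget the third run is skipped.
def slow_train(cand):
    time.sleep(0.05)
    return 0.9

for cand, status, score in fit_with_budget([1, 2, 3], slow_train, max_time=0.08):
    print(cand, status, score)
```

Note that a candidate already running when the budget expires is allowed to finish; only candidates not yet started are skipped, which matches the PASS/SKIP statuses in the output above.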
- View hyperparameter tuning model metadata using the models and model_stats properties.
- View trained models using the models property.
>>> rs_obj.models
    MODEL_ID DATA_ID                                       PARAMETERS STATUS  ACCURACY
0  XGBOOST_3    DF_0  {'input_columns': ['sepal_length', 'sepal_widt   PASS  0.800000
1  XGBOOST_4    DF_0  {'input_columns': ['sepal_length', 'sepal_widt   SKIP       NaN
2  XGBOOST_0    DF_0  {'input_columns': ['sepal_length', 'sepal_widt   PASS  0.866667
3  XGBOOST_2    DF_0  {'input_columns': ['sepal_length', 'sepal_widt   PASS  0.866667
4  XGBOOST_1    DF_0  {'input_columns': ['sepal_length', 'sepal_widt   PASS  0.866667
The status "SKIP" indicates that the model was not trained because the maximum time limit was reached.
- View additional performance metrics using the model_stats property.
>>> rs_obj.model_stats
    MODEL_ID  ACCURACY  MICRO-PRECISION  MICRO-RECALL  MICRO-F1  MACRO-PRECISION  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
0  XGBOOST_3  0.800000         0.800000      0.800000  0.800000         0.841270      0.766667  0.772339            0.831746         0.800000     0.789433
1  XGBOOST_4       NaN              NaN           NaN       NaN              NaN           NaN       NaN                 NaN              NaN          NaN
2  XGBOOST_0  0.866667         0.866667      0.866667  0.866667         0.904762      0.833333  0.833333            0.904762         0.866667     0.855556
3  XGBOOST_2  0.866667         0.866667      0.866667  0.866667         0.904762      0.833333  0.833333            0.904762         0.866667     0.855556
4  XGBOOST_1  0.866667         0.866667      0.866667  0.866667         0.904762      0.833333  0.833333            0.904762         0.866667     0.855556
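Once the metadata is copied out, the passing models can be ranked locally. A minimal sketch in plain Python, where the `rows` list is hand-copied from the models output above:

```python
# Hand-copied rows from the models table above: (MODEL_ID, STATUS, ACCURACY).
rows = [
    ("XGBOOST_3", "PASS", 0.800000),
    ("XGBOOST_4", "SKIP", None),
    ("XGBOOST_0", "PASS", 0.866667),
    ("XGBOOST_2", "PASS", 0.866667),
    ("XGBOOST_1", "PASS", 0.866667),
]

# Drop skipped models and rank the rest by accuracy, best first.
trained = sorted((r for r in rows if r[1] == "PASS"), key=lambda r: r[2], reverse=True)
best_id, _, best_acc = trained[0]
print(best_id, best_acc)  # → XGBOOST_0 0.866667
```

Skipped models are excluded before ranking because their accuracy is undefined (NaN in the table above).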