This example runs AutoML to obtain the best models, deploys them to the database, and then loads them. The loaded models are used to predict the survival of passengers aboard the RMS Titanic based on various factors and to assess model performance.
- These methods are applicable to all of the following APIs: AutoML, AutoRegressor, AutoClassifier, AutoFraud, and AutoChurn.
- Set the early stopping timer to 360 seconds.
- Set the verbose level to 2 to get a detailed log.
- Load the titanic data and create teradataml DataFrame.
>>> load_example_data('teradataml','titanic')
>>> df = DataFrame('titanic')
- Create an AutoML instance.
>>> aml = AutoML(task_type="Classification",
...              max_runtime_secs=360,
...              verbose=2)
- Fit the data.
>>> aml.fit(df, df.survived)
2025-11-04 03:56:42,064 | INFO | Feature Exploration started
2025-11-04 03:56:42,064 | INFO | Data Overview:
2025-11-04 03:56:42,106 | INFO | Total Rows in the data: 891
2025-11-04 03:56:42,148 | INFO | Total Columns in the data: 12
2025-11-04 03:56:42,737 | INFO | Column Summary:
    ColumnName  Datatype                           NonNullCount  NullCount  BlankCount  ZeroCount  PositiveCount  NegativeCount  NullPercentage  NonNullPercentage
0   name        VARCHAR(1000) CHARACTER SET LATIN  891           0          0.0         NaN        NaN            NaN            0.000000        100.000000
1   ticket      VARCHAR(20) CHARACTER SET LATIN    891           0          0.0         NaN        NaN            NaN            0.000000        100.000000
2   fare        FLOAT                              891           0          NaN         15.0       876.0          0.0            0.000000        100.000000
3   cabin       VARCHAR(20) CHARACTER SET LATIN    204           687        0.0         NaN        NaN            NaN            77.104377       22.895623
4   passenger   INTEGER                            891           0          NaN         0.0        891.0          0.0            0.000000        100.000000
5   sibsp       INTEGER                            891           0          NaN         608.0      283.0          0.0            0.000000        100.000000
6   sex         VARCHAR(20) CHARACTER SET LATIN    891           0          0.0         NaN        NaN            NaN            0.000000        100.000000
7   parch       INTEGER                            891           0          NaN         678.0      213.0          0.0            0.000000        100.000000
8   embarked    VARCHAR(20) CHARACTER SET LATIN    889           2          0.0         NaN        NaN            NaN            0.224467        99.775533
9   age         INTEGER                            714           177        NaN         7.0        707.0          0.0            19.865320       80.134680
10  pclass      INTEGER                            891           0          NaN         0.0        891.0          0.0            0.000000        100.000000
11  survived    INTEGER                            891           0          NaN         549.0      342.0          0.0            0.000000        100.000000
2025-11-04 03:56:43,425 | INFO | Statistics of Data:
   ATTRIBUTE  StatName            StatValue
0  age        MAXIMUM             80.000000
1  age        STANDARD DEVIATION  14.536483
2  age        PERCENTILES(25)     20.000000
3  age        PERCENTILES(50)     28.000000
4  passenger  COUNT               891.000000
5  passenger  MINIMUM             1.000000
6  survived   COUNT               891.000000
7  survived   MINIMUM             0.000000
8  survived   MAXIMUM             1.000000
9  survived   MEAN                0.383838
2025-11-04 03:56:43,572 | INFO | Categorical Columns with their Distinct values:
ColumnName  DistinctValueCount
name        891
sex         2
ticket      681
cabin       147
embarked    3
2025-11-04 03:56:45,873 | INFO | Futile columns in dataset:
   ColumnName
0  ticket
1  name
2025-11-04 03:56:49,151 | INFO | Columns with outlier percentage :-
   ColumnName  OutlierPercentage
0  fare        13.019080
1  parch       23.905724
2  sibsp       5.162738
3  age         20.763187

1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation

2025-11-04 03:56:49,477 | INFO | Feature Engineering started ...
2025-11-04 03:56:49,477 | INFO | Handling duplicate records present in dataset ...
2025-11-04 03:56:49,656 | INFO | Analysis completed. No action taken.
2025-11-04 03:56:49,656 | INFO | Total time to handle duplicate records: 0.18 sec
2025-11-04 03:56:49,656 | INFO | Handling less significant features from data ...
2025-11-04 03:56:52,995 | INFO | Removing Futile columns: ['ticket', 'name']
2025-11-04 03:56:52,995 | INFO | Sample of Data after removing Futile columns:
   passenger  survived  pclass  sex     age   sibsp  parch  fare      cabin  embarked  automl_id
0  244        0         3       male    22.0  0      0      7.1250    None   S         13
1  40         1         3       female  14.0  1      0      11.2417   None   C         10
2  162        1         2       female  40.0  0      0      15.7500   None   S         14
3  469        0         3       male    NaN   0      0      7.7250    None   Q         4
4  326        1         1       female  36.0  0      0      135.6333  C32    C         12
5  122        0         3       male    NaN   0      0      8.0500    None   S         7
6  591        0         3       male    35.0  0      0      7.1250    None   S         11
7  387        0         3       male    1.0   5      2      46.9000   None   S         15
8  61         0         3       male    22.0  0      0      7.2292    None   C         8
9  734        0         2       male    23.0  0      0      13.0000   None   S         6
891 rows X 11 columns
2025-11-04 03:56:53,308 | INFO | Total time to handle less significant features: 3.65 sec
2025-11-04 03:56:53,308 | INFO | Handling Date Features ...
2025-11-04 03:56:53,308 | INFO | Analysis Completed. Dataset does not contain any feature related to dates. No action needed.
2025-11-04 03:56:53,308 | INFO | Total time to handle date features: 0.00 sec
2025-11-04 03:56:53,308 | INFO | Checking Missing values in dataset ...
2025-11-04 03:56:54,331 | INFO | Columns with their missing values:
cabin: 687
age: 177
embarked: 2
2025-11-04 03:56:55,221 | INFO | Deleting rows of these columns for handling missing values: ['embarked']
2025-11-04 03:56:55,407 | INFO | Sample of dataset after removing 2 rows:
   passenger  survived  pclass  sex     age   sibsp  parch  fare      cabin  embarked  automl_id
0  244        0         3       male    22.0  0      0      7.1250    None   S         13
1  40         1         3       female  14.0  1      0      11.2417   None   C         10
2  162        1         2       female  40.0  0      0      15.7500   None   S         14
3  122        0         3       male    NaN   0      0      8.0500    None   S         7
4  387        0         3       male    1.0   5      2      46.9000   None   S         15
5  469        0         3       male    NaN   0      0      7.7250    None   Q         4
6  61         0         3       male    22.0  0      0      7.2292    None   C         8
7  326        1         1       female  36.0  0      0      135.6333  C32    C         12
8  591        0         3       male    35.0  0      0      7.1250    None   S         11
9  734        0         2       male    23.0  0      0      13.0000   None   S         6
889 rows X 11 columns
2025-11-04 03:56:55,752 | INFO | Dropping these columns for handling missing values: ['cabin']
2025-11-04 03:56:55,752 | INFO | Sample of dataset after removing 1 columns:
   passenger  survived  pclass  sex     age   sibsp  parch  fare      embarked  automl_id
0  387        0         3       male    1.0   5      2      46.9000   S         15
1  61         0         3       male    22.0  0      0      7.2292    C         8
2  326        1         1       female  36.0  0      0      135.6333  C         12
3  265        0         3       female  NaN   0      0      7.7500    Q         5
4  244        0         3       male    22.0  0      0      7.1250    S         13
5  734        0         2       male    23.0  0      0      13.0000   S         6
6  40         1         3       female  14.0  1      0      11.2417   C         10
7  162        1         2       female  40.0  0      0      15.7500   S         14
8  530        0         2       male    23.0  2      1      11.5000   S         9
9  469        0         3       male    NaN   0      0      7.7250    Q         4
889 rows X 10 columns
2025-11-04 03:56:56,148 | INFO | Total time to find missing values in data: 2.84 sec
2025-11-04 03:56:56,148 | INFO | Imputing Missing Values ...
2025-11-04 03:56:56,428 | INFO | Columns with their imputation method:
age: mean
2025-11-04 03:56:58,701 | INFO | Sample of dataset after Imputation:
   passenger  survived  pclass  sex     age  sibsp  parch  fare      embarked  automl_id
0  326        1         1       female  36   0      0      135.6333  C         12
1  591        0         3       male    35   0      0      7.1250    S         11
2  387        0         3       male    1    5      2      46.9000   S         15
3  265        0         3       female  29   0      0      7.7500    Q         5
4  244        0         3       male    22   0      0      7.1250    S         13
5  734        0         2       male    23   0      0      13.0000   S         6
6  40         1         3       female  14   1      0      11.2417   C         10
7  162        1         2       female  40   0      0      15.7500   S         14
8  530        0         2       male    23   2      1      11.5000   S         9
9  122        0         3       male    29   0      0      8.0500    S         7
889 rows X 10 columns
2025-11-04 03:56:59,318 | INFO | Time taken to perform imputation: 3.17 sec
2025-11-04 03:56:59,319 | INFO | Performing encoding for categorical columns ...
2025-11-04 03:57:04,854 | INFO | ONE HOT Encoding these Columns: ['sex', 'embarked']
2025-11-04 03:57:04,854 | INFO | Sample of dataset after performing one hot encoding:
passenger  survived  pclass  sex_0  sex_1  age  sibsp  parch  fare    embarked_0  embarked_1  embarked_2  automl_id
387        0         3       0      1      1    5      2      46.900  0           0           1           15
448        1         1       0      1      34   0      0      26.550  0           0           1           23
713        1         1       0      1      48   1      0      52.000  0           0           1           27
19         0         3       1      0      31   1      0      18.000  0           0           1           31
263        0         1       0      1      52   1      1      79.650  0           0           1           39
59         1         2       1      0      5    1      2      27.750  0           0           1           43
753        0         3       0      1      33   0      0      9.500   0           0           1           35
856        1         3       1      0      18   0      1      9.350   0           0           1           19
591        0         3       0      1      35   0      0      7.125   0           0           1           11
122        0         3       0      1      29   0      0      8.050   0           0           1           7
889 rows X 13 columns
2025-11-04 03:57:04,974 | INFO | Time taken to encode the columns: 5.66 sec

1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation

2025-11-04 03:57:04,975 | INFO | Data preparation started ...
2025-11-04 03:57:04,975 | INFO | Outlier preprocessing ...
2025-11-04 03:57:07,996 | INFO | Columns with outlier percentage :-
   ColumnName  OutlierPercentage
0  sibsp       5.174353
1  fare        12.823397
2  age         7.311586
3  parch       23.959505
2025-11-04 03:57:08,427 | INFO | Deleting rows of these columns: ['age', 'sibsp']
2025-11-04 03:57:10,425 | INFO | Sample of dataset after removing outlier rows:
passenger  survived  pclass  sex_0  sex_1  age  sibsp  parch  fare      embarked_0  embarked_1  embarked_2  automl_id
326        1         1       1      0      36   0      0      135.6333  1           0           0           12
652        1         2       1      0      18   0      1      23.0000   0           0           1           24
509        0         3       0      1      28   0      0      22.5250   0           0           1           28
774        0         3       0      1      29   0      0      7.2250    1           0           0           32
467        0         2       0      1      29   0      0      0.0000    0           0           1           48
242        1         3       1      0      29   1      0      15.5000   0           1           0           56
366        0         3       0      1      30   0      0      7.2500    0           0           1           36
795        0         3       0      1      25   0      0      7.8958    0           0           1           16
61         0         3       0      1      22   0      0      7.2292    1           0           0           8
469        0         3       0      1      29   0      0      7.7250    0           1           0           4
785 rows X 13 columns
2025-11-04 03:57:10,558 | INFO | median inplace of outliers: ['parch', 'fare']
2025-11-04 03:57:12,694 | INFO | Sample of dataset after performing MEDIAN inplace:
passenger  survived  pclass  sex_0  sex_1  age  sibsp  parch  fare     embarked_0  embarked_1  embarked_2  automl_id
244        0         3       0      1      22   0      0      7.1250   0           0           1           13
101        0         3       1      0      28   0      0      7.8958   0           0           1           21
570        1         3       0      1      32   0      0      7.8542   0           0           1           25
835        0         3       0      1      18   0      0      8.3000   0           0           1           29
692        1         3       1      0      4    0      0      13.4167  1           0           0           37
284        1         3       0      1      19   0      0      8.0500   0           0           1           41
427        1         2       1      0      28   1      0      26.0000  0           0           1           33
305        0         3       0      1      29   0      0      8.0500   0           0           1           17
530        0         2       0      1      23   2      0      11.5000  0           0           1           9
265        0         3       1      0      29   0      0      7.7500   0           1           0           5
785 rows X 13 columns
2025-11-04 03:57:12,828 | INFO | Time Taken by Outlier processing: 7.85 sec
2025-11-04 03:57:12,829 | INFO | Checking imbalance data ...
2025-11-04 03:57:12,892 | INFO | Imbalance Not Found.
2025-11-04 03:57:13,695 | INFO | Feature selection using rfe ...
2025-11-04 03:57:35,262 | INFO | feature selected by RFE: ['passenger', 'age', 'sex_1', 'pclass', 'sex_0', 'embarked_0', 'sibsp', 'embarked_2', 'fare']
2025-11-04 03:57:35,264 | INFO | Total time taken by feature selection: 21.57 sec
2025-11-04 03:57:35,678 | INFO | Scaling Features of rfe data ...
2025-11-04 03:57:37,134 | INFO | columns that will be scaled: ['r_passenger', 'r_age', 'r_pclass', 'r_sibsp', 'r_fare']
2025-11-04 03:57:39,044 | INFO | Dataset sample after scaling:
   r_embarked_0  survived  r_sex_1  r_sex_0  automl_id  r_embarked_2  r_passenger  r_age     r_pclass  r_sibsp  r_fare
0  0             0         1        0        6          1             0.823596     0.392157  0.5       0.0      0.228070
1  1             0         1        0        8          0             0.067416     0.372549  1.0       0.0      0.126828
2  0             0         1        0        9          1             0.594382     0.392157  0.5       1.0      0.201754
3  1             1         0        1        10         0             0.043820     0.215686  1.0       0.5      0.197223
4  1             1         0        1        12         0             0.365169     0.647059  0.0       0.0      0.228070
5  0             0         1        0        13         1             0.273034     0.372549  1.0       0.0      0.125000
6  0             0         1        0        11         1             0.662921     0.627451  1.0       0.0      0.125000
7  0             0         1        0        7          1             0.135955     0.509804  1.0       0.0      0.141228
8  0             0         0        1        5          0             0.296629     0.509804  1.0       0.0      0.135965
9  0             0         1        0        4          0             0.525843     0.509804  1.0       0.0      0.135526
785 rows X 11 columns
2025-11-04 03:57:39,634 | INFO | Total time taken by feature scaling: 3.96 sec
2025-11-04 03:57:39,634 | INFO | Scaling Features of pca data ...
2025-11-04 03:57:40,725 | INFO | columns that will be scaled: ['passenger', 'pclass', 'age', 'sibsp', 'fare']
2025-11-04 03:57:42,663 | INFO | Dataset sample after scaling:
   survived  parch  embarked_0  sex_1  sex_0  embarked_1  automl_id  embarked_2  passenger  pclass  age       sibsp  fare
0  1         0      0           0      1      0           14         1           0.180899   0.5     0.725490  0.0    0.276316
1  0         0      1           1      0      0           8          0           0.067416   1.0     0.372549  0.0    0.126828
2  1         0      1           0      1      0           12         0           0.365169   0.0     0.647059  0.0    0.228070
3  0         0      0           1      0      0           7          1           0.135955   1.0     0.509804  0.0    0.141228
4  1         0      0           0      1      0           19         1           0.960674   1.0     0.294118  0.0    0.164035
5  0         0      0           0      1      1           5          0           0.296629   1.0     0.509804  0.0    0.135965
6  0         0      0           1      0      0           9          1           0.594382   0.5     0.392157  1.0    0.201754
7  0         0      0           1      0      0           13         1           0.273034   1.0     0.372549  0.0    0.125000
8  0         0      0           1      0      0           11         1           0.662921   1.0     0.627451  0.0    0.125000
9  0         0      0           1      0      1           4          0           0.525843   1.0     0.509804  0.0    0.135526
785 rows X 13 columns
2025-11-04 03:57:43,271 | INFO | Total time taken by feature scaling: 3.64 sec
2025-11-04 03:57:43,271 | INFO | Dimension Reduction using pca ...
2025-11-04 03:57:43,888 | INFO | PCA columns: ['col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_5']
2025-11-04 03:57:43,889 | INFO | Total time taken by PCA: 0.62 sec

1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation

2025-11-04 03:57:44,323 | INFO | Model Training started ...
2025-11-04 03:57:44,366 | INFO | Hyperparameters used for model training:
2025-11-04 03:57:44,367 | INFO | Model: glm
2025-11-04 03:57:44,367 | INFO | Hyperparameters: {'response_column': 'survived', 'name': 'glm', 'family': 'BINOMIAL', 'lambda1': (0.001, 0.02, 0.1), 'alpha': (0.15, 0.85), 'learning_rate': 'OPTIMAL', 'initial_eta': (0.05, 0.1), 'momentum': (0.65, 0.8, 0.95), 'iter_num_no_change': (5, 10, 50), 'iter_max': (300, 200, 400), 'batch_size': (10, 50, 60, 80)}
2025-11-04 03:57:44,367 | INFO | Total number of models for glm: 1296
----------------------------------------------------------------------------------------------------
2025-11-04 03:57:44,368 | INFO | Model: svm
2025-11-04 03:57:44,368 | INFO | Hyperparameters: {'response_column': 'survived', 'name': 'svm', 'model_type': 'Classification', 'lambda1': (0.001, 0.02, 0.1), 'alpha': (0.15, 0.85), 'tolerance': (0.001, 0.01), 'learning_rate': 'OPTIMAL', 'initial_eta': (0.05, 0.1), 'momentum': (0.65, 0.8, 0.95), 'nesterov': True, 'intercept': True, 'iter_num_no_change': (5, 10, 50), 'local_sgd_iterations ': (10, 20), 'iter_max': (300, 200, 400), 'batch_size': (10, 50, 60, 80)}
2025-11-04 03:57:44,369 | INFO | Total number of models for svm: 5184
----------------------------------------------------------------------------------------------------
2025-11-04 03:57:44,370 | INFO | Model: knn
2025-11-04 03:57:44,370 | INFO | Hyperparameters: {'response_column': 'survived', 'name': 'knn', 'model_type': 'Classification', 'k': (3, 5, 6, 8, 10, 12), 'id_column': 'automl_id', 'voting_weight': 1.0}
2025-11-04 03:57:44,370 | INFO | Total number of models for knn: 6
----------------------------------------------------------------------------------------------------
2025-11-04 03:57:44,370 | INFO | Model: decision_forest
2025-11-04 03:57:44,370 | INFO | Hyperparameters: {'response_column': 'survived', 'name': 'decision_forest', 'tree_type': 'Classification', 'min_impurity': (0.0, 0.1, 0.2), 'max_depth': (5, 6, 8, 10), 'min_node_size': (1, 2, 3), 'num_trees': (-1,), 'seed': 42}
2025-11-04 03:57:44,370 | INFO | Total number of models for decision_forest: 36
----------------------------------------------------------------------------------------------------
2025-11-04 03:57:44,370 | INFO | Model: xgboost
2025-11-04 03:57:44,370 | INFO | Hyperparameters: {'response_column': 'survived', 'name': 'xgboost', 'model_type': 'Classification', 'column_sampling': (1, 0.6), 'min_impurity': (0.0, 0.1, 0.2), 'lambda1': (1.0, 0.01, 0.1), 'shrinkage_factor': (0.5, 0.1, 0.3), 'max_depth': (5, 6, 8, 10), 'min_node_size': (1, 2, 3), 'iter_num': (10, 20, 30), 'num_boosted_trees': (-1, 5, 10), 'seed': 42}
2025-11-04 03:57:44,372 | INFO | Total number of models for xgboost: 5832
----------------------------------------------------------------------------------------------------
2025-11-04 03:57:44,372 | INFO | Performing hyperparameter tuning ...
2025-11-04 03:57:45,606 | INFO | Model training for glm
2025-11-04 03:58:52,451 | INFO | ----------------------------------------------------------------------------------------------------
2025-11-04 03:58:52,452 | INFO | Model training for svm
2025-11-04 04:00:00,063 | INFO | ----------------------------------------------------------------------------------------------------
2025-11-04 04:00:00,063 | INFO | Model training for knn
2025-11-04 04:01:35,679 | INFO | ----------------------------------------------------------------------------------------------------
2025-11-04 04:01:35,680 | INFO | Model training for decision_forest
2025-11-04 04:02:44,755 | INFO | ----------------------------------------------------------------------------------------------------
2025-11-04 04:02:44,755 | INFO | Model training for xgboost
2025-11-04 04:03:52,610 | INFO | ----------------------------------------------------------------------------------------------------
2025-11-04 04:03:52,612 | INFO | Leaderboard
     RANK  MODEL_ID           FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
0    1     XGBOOST_18         rfe                0.840764  0.840764         ...  0.837606      0.834730  0.842595            0.840764         0.841368
1    2     XGBOOST_24         rfe                0.840764  0.840764         ...  0.837606      0.834730  0.842595            0.840764         0.841368
2    3     XGBOOST_30         rfe                0.840764  0.840764         ...  0.837606      0.834730  0.842595            0.840764         0.841368
3    4     DECISIONFOREST_10  rfe                0.828025  0.828025         ...  0.815874      0.818482  0.827064            0.828025         0.827230
4    5     DECISIONFOREST_8   rfe                0.828025  0.828025         ...  0.815874      0.818482  0.827064            0.828025         0.827230
..   ...   ...                ...                ...       ...              ...  ...           ...       ...                 ...              ...
130  131   SVM_2              rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
131  132   SVM_10             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
132  133   SVM_18             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
133  134   SVM_26             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
134  135   SVM_34             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
[135 rows x 13 columns]
135 rows X 13 columns

1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 16/16
- Display leaderboard.
>>> aml.leaderboard()
     RANK  MODEL_ID           FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
0    1     XGBOOST_18         rfe                0.840764  0.840764         ...  0.837606      0.834730  0.842595            0.840764         0.841368
1    2     XGBOOST_24         rfe                0.840764  0.840764         ...  0.837606      0.834730  0.842595            0.840764         0.841368
2    3     XGBOOST_30         rfe                0.840764  0.840764         ...  0.837606      0.834730  0.842595            0.840764         0.841368
3    4     DECISIONFOREST_10  rfe                0.828025  0.828025         ...  0.815874      0.818482  0.827064            0.828025         0.827230
4    5     DECISIONFOREST_8   rfe                0.828025  0.828025         ...  0.815874      0.818482  0.827064            0.828025         0.827230
..   ...   ...                ...                ...       ...              ...  ...           ...       ...                 ...              ...
130  131   SVM_2              rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
131  132   SVM_10             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
132  133   SVM_18             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
133  134   SVM_26             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
134  135   SVM_34             rfe                0.649682  0.649682         ...  0.559253      0.499507  0.735344            0.649682         0.557132
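The fit log reports a "Total number of models" figure for each model family. Those figures appear to equal the number of combinations of the tuple-valued hyperparameters listed in the same log. The following sketch recomputes them in plain Python; the dictionaries are copied from the log (scalar settings omitted), and the helper name `grid_size` is introduced here for illustration only and is not part of teradataml.

```python
from math import prod

# Tuple-valued hyperparameters copied from the fit log.
# Scalar settings (for example 'seed': 42) do not add combinations and are omitted.
grids = {
    "glm": {
        "lambda1": (0.001, 0.02, 0.1), "alpha": (0.15, 0.85),
        "initial_eta": (0.05, 0.1), "momentum": (0.65, 0.8, 0.95),
        "iter_num_no_change": (5, 10, 50), "iter_max": (300, 200, 400),
        "batch_size": (10, 50, 60, 80),
    },
    "svm": {
        "lambda1": (0.001, 0.02, 0.1), "alpha": (0.15, 0.85),
        "tolerance": (0.001, 0.01), "initial_eta": (0.05, 0.1),
        "momentum": (0.65, 0.8, 0.95), "iter_num_no_change": (5, 10, 50),
        "local_sgd_iterations": (10, 20), "iter_max": (300, 200, 400),
        "batch_size": (10, 50, 60, 80),
    },
    "knn": {"k": (3, 5, 6, 8, 10, 12)},
    "decision_forest": {
        "min_impurity": (0.0, 0.1, 0.2), "max_depth": (5, 6, 8, 10),
        "min_node_size": (1, 2, 3), "num_trees": (-1,),
    },
    "xgboost": {
        "column_sampling": (1, 0.6), "min_impurity": (0.0, 0.1, 0.2),
        "lambda1": (1.0, 0.01, 0.1), "shrinkage_factor": (0.5, 0.1, 0.3),
        "max_depth": (5, 6, 8, 10), "min_node_size": (1, 2, 3),
        "iter_num": (10, 20, 30), "num_boosted_trees": (-1, 5, 10),
    },
}


def grid_size(params):
    """Hypothetical helper: number of combinations in a full grid."""
    return prod(len(values) for values in params.values())


sizes = {name: grid_size(params) for name, params in grids.items()}
for name, size in sizes.items():
    print(f"{name}: {size}")
```

The computed sizes (1296, 5184, 6, 36, and 5832) match the totals printed in the log.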
- Display best performing model.
>>> aml.leader()
   RANK  MODEL_ID    FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
0  1     XGBOOST_18  rfe                0.840764  0.840764         ...  0.837606      0.83473   0.842595            0.840764         0.841368
[1 rows x 13 columns]
- Display model hyperparameters for trained model.
>>> aml.model_hyperparameters(rank=1)
{'response_column': 'survived', 'name': 'xgboost', 'model_type': 'Classification', 'column_sampling': 1, 'min_impurity': 0.0, 'lambda1': 1.0, 'shrinkage_factor': 0.5, 'max_depth': 5, 'min_node_size': 2, 'iter_num': 10, 'num_boosted_trees': -1, 'seed': 42, 'persist': False, 'output_prob': True, 'output_responses': ['1', '0']}
>>> aml.model_hyperparameters(rank=3)
{'response_column': 'survived', 'name': 'xgboost', 'model_type': 'Classification', 'column_sampling': 1, 'min_impurity': 0.0, 'lambda1': 1.0, 'shrinkage_factor': 0.5, 'max_depth': 5, 'min_node_size': 2, 'iter_num': 30, 'num_boosted_trees': -1, 'seed': 42, 'persist': False, 'output_prob': True, 'output_responses': ['1', '0']}
- Deploy models to the database using one of the following methods:
- Using top_n argument:
>>> aml.deploy(table_name='top_models', top_n=10)
Model Deployment Completed Successfully.
- Passing list of ranks using ranks argument:
>>> aml.deploy(table_name='mixed_models', ranks=[2,4,6])
Model Deployment Completed Successfully.
- Passing range object using ranks argument:
>>> aml.deploy(table_name='ranged_models', ranks=range(3,10))
Model Deployment Completed Successfully.
- Load models using one of the following methods:
- Create an instance of AutoML object in a separate session:
- Create an instance of AutoML.
>>> aml_tp = AutoML()
- Load models using table name.
>>> top_models = aml_tp.load(table_name="top_models")
>>> top_models
   RANK  MODEL_ID           FEATURE_SELECTION  ACCURACY  ...  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1  DATA_TABLE
0  1     XGBOOST_18         rfe                0.840764  ...  0.842595            0.840764         0.841368     ml__survived_rfe_1762261663015435
1  2     XGBOOST_24         rfe                0.840764  ...  0.842595            0.840764         0.841368     ml__survived_rfe_1762261663015435
2  3     XGBOOST_30         rfe                0.840764  ...  0.842595            0.840764         0.841368     ml__survived_rfe_1762261663015435
3  4     DECISIONFOREST_10  rfe                0.828025  ...  0.827064            0.828025         0.827230     ml__survived_rfe_1762261663015435
4  5     DECISIONFOREST_8   rfe                0.828025  ...  0.827064            0.828025         0.827230     ml__survived_rfe_1762261663015435
5  6     XGBOOST_0          rfe                0.821656  ...  0.821656            0.821656         0.821656     ml__survived_rfe_1762261663015435
6  7     XGBOOST_6          rfe                0.821656  ...  0.821656            0.821656         0.821656     ml__survived_rfe_1762261663015435
7  8     XGBOOST_12         rfe                0.821656  ...  0.821656            0.821656         0.821656     ml__survived_rfe_1762261663015435
8  9     DECISIONFOREST_6   rfe                0.821656  ...  0.820502            0.821656         0.820522     ml__survived_rfe_1762261663015435
9  10    XGBOOST_20         rfe                0.821656  ...  0.820502            0.821656         0.820522     ml__survived_rfe_1762261663015435
[10 rows x 14 columns]
- Generate prediction and evaluation metrics.
>>> test_data = df.iloc[:200]
Generate predictions using rank.
>>> preds = aml_tp.predict(data=test_data, rank=2)
2025-11-04 04:11:40,729 | INFO | Generating prediction using:
2025-11-04 04:11:40,729 | INFO | Model Name: XGBOOST
2025-11-04 04:11:40,729 | INFO | Feature Selection: rfe
Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 9/9
Generate a predictions sample.
>>> preds
   automl_id  Prediction  Prob_1    Prob_0    survived
0  304        0           0.277101  0.722899  0
1  372        0           0.416882  0.583118  0
2  532        0           0.180903  0.819097  0
3  524        0           0.138616  0.861384  0
4  20         0           0.279118  0.720882  0
5  744        0           0.424129  0.575871  0
6  752        1           0.730426  0.269574  1
7  12         1           0.770348  0.229652  1
8  80         1           0.740828  0.259172  1
9  72         1           0.669231  0.330769  1
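In the sample above, each Prediction value coincides with the class that has the higher probability. The sketch below checks this on the ten displayed rows and compares the predictions with the survived column. It is plain Python over values copied from the sample; the rule "pick the class with the larger probability" is an assumption inferred from these rows, not something the source states, and `label_from_probs` is a hypothetical helper.

```python
# Rows copied from the prediction sample:
# (automl_id, Prediction, Prob_1, Prob_0, survived)
rows = [
    (304, 0, 0.277101, 0.722899, 0),
    (372, 0, 0.416882, 0.583118, 0),
    (532, 0, 0.180903, 0.819097, 0),
    (524, 0, 0.138616, 0.861384, 0),
    (20, 0, 0.279118, 0.720882, 0),
    (744, 0, 0.424129, 0.575871, 0),
    (752, 1, 0.730426, 0.269574, 1),
    (12, 1, 0.770348, 0.229652, 1),
    (80, 1, 0.740828, 0.259172, 1),
    (72, 1, 0.669231, 0.330769, 1),
]


def label_from_probs(prob_1, prob_0):
    """Hypothetical rule: choose the class with the larger probability."""
    return 1 if prob_1 > prob_0 else 0


# Does the assumed rule reproduce the Prediction column on this sample?
rule_matches = all(label_from_probs(p1, p0) == pred for _, pred, p1, p0, _ in rows)

# Fraction of the displayed rows where Prediction equals survived.
sample_accuracy = sum(pred == actual for _, pred, _, _, actual in rows) / len(rows)

print("rule reproduces Prediction column:", rule_matches)
print("accuracy on the displayed rows:", sample_accuracy)
```

Ten rows are only a display sample, so the accuracy on these rows says little about the model; the evaluate step reports metrics over the full test data.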
Generate evaluation metrics.
>>> metrics = aml_tp.evaluate(data=test_data, rank=2)
2025-11-04 04:12:25,854 | INFO | Generating performance metrics using: 2025-11-04 04:12:25,854 | INFO | Model Name: XGBOOST 2025-11-04 04:12:25,854 | INFO | Feature Selection: rfe
>>> metrics
############ output_data Output ############
   SeqNum  Metric              MetricValue
0  3       Micro-Recall        0.804020
1  5       Macro-Precision     0.797470
2  6       Macro-Recall        0.829928
3  7       Macro-F1            0.797389
4  9       Weighted-Recall     0.804020
5  10      Weighted-F1         0.808993
6  8       Weighted-Precision  0.843323
7  4       Micro-F1            0.804020
8  2       Micro-Precision     0.804020
9  1       Accuracy            0.804020
############ result Output ############
SeqNum  Prediction  Mapping  CLASS_1  CLASS_2  Precision  Recall    F1        Support
1       1           CLASS_2  33       62       0.652632   0.911765  0.760736  68
0       0           CLASS_1  98       6        0.942308   0.748092  0.834043  131
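The summary metrics can be reproduced from the four counts in the result table. The sketch below assumes that each row of that table is a predicted class and that the CLASS_1 and CLASS_2 columns hold the actual classes 0 and 1; this reading is an inference from the reported precision, recall, and support values. It uses plain Python only, and all variable names are illustrative.

```python
# counts[predicted][actual], read from the "result" table:
# predicted 0 -> CLASS_1=98, CLASS_2=6; predicted 1 -> CLASS_1=33, CLASS_2=62
counts = {0: {0: 98, 1: 6}, 1: {0: 33, 1: 62}}
classes = [0, 1]

total = sum(sum(row.values()) for row in counts.values())
# Support: number of actual members of each class.
support = {c: sum(counts[p][c] for p in classes) for c in classes}

# Per-class precision, recall, and F1.
precision = {c: counts[c][c] / sum(counts[c].values()) for c in classes}
recall = {c: counts[c][c] / support[c] for c in classes}
f1 = {c: 2 * precision[c] * recall[c] / (precision[c] + recall[c]) for c in classes}

accuracy = sum(counts[c][c] for c in classes) / total


def macro(metric):
    """Unweighted mean over classes."""
    return sum(metric[c] for c in classes) / len(classes)


def weighted(metric):
    """Mean over classes weighted by support."""
    return sum(metric[c] * support[c] for c in classes) / total


print(f"Accuracy           {accuracy:.6f}")
print(f"Macro-Precision    {macro(precision):.6f}")
print(f"Macro-Recall       {macro(recall):.6f}")
print(f"Macro-F1           {macro(f1):.6f}")
print(f"Weighted-Precision {weighted(precision):.6f}")
print(f"Weighted-F1        {weighted(f1):.6f}")
```

Under that reading, the computed values agree with the output_data table to six decimal places, for example 160 / 199 ≈ 0.804020 for Accuracy.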
- Using existing AutoML object in the same session.
- Load models using table name.
>>> range_models = aml.load(table_name="ranged_models")
>>> range_models
   RANK  MODEL_ID           FEATURE_SELECTION  ACCURACY  ...  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1  DATA_TABLE
0  1     XGBOOST_30         rfe                0.840764  ...  0.842595            0.840764         0.841368     ml__survived_rfe_1762260166279889
1  2     DECISIONFOREST_10  rfe                0.828025  ...  0.827064            0.828025         0.827230     ml__survived_rfe_1762260166279889
2  3     DECISIONFOREST_8   rfe                0.828025  ...  0.827064            0.828025         0.827230     ml__survived_rfe_1762260166279889
3  4     XGBOOST_0          rfe                0.821656  ...  0.821656            0.821656         0.821656     ml__survived_rfe_1762260166279889
4  5     XGBOOST_6          rfe                0.821656  ...  0.821656            0.821656         0.821656     ml__survived_rfe_1762260166279889
5  6     XGBOOST_12         rfe                0.821656  ...  0.821656            0.821656         0.821656     ml__survived_rfe_1762260166279889
6  7     DECISIONFOREST_6   rfe                0.821656  ...  0.820502            0.821656         0.820522     ml__survived_rfe_1762260166279889
7  8     XGBOOST_20         rfe                0.821656  ...  0.820502            0.821656         0.820522     ml__survived_rfe_1762260166279889
- Generate prediction and evaluation metrics. Both predict and evaluate work with fitted models and with loaded models. To use loaded models within the same session through an existing AutoML instance, set the parameter use_loaded_models to True. Otherwise, the fitted models are used to generate predictions and evaluation metrics.
>>> test_data = df.iloc[:200]
Generate predictions using rank.
>>> preds = aml.predict(data=test_data, rank=3, use_loaded_models=True)
2025-11-04 04:16:15,112 | INFO | Data Transformation started ...
2025-11-04 04:16:15,112 | INFO | Performing transformation carried out in feature engineering phase ...
2025-11-04 04:16:16,003 | INFO | Updated dataset after dropping futile columns :
   passenger  survived  pclass  sex     age   sibsp  parch  fare     cabin  embarked  automl_id
0  3          1         3       female  26.0  0      0      7.9250   None   S         12
1  5          0         3       male    35.0  0      0      8.0500   None   S         20
2  6          0         3       male    NaN   0      0      8.4583   None   Q         24
3  7          0         1       male    54.0  0      0      51.8625  E46    S         28
4  9          1         3       female  27.0  0      2      11.1333  None   S         36
5  10         1         2       female  14.0  1      0      30.0708  None   C         40
6  8          0         3       male    2.0   3      1      21.0750  None   S         32
7  4          1         1       female  35.0  1      0      53.1000  C123   S         16
8  2          1         1       female  38.0  1      0      71.2833  C85    C         8
9  1          0         3       male    22.0  1      0      7.2500   None   S         4
200 rows X 11 columns
2025-11-04 04:16:16,561 | INFO | Updated dataset after performing target column transformation :
   passenger  survived  pclass  sex     age   sibsp  parch  fare     cabin  embarked  automl_id
0  3          1         3       female  26.0  0      0      7.9250   None   S         12
1  5          0         3       male    35.0  0      0      8.0500   None   S         20
2  6          0         3       male    NaN   0      0      8.4583   None   Q         24
3  7          0         1       male    54.0  0      0      51.8625  E46    S         28
4  9          1         3       female  27.0  0      2      11.1333  None   S         36
5  10         1         2       female  14.0  1      0      30.0708  None   C         40
6  8          0         3       male    2.0   3      1      21.0750  None   S         32
7  4          1         1       female  35.0  1      0      53.1000  C123   S         16
8  2          1         1       female  38.0  1      0      71.2833  C85    C         8
9  1          0         3       male    22.0  1      0      7.2500   None   S         4
200 rows X 11 columns
2025-11-04 04:16:17,051 | INFO | Updated dataset after dropping missing value containing columns :
   passenger  survived  pclass  sex     age   sibsp  parch  fare     embarked  automl_id
0  3          1         3       female  26.0  0      0      7.9250   S         12
1  5          0         3       male    35.0  0      0      8.0500   S         20
2  6          0         3       male    NaN   0      0      8.4583   Q         24
3  7          0         1       male    54.0  0      0      51.8625  S         28
4  9          1         3       female  27.0  0      2      11.1333  S         36
5  10         1         2       female  14.0  1      0      30.0708  C         40
6  8          0         3       male    2.0   3      1      21.0750  S         32
7  4          1         1       female  35.0  1      0      53.1000  S         16
8  2          1         1       female  38.0  1      0      71.2833  C         8
9  1          0         3       male    22.0  1      0      7.2500   S         4
200 rows X 10 columns
2025-11-04 04:16:18,469 | INFO | Updated dataset after imputing missing value containing columns :
   passenger  survived  pclass  sex     age  sibsp  parch  fare     embarked  automl_id
0  3          1         3       female  26   0      0      7.9250   S         12
1  5          0         3       male    35   0      0      8.0500   S         20
2  6          0         3       male    29   0      0      8.4583   Q         24
3  7          0         1       male    54   0      0      51.8625  S         28
4  9          1         3       female  27   0      2      11.1333  S         36
5  10         1         2       female  14   1      0      30.0708  C         40
6  8          0         3       male    2    3      1      21.0750  S         32
7  4          1         1       female  35   1      0      53.1000  S         16
8  2          1         1       female  38   1      0      71.2833  C         8
9  1          0         3       male    22   1      0      7.2500   S         4
200 rows X 10 columns
2025-11-04 04:16:20,014 | INFO | Found additional 1 rows that contain missing values :
   passenger  survived  pclass  sex     age  sibsp  parch  fare     embarked  automl_id
0  3          1         3       female  26   0      0      7.9250   S         12
1  5          0         3       male    35   0      0      8.0500   S         20
2  6          0         3       male    29   0      0      8.4583   Q         24
3  7          0         1       male    54   0      0      51.8625  S         28
4  9          1         3       female  27   0      2      11.1333  S         36
5  10         1         2       female  14   1      0      30.0708  C         40
6  8          0         3       male    2    3      1      21.0750  S         32
7  4          1         1       female  35   1      0      53.1000  S         16
8  2          1         1       female  38   1      0      71.2833  C         8
9  1          0         3       male    22   1      0      7.2500   S         4
200 rows X 10 columns
2025-11-04 04:16:20,733 | INFO | Updated dataset after dropping additional missing value containing rows :
   passenger  survived  pclass  sex     age  sibsp  parch  fare     embarked  automl_id
0  3          1         3       female  26   0      0      7.9250   S         12
1  5          0         3       male    35   0      0      8.0500   S         20
2  6          0         3       male    29   0      0      8.4583   Q         24
3  7          0         1       male    54   0      0      51.8625  S         28
4  9          1         3       female  27   0      2      11.1333  S         36
5  10         1         2       female  14   1      0      30.0708  C         40
6  8          0         3       male    2    3      1      21.0750  S         32
7  4          1         1       female  35   1      0      53.1000  S         16
8  2          1         1       female  38   1      0      71.2833  C         8
9  1          0         3       male    22   1      0      7.2500   S         4
199 rows X 10 columns
2025-11-04 04:16:25,440 | INFO | Updated dataset after performing categorical encoding :
passenger  survived  pclass  sex_0  sex_1  age  sibsp  parch  fare     embarked_0  embarked_1  embarked_2  automl_id
80         1         3       1      0      30   0      0      12.4750  0           0           1           320
200        0         2       1      0      24   0      0      13.0000  0           0           1           800
57         1         2       1      0      21   0      0      10.5000  0           0           1           228
118        0         2       0      1      29   1      0      21.0000  0           0           1           472
55         0         1       0      1      65   0      1      61.9792  1           0           0           220
95         0         3       0      1      59   0      0      7.2500   0           0           1           380
158        0         3       0      1      30   0      0      8.0500   0           0           1           632
120        0         3       1      0      2    4      2      31.2750  0           0           1           480
162        1         2       1      0      40   0      0      15.7500  0           0           1           648
40         1         3       1      0      14   1      0      11.2417  1           0           0           160
199 rows X 13 columns
2025-11-04 04:16:25,657 | INFO | Performing transformation carried out in data preparation phase ...
2025-11-04 04:16:26,692 | INFO | Updated dataset after performing RFE feature selection:
survived  automl_id  passenger  age  sex_1  pclass  sex_0  embarked_0  sibsp  embarked_2  fare
0         380        95         59   1      3       0      0           0      1           7.2500
0         288        72         16   0      3       1      0           5      1           46.9000
0         448        112        14   0      3       1      1           1      0           14.4542
0         600        150        42   1      2       0      0           0      1           13.0000
0         760        190        36   1      3       0      0           0      1           7.8958
0         508        127        29   1      3       0      0           0      0           7.7500
1         320        80         30   0      3       1      0           0      1           12.4750
1         692        173        1    0      3       1      0           1      1           11.1333
1         440        110        29   0      3       1      0           1      0           24.1500
1         668        167        29   0      1       1      0           0      1           55.0000
199 rows X 11 columns
2025-11-04 04:16:27,790 | INFO | Updated dataset after performing scaling on RFE selected features :
   survived  r_embarked_0  r_sex_1  r_sex_0  automl_id  r_embarked_2  r_passenger  r_age     r_pclass  r_sibsp  r_fare
0  1         0             0        1        440        0             0.122472     0.509804  1.0       0.5      0.423684
1  1         1             1        0        148        0             0.040449     0.509804  1.0       0.0      0.126828
2  1         0             0        1        216        1             0.059551     0.509804  0.5       0.5      0.456140
3  1         0             0        1        48         1             0.012360     1.078431  0.0       0.0      0.465789
4  1         0             0        1        428        1             0.119101     0.352941  1.0       0.0      0.134211
5  1         0             1        0        588        1             0.164045     0.470588  1.0       0.0      0.136768
6  0         0             1        0        472        1             0.131461     0.509804  0.5       0.5      0.368421
7  0         1             1        0        220        0             0.060674     1.215686  0.0       0.0      1.087354
8  0         0             1        0        380        1             0.105618     1.098039  1.0       0.0      0.127193
9  0         0             1        0        540        1             0.150562     0.431373  0.5       0.0      0.228070
199 rows X 11 columns
2025-11-04 04:16:29,295 | INFO | Updated dataset after performing scaling for PCA feature selection :
   survived  parch  embarked_0  sex_1  sex_0  embarked_1  automl_id  embarked_2  passenger  pclass  age       sibsp  fare
0  1         0      0           0      1      1           440        0           0.122472   1.0     0.509804  0.5    0.423684
1  1         0      1           1      0      0           148        0           0.040449   1.0     0.509804  0.0    0.126828
2  1         0      0           0      1      0           216        1           0.059551   0.5     0.509804  0.5    0.456140
3  1         0      0           0      1      0           48         1           0.012360   0.0     1.078431  0.0    0.465789
4  1         0      0           0      1      0           428        1           0.119101   1.0     0.352941  0.0    0.134211
5  1         0      0           1      0      0           588        1           0.164045   1.0     0.470588  0.0    0.136768
6  0         0      0           1      0      0           472        1           0.131461   0.5     0.509804  0.5    0.368421
7  0         1      1           1      0      0           220        0           0.060674   0.0     1.215686  0.0    1.087354
8  0         0      0           1      0      0           380        1           0.105618   1.0     1.098039  0.0    0.127193
9  0         0      0           1      0      0           540        1           0.150562   0.5     0.431373  0.0    0.228070
199 rows X 13 columns
2025-11-04 04:16:29,806 | INFO | Updated dataset after performing PCA feature selection :
   automl_id  col_0      col_1      col_2      col_3      col_4      col_5      survived
0  320        -0.656874  0.673082   0.380152   -0.293068  -0.292006  -0.270122  1
1  472        0.503643   0.142031   -0.289854  0.029587   -0.487313  0.243953   0
2  692        -0.702411  0.687208   0.403352   -0.382652  -0.312103  0.237384   1
3  220        -0.040294  -1.194752  -0.859080  0.149510   -0.453040  -0.164186  0
4  440        -1.047373  -0.132787  0.872103   0.641029   -0.481628  0.306012   1
5  380        0.650788   0.160572   0.157318   -0.096347  -0.343791  -0.222092  0
6  668        -0.865475  0.615017   -0.630070  0.198409   -0.308723  -0.237410  1
7  540        0.554400   0.130225   -0.195804  0.014521   -0.339744  -0.239373  0
8  148        0.174425   -1.135319  0.293498   -0.470572  -0.349895  -0.192387  1
9  288        -0.909726  0.736056   0.015749   -0.322403  -0.942484  2.179461   0
10 rows X 8 columns
2025-11-04 04:16:30,200 | INFO | Data Transformation completed.
⫿⫿⫿⫿⫿| 100% - 9/9
2025-11-04 04:16:30,787 | INFO | Following model is being picked for evaluation:
2025-11-04 04:16:30,788 | INFO | Model ID : XGBOOST_30
2025-11-04 04:16:30,788 | INFO | Feature Selection Method : rfe
2025-11-04 04:16:31,361 | INFO | Applying SHAP for Model Interpretation...
2025-11-04 04:16:33,742 | INFO | SHAP Analysis Completed. Feature Importance Available.
/root/automl_testing/pyTeradata/teradataml/automl/model_evaluation.py:380: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
  plt.show()
2025-11-04 04:16:33,881 | INFO | Prediction :
   automl_id  Prediction  survived  prob_0    prob_1
0  440        1           1         0.276788  0.723212
1  148        1           1         0.497701  0.502299
2  216        1           1         0.060608  0.939392
3  48         1           1         0.029273  0.970727
4  428        1           1         0.131228  0.868772
5  588        0           1         0.525762  0.474238
6  472        0           0         0.858917  0.141083
7  220        1           0         0.456566  0.543434
8  380        0           0         0.722899  0.277101
9  540        0           0         0.858917  0.141083
2025-11-04 04:16:35,814 | INFO | ROC-AUC :
AUC       GINI
0.913056  0.826111
   threshold_value  tpr  fpr
0  0.040816         1.0  1.000000
1  0.081633         1.0  1.000000
2  0.102041         1.0  1.000000
3  0.122449         1.0  0.969466
4  0.163265         1.0  0.694656
5  0.183673         1.0  0.641221
6  0.142857         1.0  0.763359
7  0.061224         1.0  1.000000
8  0.020408         1.0  1.000000
9  0.000000         1.0  1.000000
2025-11-04 04:16:36,164 | INFO | Confusion Matrix :
[[98 33]
 [ 6 62]]
Generate prediction sample.
>>> preds
   automl_id  Prediction  survived  prob_0    prob_1
0  380        0           0         0.722899  0.277101
1  288        1           0         0.437518  0.562482
2  448        0           0         0.598358  0.401642
3  600        0           0         0.858917  0.141083
4  760        0           0         0.824824  0.175176
5  508        0           0         0.722899  0.277101
6  320        1           1         0.463271  0.536729
7  692        1           1         0.380221  0.619779
8  440        1           1         0.276788  0.723212
9  668        1           1         0.039959  0.960041
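The prediction log shows scaled r_age values above 1 (for example 1.215686), while the training log shows values between 0 and 1. The logged values are consistent with min-max scaling whose bounds come from the training data: a training age of 23 maps to 0.392157 and a prediction-time age of 65 maps to 1.215686 if the training minimum is 3 and the maximum is 54. These bounds are inferred from the logged numbers and are not stated by the source; the function below is an illustrative sketch, not teradataml code.

```python
def min_max_scale(value, train_min, train_max):
    """Scale with bounds learned on training data; new data may fall outside [0, 1]."""
    return (value - train_min) / (train_max - train_min)


# Assumed training bounds for age, inferred from the logged scaled values.
AGE_MIN, AGE_MAX = 3, 54

train_age_23 = min_max_scale(23, AGE_MIN, AGE_MAX)  # age seen during fit
new_age_59 = min_max_scale(59, AGE_MIN, AGE_MAX)    # age above the training maximum
new_age_65 = min_max_scale(65, AGE_MIN, AGE_MAX)    # age above the training maximum

print(f"age 23 -> {train_age_23:.6f}")
print(f"age 59 -> {new_age_59:.6f}")
print(f"age 65 -> {new_age_65:.6f}")
```

Values above 1 at prediction time are therefore expected whenever the new data contains ages beyond the range that remained after outlier handling during fit.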
Generate evaluation metrics.
>>> metrics = aml.evaluate(data=test_data, rank=3, use_loaded_models=True)
2025-11-04 04:17:51,732 | INFO | Skipping data transformation as data is already transformed.
2025-11-04 04:17:52,281 | INFO | Following model is being picked for evaluation:
2025-11-04 04:17:52,282 | INFO | Model ID : XGBOOST_30
2025-11-04 04:17:52,282 | INFO | Feature Selection Method : rfe
2025-11-04 04:17:54,779 | INFO | Performance Metrics :
SeqNum  Prediction  Mapping  CLASS_1  CLASS_2  Precision  Recall    F1        Support
0       0           CLASS_1  98       6        0.942308   0.748092  0.834043  131
1       1           CLASS_2  33       62       0.652632   0.911765  0.760736  68
--------------------------------------------------------------------------------
   SeqNum  Metric              MetricValue
0  3       Micro-Recall        0.804020
1  5       Macro-Precision     0.797470
2  6       Macro-Recall        0.829928
3  7       Macro-F1            0.797389
4  9       Weighted-Recall     0.804020
5  10      Weighted-F1         0.808993
6  8       Weighted-Precision  0.843323
7  4       Micro-F1            0.804020
8  2       Micro-Precision     0.804020
9  1       Accuracy            0.804020
>>> metrics
SeqNum  Prediction  Mapping  CLASS_1  CLASS_2  Precision  Recall    F1        Support
0       0           CLASS_1  98       6        0.942308   0.748092  0.834043  131
1       1           CLASS_2  33       62       0.652632   0.911765  0.760736  68
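The ROC-AUC entry in the prediction log reports the two values 0.913056 and 0.826111. Assuming the usual definition of the Gini coefficient for a binary classifier, Gini = 2 * AUC - 1, the pair is consistent with an AUC of 0.913056 and a Gini of 0.826111. The sketch below only checks that arithmetic; it does not reproduce how teradataml computes either value.

```python
def gini_from_auc(auc):
    """Gini coefficient under the common definition Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


logged_auc = 0.913056   # value from the ROC-AUC log entry
logged_gini = 0.826111  # value from the ROC-AUC log entry

computed_gini = gini_from_auc(logged_auc)
print(f"computed Gini: {computed_gini:.6f}")
print(f"difference from logged value: {abs(computed_gini - logged_gini):.1e}")
```

The small residual is expected because both logged values are rounded to six decimal places.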
- Remove saved models from the database by passing the table name to the API.
>>> aml.remove_saved_models(table_name="mixed_models")
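For reference, the steps of this example can be collected into one function. This is a sketch under several assumptions: a Vantage connection already exists (for example, created with create_context), the second AutoML instance is created in the same session rather than a separate one, and cleanup is done in a finally block. The function name and its parameters are hypothetical; the teradataml calls are the ones used in the steps above.

```python
def run_titanic_automl(deploy_table="top_models", top_n=10, test_rows=200, rank=2):
    """Hypothetical wrapper: fit, deploy, load, predict, evaluate, then clean up."""
    # Imported lazily so this module can be loaded without teradataml installed.
    from teradataml import AutoML, DataFrame, load_example_data

    # Load the example data and fit AutoML with the settings used in this example.
    load_example_data('teradataml', 'titanic')
    df = DataFrame('titanic')
    aml = AutoML(task_type="Classification", max_runtime_secs=360, verbose=2)
    aml.fit(df, df.survived)

    # Persist the top models, then load them through a fresh AutoML instance.
    aml.deploy(table_name=deploy_table, top_n=top_n)
    try:
        loader = AutoML()
        loader.load(table_name=deploy_table)
        test_data = df.iloc[:test_rows]
        preds = loader.predict(data=test_data, rank=rank)
        metrics = loader.evaluate(data=test_data, rank=rank)
        return preds, metrics
    finally:
        # Design choice of this sketch: always drop the saved models afterwards.
        aml.remove_saved_models(table_name=deploy_table)
```

Calling `run_titanic_automl()` requires an active database connection; drop the cleanup step if the deployed models should remain available for later sessions.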