Example 6: Run AutoClassifier for a Multiclass Classification Problem Using the Early Stopping Timer

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore (VMware, Enterprise, IntelliFlex)
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Product Category: Teradata Vantage

This example predicts the species of an iris flower based on its sepal and petal measurements.

Run AutoML to acquire the most effective model with the following specifications:
  • Set the early stopping timer to 100 seconds.
  • Include only the 'xgboost' model for training.
  • Use verbose level 2 to get detailed logs.
  • Add customization for specific AutoClassifier processes.
  1. Load data and split it to train and test datasets.
    1. Load the example data and create a teradataml DataFrame.
      >>> load_example_data("teradataml", "iris_input")
      >>> iris = DataFrame.from_table("iris_input")
    2. Perform sampling to get 80% for training and 20% for testing.
      >>> iris_sample = iris.sample(frac = [0.8, 0.2])
    3. Fetch train and test data.
      >>> iris_train = iris_sample[iris_sample['sampleid'] == 1].drop('sampleid', axis=1)
      >>> iris_test = iris_sample[iris_sample['sampleid'] == 2].drop('sampleid', axis=1)
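The `sample(frac=[0.8, 0.2])` call tags every row with a `sampleid` (1 for the 80% sample, 2 for the 20% sample), which the next two lines use to separate train and test data. A minimal local sketch of the same idea, using only the standard library and illustrative rows rather than the actual `iris_input` table:

```python
import random

def tag_samples(rows, fracs=(0.8, 0.2), seed=42):
    """Tag each row with a 1-based sampleid drawn according to fracs,
    mirroring the semantics of teradataml's sample(frac=[...])."""
    rng = random.Random(seed)
    tagged = []
    for row in rows:
        r, cum, sampleid = rng.random(), 0.0, len(fracs)
        for i, frac in enumerate(fracs, start=1):
            cum += frac
            if r < cum:
                sampleid = i
                break
        tagged.append({**row, "sampleid": sampleid})
    return tagged

# Illustrative rows, not the real iris table.
rows = [{"id": i, "petal_length": 1.0 + 0.05 * i} for i in range(100)]
tagged = tag_samples(rows)
train = [r for r in tagged if r["sampleid"] == 1]
test = [r for r in tagged if r["sampleid"] == 2]
```

With fractions summing to 1.0, every row lands in exactly one sample; the assignment is random, so the 80/20 ratio holds only in expectation.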
  2. Add customization.
    >>> AutoClassifier.generate_custom_config("custom_iris")
    Generating custom config JSON for AutoML ...
    
    Available main options for customization with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Feature Engineering Phase
    
    Index 2: Customize Data Preparation Phase
    
    Index 3: Customize Model Training Phase
    
    Index 4: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the index you want to customize:  2
    
    Customizing Data Preparation Phase ...
    
    Available options for customization of data preparation phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Data Imbalance Handling
    
    Index 2: Customize Outlier Handling
    
    Index 3: Customize Feature Scaling
    
    Index 4: Back to main menu
    
    Index 5: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in data preparation phase:  1,3
    
    Customizing Data Imbalance Handling ...
    
    Available data sampling methods with corresponding indices:
    Index 1: SMOTE
    Index 2: NearMiss
    
    Enter the corresponding index data imbalance handling method:  1
    
    Customization of data imbalance handling has been completed successfully.
    
    Available feature scaling methods with corresponding indices:
    Index 1: maxabs
    Index 2: mean
    Index 3: midrange
    Index 4: range
    Index 5: rescale
    Index 6: std
    Index 7: sum
    Index 8: ustd
    
    Enter the corresponding index feature scaling method:  6
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  1
    
    Customization of feature scaling has been completed successfully.
    
    Available options for customization of data preparation phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Data Imbalance Handling
    
    Index 2: Customize Outlier Handling
    
    Index 3: Customize Feature Scaling
    
    Index 4: Back to main menu
    
    Index 5: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in data preparation phase:  4
    
    Customization of data preparation phase has been completed successfully.
    
    Available main options for customization with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Feature Engineering Phase
    
    Index 2: Customize Data Preparation Phase
    
    Index 3: Customize Model Training Phase
    
    Index 4: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the index you want to customize:  3
    
    Customizing Model Training Phase ...
    
    Available options for customization of model training phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Model Hyperparameter
    
    Index 2: Back to main menu
    
    Index 3: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in model training phase:  1
    
    Customizing Model Hyperparameter ...
    
    Available models for hyperparameter tuning with corresponding indices:
    Index 1: decision_forest
    Index 2: xgboost
    Index 3: knn
    Index 4: glm
    Index 5: svm
    
    Available hyperparameter update methods with corresponding indices:
    Index 1: ADD
    Index 2: REPLACE
    
    Enter the list of model indices for performing hyperparameter tuning:  2
    
    Available hyperparameters for model 'xgboost' with corresponding indices:
    Index 1: min_impurity
    Index 2: max_depth
    Index 3: min_node_size
    Index 4: shrinkage_factor
    Index 5: iter_num
    
    Enter the list of hyperparameter indices for model 'xgboost':  2
    
    Enter the index of corresponding update method for hyperparameters 'max_depth' for model 'xgboost':  2
    
    Enter the list of value for hyperparameter 'max_depth' for model 'xgboost':  3, 4
    
    Customization of model hyperparameter has been completed successfully.
    
    Available options for customization of model training phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Model Hyperparameter
    
    Index 2: Back to main menu
    
    Index 3: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in model training phase:  3
    
    Customization of model training phase has been completed successfully.
    
    Process of generating custom config file for AutoML has been completed successfully.
    
    'custom_iris.json' file is generated successfully under the current working directory.
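The interactive session writes `custom_iris.json`; its contents correspond to the customization input echoed later by `fit()`. A sketch of building an equivalent file by hand with the standard `json` module (the key names below are copied from the log output in this example, not a documented schema):

```python
import json

# Keys mirror the customization input echoed by fit() in this example.
custom_config = {
    "DataImbalanceIndicator": True,
    "DataImbalanceMethod": "SMOTE",
    "FeatureScalingIndicator": True,
    "FeatureScalingParam": {
        "FeatureScalingMethod": "std",
        "volatile": True,
    },
    "HyperparameterTuningIndicator": True,
    "HyperparameterTuningParam": {
        "xgboost": {
            "max_depth": {"Value": [3, 4], "Method": "REPLACE"},
        }
    },
}

with open("custom_iris.json", "w") as f:
    json.dump(custom_config, f, indent=4)

# Round-trip check: the file parses back to the same structure.
with open("custom_iris.json") as f:
    loaded = json.load(f)
```

Using `generate_custom_config()` interactively, as shown above, is the supported path; this sketch only illustrates what the resulting file contains.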
  3. Create an AutoML instance.
    >>> aml = AutoClassifier(include=['xgboost'],
    >>>                      verbose=2,
    >>>                      max_runtime_secs=100,
    >>>                      custom_config_file='custom_iris.json')
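`max_runtime_secs=100` bounds the total search time rather than the number of candidate models. A minimal sketch of such a time-budgeted loop, assuming a hypothetical `evaluate_candidate` function in place of actual model training:

```python
import time

def timed_search(candidates, max_runtime_secs, evaluate):
    """Evaluate candidates until the time budget is exhausted,
    keeping the best (highest-scoring) result seen so far."""
    deadline = time.monotonic() + max_runtime_secs
    best, tried = None, 0
    for cand in candidates:
        if time.monotonic() >= deadline:
            break  # early stopping: budget spent
        score = evaluate(cand)
        tried += 1
        if best is None or score > best[1]:
            best = (cand, score)
    return best, tried

# Hypothetical stand-in for training: score peaks at max_depth == 4.
def evaluate_candidate(params):
    time.sleep(0.001)  # simulate training cost
    return 1.0 - abs(params["max_depth"] - 4) * 0.1

grid = [{"max_depth": d} for d in (2, 3, 4, 5, 6)]
best, tried = timed_search(grid, max_runtime_secs=1.0, evaluate=evaluate_candidate)
```

AutoClassifier's internal scheduling is more involved; this only illustrates the budget-then-stop behavior of the early stopping timer.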
  4. Fit training data.
    >>> aml.fit(iris_train, iris_train.species)
    2025-11-04 03:31:31,896 | INFO     | Received below input for customization :
    {
        "DataImbalanceIndicator": true,
        "DataImbalanceMethod": "SMOTE",
        "FeatureScalingIndicator": true,
        "FeatureScalingParam": {
            "FeatureScalingMethod": "std",
            "volatile": true
        },
        "HyperparameterTuningIndicator": true,
        "HyperparameterTuningParam": {
            "xgboost": {
                "max_depth": {
                    "Value": [
                        3,
                        4
                    ],
                    "Method": "REPLACE"
                }
            }
        }
    }
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 03:31:31,896 | INFO     | Feature Exploration started
    2025-11-04 03:31:31,896 | INFO     | Data Overview:
    2025-11-04 03:31:31,938 | INFO     | Total Rows in the data: 120
    2025-11-04 03:31:31,960 | INFO     | Total Columns in the data: 6
    2025-11-04 03:31:32,678 | INFO     | Column Summary:
         ColumnName Datatype  NonNullCount  NullCount BlankCount  ZeroCount  PositiveCount  NegativeCount  NullPercentage  NonNullPercentage
    0  petal_length    FLOAT           120          0       None          0            120              0             0.0              100.0
    1   petal_width    FLOAT           120          0       None          0            120              0             0.0              100.0
    2            id  INTEGER           120          0       None          0            120              0             0.0              100.0
    3       species  INTEGER           120          0       None          0            120              0             0.0              100.0
    4   sepal_width    FLOAT           120          0       None          0            120              0             0.0              100.0
    5  sepal_length    FLOAT           120          0       None          0            120              0             0.0              100.0
    2025-11-04 03:31:33,426 | INFO     | Statistics of Data:
          ATTRIBUTE StatName  StatValue
    0   petal_width  MAXIMUM        2.5
    1            id  MINIMUM        1.0
    2            id  MAXIMUM      150.0
    3  sepal_length    COUNT      120.0
    4  sepal_length  MAXIMUM        7.9
    5  petal_length    COUNT      120.0
    6  petal_length  MINIMUM        1.0
    7  petal_length  MAXIMUM        6.9
    8  sepal_length  MINIMUM        4.4
    9            id    COUNT      120.0
    2025-11-04 03:31:36,494 | INFO     | Columns with outlier percentage :-
        ColumnName  OutlierPercentage
    0  sepal_width           3.333333
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 03:31:36,793 | INFO     | Feature Engineering started ...
    2025-11-04 03:31:36,793 | INFO     | Handling duplicate records present in dataset ...
    2025-11-04 03:31:36,931 | INFO     | Analysis completed. No action taken.
    2025-11-04 03:31:36,931 | INFO     | Total time to handle duplicate records: 0.14 sec
    2025-11-04 03:31:36,931 | INFO     | Starting customized anti-select columns ...
    2025-11-04 03:31:36,931 | INFO     | Skipping customized anti-select columns.
    2025-11-04 03:31:36,932 | INFO     | Handling less significant features from data ...
    2025-11-04 03:31:37,642 | INFO     | Total time to handle less significant features: 0.71 sec
    2025-11-04 03:31:37,642 | INFO     | Handling Date Features ...
    2025-11-04 03:31:37,642 | INFO     | Analysis Completed. Dataset does not contain any feature related to dates. No action needed.
    2025-11-04 03:31:37,642 | INFO     | Total time to handle date features: 0.00 sec
    2025-11-04 03:31:37,642 | INFO     | Proceeding with default option for missing value imputation.
    2025-11-04 03:31:37,642 | INFO     | Proceeding with default option for handling remaining missing values.
    2025-11-04 03:31:37,642 | INFO     | Checking Missing values in dataset ...
    2025-11-04 03:31:39,174 | INFO     | Analysis Completed. No Missing Values Detected.
    2025-11-04 03:31:39,174 | INFO     | Total time to find missing values in data: 1.53 sec
    2025-11-04 03:31:39,174 | INFO     | Imputing Missing Values ...
    2025-11-04 03:31:39,174 | INFO     | Analysis completed. No imputation required.
    2025-11-04 03:31:39,174 | INFO     | Time taken to perform imputation: 0.00 sec
    2025-11-04 03:31:39,175 | INFO     | No information provided for Variable-Width Transformation.
    2025-11-04 03:31:39,175 | INFO     | Skipping customized string manipulation.
    2025-11-04 03:31:39,175 | INFO     | Starting Customized Categorical Feature Encoding ...
    2025-11-04 03:31:39,175 | INFO     | AutoML will proceed with default encoding technique.
    2025-11-04 03:31:39,175 | INFO     | Performing encoding for categorical columns ...
    2025-11-04 03:31:39,521 | INFO     | Analysis completed. No categorical columns were found.
    2025-11-04 03:31:39,521 | INFO     | Time taken to encode the columns: 0.35 sec
    2025-11-04 03:31:39,521 | INFO     | Starting customized mathematical transformation ...
    2025-11-04 03:31:39,521 | INFO     | Skipping customized mathematical transformation.
    2025-11-04 03:31:39,522 | INFO     | Starting customized non-linear transformation ...
    2025-11-04 03:31:39,522 | INFO     | Skipping customized non-linear transformation.
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 03:31:39,522 | INFO     | Data preparation started ...
    2025-11-04 03:31:39,522 | INFO     | Starting customized outlier processing ...
    2025-11-04 03:31:39,522 | INFO     | No information provided for customized outlier processing. AutoML will proceed with default settings.
    2025-11-04 03:31:39,522 | INFO     | Outlier preprocessing ...
    2025-11-04 03:31:42,708 | INFO     | Columns with outlier percentage :-
        ColumnName  OutlierPercentage
    0  sepal_width           3.333333
    2025-11-04 03:31:43,301 | INFO     | Deleting rows of these columns:
    ['sepal_width']
    2025-11-04 03:31:45,481 | INFO     | Sample of dataset after removing outlier rows:
         sepal_length  sepal_width  petal_length  petal_width  species  automl_id
    id
    99            5.1          2.5           3.0          1.1        2         15
    97            5.7          2.9           4.2          1.3        2         23
    15            5.8          4.0           1.2          0.2        1         27
    53            6.9          3.1           4.9          1.5        2         31
    30            4.7          3.2           1.6          0.2        1         39
    91            5.5          2.6           4.4          1.2        2         43
    114           5.7          2.5           5.0          2.0        3         35
    36            5.0          3.2           1.2          0.2        1         19
    59            6.6          2.9           4.6          1.3        2         11
    19            5.7          3.8           1.7          0.3        1          7
    116 rows X 7 columns
    2025-11-04 03:31:45,592 | INFO     | Time Taken by Outlier processing: 6.07 sec
    2025-11-04 03:31:45,592 | INFO     | Checking imbalance data ...
    2025-11-04 03:31:45,674 | INFO     | Imbalance Not Found.
    2025-11-04 03:31:46,414 | INFO     | Feature selection using rfe ...
    2025-11-04 03:31:53,335 | INFO     | feature selected by RFE:
    ['id', 'petal_length']
    2025-11-04 03:31:53,337 | INFO     | Total time taken by feature selection: 6.92 sec
    2025-11-04 03:31:53,616 | INFO     | Scaling Features of rfe data ...
    2025-11-04 03:31:54,390 | INFO     | columns that will be scaled:
    ['r_id', 'r_petal_length']
    2025-11-04 03:31:55,926 | INFO     | Dataset sample after scaling:
       automl_id  species      r_id  r_petal_length
    0          7        1 -1.301357       -1.165924
    1          9        1 -1.346574       -1.391996
    2         10        3  0.982108        0.699165
    3         11        2 -0.397014        0.473093
    4         13        3  1.411671        0.586129
    5         14        2 -0.442231        0.529611
    6         12        2 -0.012669        0.360058
    7          8        1 -0.871794       -1.335478
    8          6        2  0.077766       -0.148603
    9          5        3  1.456888        1.038272
    116 rows X 4 columns
    2025-11-04 03:31:56,391 | INFO     | Total time taken by feature scaling: 2.77 sec
    2025-11-04 03:31:56,391 | INFO     | Scaling Features of pca data ...
    2025-11-04 03:31:56,889 | INFO     | columns that will be scaled:
    ['id', 'sepal_length', 'sepal_width', 'petal_length', 'petal_width']
    2025-11-04 03:31:58,567 | INFO     | Dataset sample after scaling:
       automl_id  species        id  sepal_length  sepal_width  petal_length  petal_width
    0         13        3  1.411671      0.174687    -0.099951      0.586129     0.802926
    1         10        3  0.982108      0.174687    -2.032346      0.699165     0.404894
    2         14        2 -0.442231      0.524062     0.624696      0.529611     0.537572
    3          8        1 -0.871794     -1.106352     1.349344     -1.335478    -1.452587
    4         16        2  0.371677     -0.058229    -1.066149      0.133986     0.006863
    5          7        1 -1.301357     -0.174687     1.832443     -1.165924    -1.187233
    6         11        2 -0.397014      0.873436    -0.341501      0.473093     0.139540
    7         15        2  0.507328     -0.873436    -1.307698     -0.431192    -0.125815
    8         12        2 -0.012669      0.873436    -0.099951      0.360058     0.272217
    9          6        2  0.077766     -0.174687    -1.066149     -0.148603    -0.258492
    116 rows X 7 columns
    2025-11-04 03:31:59,114 | INFO     | Total time taken by feature scaling: 2.72 sec
    2025-11-04 03:31:59,114 | INFO     | Dimension Reduction using pca ...
    2025-11-04 03:31:59,749 | INFO     | PCA columns:
    ['col_0', 'col_1', 'col_2']
    2025-11-04 03:31:59,749 | INFO     | Total time taken by PCA: 0.63 sec
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 03:32:00,105 | INFO     | Model Training started ...
    2025-11-04 03:32:00,149 | INFO     | Starting customized hyperparameter update ...
    2025-11-04 03:32:00,149 | INFO     | Completed customized hyperparameter update.
    2025-11-04 03:32:00,149 | INFO     | Hyperparameters used for model training:
    2025-11-04 03:32:00,149 | INFO     | Model: xgboost
    2025-11-04 03:32:00,149 | INFO     | Hyperparameters: {'response_column': 'species', 'name': 'xgboost', 'model_type': 'Classification', 'column_sampling': (1, 0.6), 'min_impurity': (0.0, 0.1), 'lambda1': (1.0, 0.001, 0.01), 'shrinkage_factor': (0.5, 0.1, 0.2), 'max_depth': (3, 4), 'min_node_size': (1, 2), 'iter_num': (10, 20), 'num_boosted_trees': (-1, 2, 5), 'seed': 42}
    2025-11-04 03:32:00,149 | INFO     | Total number of models for xgboost: 864
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 03:32:00,149 | INFO     | Performing hyperparameter tuning ...
    2025-11-04 03:32:01,462 | INFO     | Model training for xgboost
    2025-11-04 03:32:12,470 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 03:32:12,472 | INFO     | Leaderboard
       RANK   MODEL_ID FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
    0     1  XGBOOST_0               rfe       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    1     2  XGBOOST_1               pca       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    2     3  XGBOOST_2               rfe       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    3     4  XGBOOST_3               pca       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    [4 rows x 13 columns]
    4 rows X 13 columns
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    >>> Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 17/17
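The training log reports "Total number of models for xgboost: 864". That figure is simply the size of the grid formed by the tuple-valued hyperparameters listed in the log (scalar entries such as `response_column` and `seed` contribute a factor of 1), and can be reproduced directly:

```python
from math import prod

# Tuple-valued hyperparameters from the training log.
search_space = {
    "column_sampling": (1, 0.6),
    "min_impurity": (0.0, 0.1),
    "lambda1": (1.0, 0.001, 0.01),
    "shrinkage_factor": (0.5, 0.1, 0.2),
    "max_depth": (3, 4),          # customized via custom_iris.json
    "min_node_size": (1, 2),
    "iter_num": (10, 20),
    "num_boosted_trees": (-1, 2, 5),
}

# 2 * 2 * 3 * 3 * 2 * 2 * 2 * 3 = 864 candidate models
n_models = prod(len(v) for v in search_space.values())
```

The early stopping timer decides how many of these candidates are actually trained before the budget runs out.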
  5. Display model leaderboard.
    >>> aml.leaderboard()
       RANK   MODEL_ID FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
    0     1  XGBOOST_0               rfe       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    1     2  XGBOOST_1               pca       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    2     3  XGBOOST_2               rfe       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    3     4  XGBOOST_3               pca       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    [4 rows x 13 columns]
  6. Display the best performing model.
    >>> aml.leader()
       RANK   MODEL_ID FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
    0     1  XGBOOST_0               rfe       1.0              1.0  ...           1.0       1.0                 1.0              1.0          1.0
    [1 rows x 13 columns]
  7. Display model hyperparameters for the trained models.
    >>> aml.model_hyperparameters(rank=1)
    {'response_column': 'species', 
      'name': 'xgboost', 
      'model_type': 'Classification', 
      'column_sampling': 1, 
      'min_impurity': 0.0, 
      'lambda1': 1.0, 
      'shrinkage_factor': 0.5, 
      'max_depth': 3, 
      'min_node_size': 1, 
      'iter_num': 10, 
      'num_boosted_trees': -1, 
      'seed': 42, 
      'persist': False, 
      'output_prob': True, 
      'output_responses': ['1', '2', '3']}
    
    >>> aml.model_hyperparameters(rank=4)
    {'response_column': 'species', 
      'name': 'xgboost', 
      'model_type': 'Classification', 
      'column_sampling': 1, 
      'min_impurity': 0.0, 
      'lambda1': 1.0, 
      'shrinkage_factor': 0.5, 
      'max_depth': 3, 
      'min_node_size': 1, 
      'iter_num': 10, 
      'num_boosted_trees': 2, 
      'seed': 42, 
      'persist': False, 
      'output_prob': True, 
      'output_responses': ['1', '2', '3']}
    
  8. Generate predictions on the test dataset using the best performing model.
    >>> prediction = aml.predict(iris_test)
    2025-11-04 03:35:18,195 | INFO     | Data Transformation started ...
    2025-11-04 03:35:18,195 | INFO     | Performing transformation carried out in feature engineering phase ...
    2025-11-04 03:35:18,947 | INFO     | Updated dataset after performing target column transformation :
        id  sepal_length  sepal_width  petal_length  petal_width  species  automl_id
    0   74           6.1          2.8           4.7          1.2        2         15
    1   62           5.9          3.0           4.2          1.5        2         10
    2   37           5.5          3.5           1.3          0.2        1         14
    3  101           6.3          3.3           6.0          2.5        3          5
    4  106           7.6          3.0           6.6          2.1        3         13
    5   78           6.7          3.0           5.0          1.7        2          4
    6  116           6.4          3.2           5.3          2.3        3          8
    7   43           4.4          3.2           1.3          0.2        1         12
    8   66           6.7          3.1           4.4          1.4        2          9
    9   40           5.1          3.4           1.5          0.2        1          6
    30 rows X 7 columns
    2025-11-04 03:35:19,802 | INFO     | Performing transformation carried out in data preparation phase ...
    2025-11-04 03:35:20,930 | INFO     | Updated dataset after performing RFE feature selection:
                id  petal_length  species
    automl_id
    13         106           6.6        3
    5          101           6.0        3
    24          18           1.4        1
    7          122           4.9        3
    12          43           1.3        1
    19         117           5.5        3
    15          74           4.7        2
    30          67           4.5        2
    22          92           4.6        2
    26         149           5.4        3
    30 rows X 4 columns
    2025-11-04 03:35:21,953 | INFO     | Updated dataset after performing scaling on RFE selected features :
       automl_id  species      r_id  r_petal_length
    0         13        3  0.665588        1.603450
    1         15        2 -0.057886        0.529611
    2         30        2 -0.216146        0.416576
    3          7        3  1.027325        0.642647
    4         12        1 -0.758751       -1.391996
    5         26        3  1.637756        0.925236
    6          5        3  0.552545        1.264343
    7         24        1 -1.323965       -1.335478
    8         22        2  0.349068        0.473093
    9         19        3  0.914282        0.981754
    30 rows X 4 columns
    2025-11-04 03:35:23,628 | INFO     | Updated dataset after performing scaling for PCA feature selection :
       automl_id  species        id  sepal_length  sepal_width  petal_length  petal_width
    0         13        3  0.665588      2.038017    -0.099951      1.603450     1.200958
    1         15        2 -0.057886      0.291145    -0.583050      0.529611     0.006863
    2         30        2 -0.216146     -0.291145    -0.099951      0.416576     0.404894
    3          7        3  1.027325     -0.291145    -0.583050      0.642647     1.068281
    4         12        1 -0.758751     -1.688643     0.383147     -1.391996    -1.319910
    5         26        3  1.637756      0.407603     0.866246      0.925236     1.466313
    6          5        3  0.552545      0.524062     0.624696      1.264343     1.731667
    7         24        1 -1.323965     -0.873436     1.107795     -1.335478    -1.187233
    8         22        2  0.349068      0.291145    -0.099951      0.473093     0.272217
    9         19        3  0.914282      0.756978    -0.099951      0.981754     0.802926
    30 rows X 7 columns
    2025-11-04 03:35:24,048 | INFO     | Updated dataset after performing PCA feature selection :
       automl_id     col_0     col_1     col_2  species
    0         26  1.974797  1.035059  1.242515        3
    1         17  0.480275 -0.147331 -0.514093        2
    2          7  1.362878 -0.583179  0.807251        3
    3         19  1.706582  0.268332  0.131765        3
    4          5  1.866732  0.932092  0.324474        3
    5         34  1.928278  0.322988 -0.003286        3
    6         22  0.699742  0.048328  0.010950        2
    7         15  0.512768 -0.395743 -0.464033        2
    8         24 -2.565497  0.582561 -0.071086        1
    9         13  2.679685  0.832049 -0.857995        3
    10 rows X 5 columns
    2025-11-04 03:35:24,336 | INFO     | Data Transformation completed.
    2025-11-04 03:35:24,880 | INFO     | Following model is being picked for evaluation:
    2025-11-04 03:35:24,880 | INFO     | Model ID : XGBOOST_0
    2025-11-04 03:35:24,880 | INFO     | Feature Selection Method : rfe
    2025-11-04 03:35:25,406 | INFO     | Applying SHAP for Model Interpretation...
    2025-11-04 03:35:28,354 | INFO     | SHAP Analysis Completed. Feature Importance Available.
    /root/automl_testing/pyTeradata/teradataml/automl/model_evaluation.py:380: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
      plt.show()
    2025-11-04 03:35:28,475 | INFO     | Prediction :
       automl_id  Prediction  species    prob_1    prob_2    prob_3
    0         30           2        2  0.007360  0.985279  0.007362
    1          5           3        3  0.006796  0.006882  0.986322
    2         24           1        1  0.985980  0.007056  0.006964
    3         17           2        2  0.007360  0.985279  0.007362
    4         13           3        3  0.006796  0.006882  0.986322
    5          7           3        3  0.006796  0.006882  0.986322
    6         22           2        2  0.007360  0.985279  0.007362
    7         12           1        1  0.985980  0.007056  0.006964
    8         34           3        3  0.006796  0.006882  0.986322
    9         26           3        3  0.006796  0.006882  0.986322
    2025-11-04 03:35:29,400 | INFO     | Confusion Matrix :
     [[ 8  0  0]
      [ 0 12  0]
      [ 0  0 10]]
    >>> prediction.head()
       automl_id  Prediction  species    prob_1    prob_2    prob_3
    0         13           3        3  0.006796  0.006882  0.986322
    1          5           3        3  0.006796  0.006882  0.986322
    2         24           1        1  0.985980  0.007056  0.006964
    3         19           3        3  0.006796  0.006882  0.986322
    4         30           2        2  0.007360  0.985279  0.007362
    5          7           3        3  0.006796  0.006882  0.986322
    6         22           2        2  0.007360  0.985279  0.007362
    7         12           1        1  0.985980  0.007056  0.006964
    8         15           2        2  0.007360  0.985279  0.007362
    9         26           3        3  0.006796  0.006882  0.986322
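Each row of the prediction output pairs the predicted label with the true `species`, so accuracy on the displayed sample can be checked row by row. A sketch over the ten (Prediction, species) pairs shown above:

```python
# (Prediction, species) pairs from the prediction.head() output above.
rows = [
    (3, 3), (3, 3), (1, 1), (3, 3), (2, 2),
    (3, 3), (2, 2), (1, 1), (2, 2), (3, 3),
]
correct = sum(pred == actual for pred, actual in rows)
accuracy = correct / len(rows)
```

Here every displayed prediction matches its label, consistent with the perfect scores in the leaderboard.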
  9. Generate evaluation metrics on the test dataset using the best performing model.
    >>> performance_metrics = aml.evaluate(iris_test)
    2025-11-04 03:40:14,901 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 03:40:15,431 | INFO     | Following model is being picked for evaluation:
    2025-11-04 03:40:15,431 | INFO     | Model ID : XGBOOST_0
    2025-11-04 03:40:15,431 | INFO     | Feature Selection Method : rfe
    2025-11-04 03:40:18,047 | INFO     | Performance Metrics :
           Prediction  Mapping  CLASS_1  CLASS_2  CLASS_3  Precision  Recall   F1  Support
    SeqNum
    0               1  CLASS_1        8        0        0        1.0     1.0  1.0        8
    2               3  CLASS_3        0        0       10        1.0     1.0  1.0       10
    1               2  CLASS_2        0       12        0        1.0     1.0  1.0       12
    --------------------------------------------------------------------------------
       SeqNum              Metric  MetricValue
    0       3        Micro-Recall          1.0
    1       5     Macro-Precision          1.0
    2       6        Macro-Recall          1.0
    3       7            Macro-F1          1.0
    4       9     Weighted-Recall          1.0
    5      10         Weighted-F1          1.0
    6       8  Weighted-Precision          1.0
    7       4            Micro-F1          1.0
    8       2     Micro-Precision          1.0
    9       1            Accuracy          1.0
    >>> performance_metrics
           Prediction  Mapping  CLASS_1  CLASS_2  CLASS_3  Precision  Recall   F1  Support
    SeqNum
    0               1  CLASS_1        8        0        0        1.0     1.0  1.0        8
    2               3  CLASS_3        0        0       10        1.0     1.0  1.0       10
    1               2  CLASS_2        0       12        0        1.0     1.0  1.0       12
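    The per-class Precision, Recall, and F1 values in the table above follow directly from the class-count matrix (each row is a predicted class; the CLASS_* columns hold the actual counts, and Support is the row total). A minimal sketch of that arithmetic, recomputed locally with NumPy from the counts reported above:

    ```python
    import numpy as np

    # Class-count matrix from the evaluation output above, ordered by class:
    # rows = predicted class, columns = actual class.
    cm = np.array([[8,  0,  0],
                   [0, 12,  0],
                   [0,  0, 10]])

    tp = np.diag(cm).astype(float)       # correctly classified rows per class
    precision = tp / cm.sum(axis=1)      # row sums = counts predicted as each class
    recall = tp / cm.sum(axis=0)         # column sums = actual counts per class
    f1 = 2 * precision * recall / (precision + recall)

    print(precision)  # [1. 1. 1.]
    print(recall)     # [1. 1. 1.]
    print(f1)         # [1. 1. 1.]
    ```

    With no off-diagonal counts, every class scores a perfect 1.0, matching the table.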
  10. Generate predictions on the test dataset using the second best performing model, by passing rank 2.
    >>> prediction = aml.predict(iris_test, 2)
    2025-11-04 03:54:52,175 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 03:54:52,719 | INFO     | Following model is being picked for evaluation:
    2025-11-04 03:54:52,720 | INFO     | Model ID : XGBOOST_1
    2025-11-04 03:54:52,720 | INFO     | Feature Selection Method : pca
    2025-11-04 03:54:53,246 | INFO     | Applying SHAP for Model Interpretation...
    2025-11-04 03:54:55,151 | INFO     | SHAP Analysis Completed. Feature Importance Available.
    2025-11-04 03:54:55,260 | INFO     | Prediction :
       automl_id  Prediction  species    prob_1    prob_2    prob_3
    0          7           3        3  0.031410  0.016004  0.952586
    1          5           3        3  0.007274  0.007357  0.985369
    2         34           3        3  0.007274  0.007357  0.985369
    3         22           2        2  0.007344  0.984609  0.008048
    4         24           1        1  0.986968  0.007161  0.005871
    5         13           3        3  0.007274  0.007357  0.985369
    6         15           2        2  0.007358  0.986465  0.006177
    7         19           3        3  0.007274  0.007357  0.985369
    8         17           2        2  0.007358  0.986465  0.006177
    9         26           3        3  0.045905  0.039178  0.914917
    2025-11-04 03:54:55,596 | INFO     | Confusion Matrix :
    [[ 8  0  0]
     [ 0 12  0]
     [ 0  0 10]]
    >>> prediction.head()
       automl_id  Prediction  species    prob_1    prob_2    prob_3
    0          7           3        3  0.031410  0.016004  0.952586
    1          5           3        3  0.007274  0.007357  0.985369
    2         34           3        3  0.007274  0.007357  0.985369
    3         22           2        2  0.007344  0.984609  0.008048
    4         24           1        1  0.986968  0.007161  0.005871
    5         13           3        3  0.007274  0.007357  0.985369
    6         15           2        2  0.007358  0.986465  0.006177
    7         19           3        3  0.007274  0.007357  0.985369
    8         17           2        2  0.007358  0.986465  0.006177
    9         26           3        3  0.045905  0.039178  0.914917
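    In the prediction output, the Prediction column is simply the class with the largest probability among prob_1, prob_2, and prob_3. A small local check of that relationship, using a few rows copied from the output above into a pandas DataFrame (in teradataml the result lives on Vantage and would be fetched with to_pandas()):

    ```python
    import pandas as pd

    # Three rows transcribed from the prediction output above.
    df = pd.DataFrame({
        "Prediction": [3, 2, 1],
        "prob_1": [0.031410, 0.007344, 0.986968],
        "prob_2": [0.016004, 0.984609, 0.007161],
        "prob_3": [0.952586, 0.008048, 0.005871],
    })

    # The predicted label is the 1-based index of the largest class probability.
    argmax_label = df[["prob_1", "prob_2", "prob_3"]].values.argmax(axis=1) + 1
    print((argmax_label == df["Prediction"]).all())  # True
    ```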
  11. Generate evaluation metrics on the test dataset using the second best performing model, by passing rank 2.
    >>> performance_metrics = aml.evaluate(iris_test, 2)
    2025-11-04 03:55:32,151 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 03:55:32,702 | INFO     | Following model is being picked for evaluation:
    2025-11-04 03:55:32,703 | INFO     | Model ID : XGBOOST_1
    2025-11-04 03:55:32,703 | INFO     | Feature Selection Method : pca
    2025-11-04 03:55:35,199 | INFO     | Performance Metrics :
           Prediction  Mapping  CLASS_1  CLASS_2  CLASS_3  Precision  Recall   F1  Support
    SeqNum
    0               1  CLASS_1        8        0        0        1.0     1.0  1.0        8
    2               3  CLASS_3        0        0       10        1.0     1.0  1.0       10
    1               2  CLASS_2        0       12        0        1.0     1.0  1.0       12
    --------------------------------------------------------------------------------
       SeqNum              Metric  MetricValue
    0       3        Micro-Recall          1.0
    1       5     Macro-Precision          1.0
    2       6        Macro-Recall          1.0
    3       7            Macro-F1          1.0
    4       9     Weighted-Recall          1.0
    5      10         Weighted-F1          1.0
    6       8  Weighted-Precision          1.0
    7       4            Micro-F1          1.0
    8       2     Micro-Precision          1.0
    9       1            Accuracy          1.0
    >>> performance_metrics
           Prediction  Mapping  CLASS_1  CLASS_2  CLASS_3  Precision  Recall   F1  Support
    SeqNum
    0               1  CLASS_1        8        0        0        1.0     1.0  1.0        8
    2               3  CLASS_3        0        0       10        1.0     1.0  1.0       10
    1               2  CLASS_2        0       12        0        1.0     1.0  1.0       12
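    For reference, the aggregate rows in the metrics output reduce to simple averages of the per-class values: Macro metrics are unweighted means, Weighted metrics are means weighted by Support, and Micro metrics pool the counts across classes before dividing. A sketch of the three averaging schemes, using the per-class precision and Support values from the tables above:

    ```python
    # Per-class values and Support taken from the metrics tables above.
    per_class_precision = [1.0, 1.0, 1.0]
    support = [8, 12, 10]          # 8 + 12 + 10 = 30 test rows
    true_positives = [8, 12, 10]   # diagonal of the class-count table

    # Macro: unweighted mean of the per-class values.
    macro_precision = sum(per_class_precision) / len(per_class_precision)

    # Weighted: per-class values weighted by Support.
    weighted_precision = (sum(p * s for p, s in zip(per_class_precision, support))
                          / sum(support))

    # Micro: pool true positives and totals across classes before dividing.
    micro_precision = sum(true_positives) / sum(support)

    print(macro_precision, weighted_precision, micro_precision)  # 1.0 1.0 1.0
    ```

    Because every per-class value is 1.0 here, all three averaging schemes agree; they diverge only when class performance is uneven and the classes are imbalanced.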