Example 1: Run AutoML for Regression Problem with Early Stopping Timer and Metrics Threshold

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Product Category: Teradata Vantage
Release Number: 20.00
Published: March 2025
Last Edition: 2026-01-07
Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex

This example predicts the price of a house based on several factors.

Run AutoML to get the best performing model with the following specifications:
  • Set the early stopping criteria: a time limit of 100 seconds and an R2 performance metric threshold of 0.7.
  • Exclude the 'knn', 'glm', and 'svm' models from the default model training list.
  • Set the verbose level to 2 for detailed logging.
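
The two early stopping criteria listed above can be pictured with a small, generic sketch. This is only a conceptual illustration in plain Python and not how teradataml implements `max_runtime_secs`, `stopping_metric`, or `stopping_tolerance` internally. The helper `search_with_early_stopping`, the model names, and the scores are all hypothetical.

```python
import time


def search_with_early_stopping(candidates, evaluate, max_runtime_secs, threshold):
    """Evaluate candidates in order; stop on a time budget or a metric threshold.

    Hypothetical helper for illustration only, not part of teradataml.
    Returns (best_candidate, best_score, reason).
    """
    start = time.monotonic()
    best, best_score = None, float("-inf")
    for cand in candidates:
        # Criterion 1: stop once the time budget is used up.
        if time.monotonic() - start >= max_runtime_secs:
            return best, best_score, "time limit reached"
        score = evaluate(cand)
        if score > best_score:
            best, best_score = cand, score
        # Criterion 2: stop once the metric reaches the threshold.
        if best_score >= threshold:
            return best, best_score, "metric threshold reached"
    return best, best_score, "all candidates evaluated"


# Toy usage with made-up R2 scores keyed by made-up model names.
toy_scores = {"model_a": 0.55, "model_b": 0.72, "model_c": 0.90}
result = search_with_early_stopping(
    candidates=list(toy_scores),
    evaluate=toy_scores.get,
    max_runtime_secs=100,
    threshold=0.7,
)
print(result)
```

In this toy run the search stops at `model_b`, the first candidate whose score reaches 0.7, and never evaluates `model_c`. Whichever criterion is met first ends the search.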
  1. Load the example dataset.
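    >>> # Assumes a Vantage connection has already been established (for example, with create_context).
    >>> from teradataml import AutoML, DataFrame, load_example_data
    >>> load_example_data("decisionforestpredict", ["housing_train", "housing_test"])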
    >>> housing_train = DataFrame.from_table("housing_train")
    >>> housing_test = DataFrame.from_table("housing_test")
  2. Create an AutoML instance.
    >>> aml = AutoML(task_type="Regression",
                     exclude=['knn', 'glm', 'svm'],
                     verbose=2,
                     max_runtime_secs=100,
                     stopping_metric='R2',
                     stopping_tolerance=0.7,
                     seed=42)
  3. Fit the data.
    >>> aml.fit(housing_train, housing_train.price)
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:32:53,728 | INFO     | Feature Exploration started
    2025-11-04 01:32:53,729 | INFO     | Data Overview:
    2025-11-04 01:32:53,750 | INFO     | Total Rows in the data: 492
    2025-11-04 01:32:53,772 | INFO     | Total Columns in the data: 14
    2025-11-04 01:32:54,383 | INFO     | Column Summary:
       ColumnName                         Datatype  NonNullCount  NullCount  BlankCount  ZeroCount  PositiveCount  NegativeCount  NullPercentage  NonNullPercentage
    0    prefarea  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    1     lotsize                            FLOAT           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    2    bedrooms                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    3     stories                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    4          sn                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    5    driveway  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    6       gashw  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    7       airco  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    8    garagepl                          INTEGER           492          0         NaN      270.0          222.0            0.0             0.0              100.0
    9   homestyle  VARCHAR(20) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    10   fullbase  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    11      price                            FLOAT           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    12    recroom  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    13    bathrms                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    2025-11-04 01:32:55,156 | INFO     | Statistics of Data:
      ATTRIBUTE StatName  StatValue
    0   bathrms  MAXIMUM        4.0
    1  bedrooms  MINIMUM        1.0
    2  bedrooms  MAXIMUM        6.0
    3        sn    COUNT      492.0
    4        sn  MAXIMUM      546.0
    5  garagepl    COUNT      492.0
    6  garagepl  MINIMUM        0.0
    7  garagepl  MAXIMUM        3.0
    8        sn  MINIMUM        1.0
    9  bedrooms    COUNT      492.0
    2025-11-04 01:32:55,306 | INFO     | Categorical Columns with their Distinct values:
    ColumnName                DistinctValueCount
    driveway                  2
    recroom                   2
    fullbase                  2
    gashw                     2
    airco                     2
    prefarea                  2
    homestyle                 3
    2025-11-04 01:32:58,031 | INFO     | No Futile columns found.
    2025-11-04 01:33:00,943 | INFO     | Columns with outlier percentage :-
      ColumnName  OutlierPercentage
    0    bathrms           0.203252
    1    stories           7.113821
    2   garagepl           2.235772
    3   bedrooms           2.235772
    4    lotsize           2.235772
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:33:01,171 | INFO     | Feature Engineering started ...
    2025-11-04 01:33:01,171 | INFO     | Handling duplicate records present in dataset ...
    2025-11-04 01:33:01,350 | INFO     | Analysis completed. No action taken.
    2025-11-04 01:33:01,350 | INFO     | Total time to handle duplicate records: 0.18 sec
    2025-11-04 01:33:01,351 | INFO     | Handling less significant features from data ...
    2025-11-04 01:33:04,939 | INFO     | Analysis indicates all categorical columns are significant. No action Needed.
    2025-11-04 01:33:04,940 | INFO     | Total time to handle less significant features: 3.59 sec
    2025-11-04 01:33:04,940 | INFO     | Handling Date Features ...
    2025-11-04 01:33:04,940 | INFO     | Analysis Completed. Dataset does not contain any feature related to dates. No action needed.
    2025-11-04 01:33:04,940 | INFO     | Total time to handle date features: 0.00 sec
    2025-11-04 01:33:04,940 | INFO     | Checking Missing values in dataset ...
    2025-11-04 01:33:06,795 | INFO     | Analysis Completed. No Missing Values Detected.
    2025-11-04 01:33:06,796 | INFO     | Total time to find missing values in data: 1.86 sec
    2025-11-04 01:33:06,796 | INFO     | Imputing Missing Values ...
    2025-11-04 01:33:06,796 | INFO     | Analysis completed. No imputation required.
    2025-11-04 01:33:06,796 | INFO     | Time taken to perform imputation: 0.00 sec
    2025-11-04 01:33:06,797 | INFO     | Performing encoding for categorical columns ...
    2025-11-04 01:33:10,662 | INFO     | ONE HOT Encoding these Columns:
    ['driveway', 'recroom', 'fullbase', 'gashw', 'airco', 'prefarea', 'homestyle']
    2025-11-04 01:33:10,663 | INFO     | Sample of dataset after performing one hot encoding:
            price  lotsize  bedrooms  bathrms  stories  driveway_0  driveway_1  recroom_0  recroom_1  fullbase_0  fullbase_1  gashw_0  gashw_1  airco_0  airco_1  garagepl  prefarea_0  prefarea_1  homestyle_0  homestyle_1  homestyle_2  automl_id
    sn                                                                                                                                                           
    488   44100.0   8100.0         2        1        1           0           1          1          0           1           0        1        0        1        0         1           1           0            0            1            0         14
    345   88000.0   4500.0         3        1        4           0           1          1          0           1           0        1        0        0        1         0           1           0            0            0            1         22
    406   86000.0   6900.0         3        2        1           0           1          0          1           0           1        1        0        1        0         0           0           1            0            0            1         26
    528  106000.0   6325.0         3        1        4           0           1          1          0           1           0        1        0        0        1         1           1           0            1            0            0         30
    446  104900.0  11440.0         4        1        2           0           1          1          0           0           1        1        0        1        0         1           0           1            1            0            0         38
    343   80000.0  10500.0         2        1        1           0           1          1          0           1           0        1        0        1        0         1           1           0            0            0            1         42
    120  116000.0   6840.0         5        1        2           0           1          0          1           0           1        1        0        0        1         1           1           0            1            0            0         34
    80    63900.0   6360.0         2        1        1           0           1          1          0           0           1        1        0        0        1         1           1           0            0            0            1         18
    223   70100.0   4200.0         3        1        2           0           1          1          0           1           0        1        0        1        0         1           1           0            0            0            1         10
    40    54500.0   3150.0         2        2        1           1           0          1          0           0           1        1        0        1        0         0           1           0            0            0            1          6
    492 rows X 23 columns
    2025-11-04 01:33:10,755 | INFO     | Time taken to encode the columns: 3.96 sec
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:33:10,756 | INFO     | Data preparation started ...
    2025-11-04 01:33:10,756 | INFO     | Outlier preprocessing ...
    2025-11-04 01:33:14,094 | INFO     | Columns with outlier percentage :-
      ColumnName  OutlierPercentage
    0   bedrooms           2.235772
    1   garagepl           2.235772
    2    bathrms           0.203252
    3    lotsize           2.235772
    4    stories           7.113821
    2025-11-04 01:33:14,628 | INFO     | Deleting rows of these columns:
    ['lotsize', 'stories', 'bedrooms', 'bathrms', 'garagepl']
    2025-11-04 01:33:16,825 | INFO     | Sample of dataset after removing outlier rows:
            price  lotsize  bedrooms  bathrms  stories  driveway_0  driveway_1  recroom_0  recroom_1  fullbase_0  fullbase_1  gashw_0  gashw_1  airco_0  airco_1  garagepl  prefarea_0  prefarea_1  homestyle_0  homestyle_1  homestyle_2  automl_id
    sn                                                                                                                                                           
    19    45000.0   3450.0         1        1        1           0           1          1          0           1           0        1        0        1        0         0           1           0            0            1            0         19
    59    35500.0   4400.0         3        1        2           0           1          1          0           1           0        1        0        1        0         0           1           0            0            1            0         27
    324   98000.0   6000.0         3        1        1           0           1          1          0           1           0        1        0        0        1         1           1           0            0            0            1         31
    385   78000.0   6600.0         4        2        2           0           1          0          1           0           1        1        0        1        0         0           0           1            0            0            1         35
    99    35000.0   3500.0         2        1        1           0           1          0          1           1           0        1        0        1        0         0           1           0            0            1            0         43
    160   63000.0   3968.0         3        1        2           1           0          1          0           1           0        1        0        1        0         0           1           0            0            0            1         47
    303   58500.0   4040.0         2        1        2           0           1          1          0           1           0        1        0        1        0         1           1           0            0            0            1         39
    263   48500.0   3640.0         2        1        1           0           1          1          0           1           0        1        0        1        0         0           1           0            0            1            0         23
    448  120000.0   5500.0         4        2        2           0           1          1          0           0           1        1        0        0        1         1           0           1            1            0            0         15
    122   80000.0  10500.0         4        2        2           0           1          1          0           1           0        1        0        1        0         1           1           0            0            0            1          7
    428 rows X 23 columns
    2025-11-04 01:33:16,984 | INFO     | Time Taken by Outlier processing: 6.23 sec
    2025-11-04 01:33:18,064 | INFO     | Feature selection using rfe ...
    2025-11-04 01:34:01,319 | INFO     | feature selected by RFE:
    ['homestyle_0', 'stories', 'bathrms', 'homestyle_2', 'prefarea_0', 'bedrooms', 'homestyle_1', 'airco_0', 'garagepl', 'fullbase_0', 'sn', 'lotsize']
    2025-11-04 01:34:01,321 | INFO     | Total time taken by feature selection: 43.26 sec
    2025-11-04 01:34:01,634 | INFO     | Scaling Features of rfe data ...
    2025-11-04 01:34:03,368 | INFO     | columns that will be scaled:
    ['r_stories', 'r_bathrms', 'r_bedrooms', 'r_garagepl', 'r_sn', 'r_lotsize']
    2025-11-04 01:34:05,459 | INFO     | Dataset sample after scaling:
       r_homestyle_2  r_homestyle_0  automl_id    price  r_homestyle_1  r_fullbase_0  r_airco_0  r_prefarea_0  r_stories  r_bathrms  r_bedrooms  r_garagepl      r_sn  r_lotsize
    0              1              0          6  54500.0              0             0          1             1  -1.000531   1.626355   -1.327527   -0.769077 -1.415712  -0.930075
    1              1              0          8  99000.0              0             0          0             1   0.579644   1.626355    0.187624    0.516724  0.441857   2.233189
    2              0              0          9  27000.0              1             1          1             1  -1.000531  -0.522040   -1.327527   -0.769077 -0.090733  -0.654601
    3              1              0         10  70100.0              0             1          1             1   0.579644  -0.522040    0.187624    0.516724 -0.227128  -0.350420
    4              1              0         13  60000.0              0             1          1             1  -1.000531  -0.522040    0.187624    1.802525  0.305462   0.532865
    5              0              0         14  44100.0              1             1          1             1  -1.000531  -0.522040   -1.327527    0.516724  1.494047   1.802588
    6              1              0         12  58000.0              0             1          1             1  -1.000531  -0.522040    0.187624   -0.769077 -0.486928  -0.273132
    7              1              0          7  80000.0              0             1          1             1   0.579644   1.626355    1.702775    0.516724 -0.883122   3.127515
    8              0              0          5  50000.0              1             1          1             1  -1.000531  -0.522040   -1.327527    0.516724  0.045662  -0.659569
    9              0              0          4  48000.0              1             1          1             1   0.579644  -0.522040   -1.327527   -0.769077 -1.279317  -0.394584
    428 rows X 14 columns
    2025-11-04 01:34:06,093 | INFO     | Total time taken by feature scaling: 4.46 sec
    2025-11-04 01:34:06,093 | INFO     | Scaling Features of pca data ...
    2025-11-04 01:34:08,162 | INFO     | columns that will be scaled:
    ['sn', 'lotsize', 'bedrooms', 'bathrms', 'stories', 'garagepl']
    2025-11-04 01:34:10,316 | INFO     | Dataset sample after scaling:
       homestyle_0  homestyle_1  fullbase_1  driveway_0  airco_0  recroom_1  airco_1  gashw_0  homestyle_2     price  automl_id  prefarea_0  prefarea_1  gashw_1  fullbase_0  driveway_1  recroom_0        sn   lotsize  bedrooms   bathrms   stories  garagepl
    0            0            1           0           0        1          0        0        1            0   44100.0         14           1           0        0           1           1          1  1.494047  1.802588 -1.327527 -0.522040 -1.000531  0.516724
    1            1            0           1           0        0          0        1        1            0  120000.0         15           0           1        0           0           1          1  1.234247  0.367250  1.702775  1.626355  0.579644  0.516724
    2            0            1           0           0        1          0        0        1            0   45000.0         19           1           0        0           1           1          1 -1.552107 -0.764460 -2.842678 -0.522040 -1.000531 -0.769077
    3            0            1           0           0        1          0        0        1            0   50000.0          5           1           0        0           1           1          1  0.045662 -0.659569 -1.327527 -0.522040 -1.000531  0.516724
    4            0            0           0           0        1          0        0        0            1   60000.0         13           1           0        1           1           1          1  0.305462  0.532865  0.187624 -0.522040 -1.000531  1.802525
    5            0            1           0           0        1          0        0        1            0   48000.0          4           1           0        0           1           1          1 -1.279317 -0.394584 -1.327527 -0.522040  0.579644 -0.769077
    6            0            0           1           0        0          0        1        1            1   99000.0          8           1           0        0           0           1          1  0.441857  2.233189  0.187624  1.626355  0.579644  0.516724
    7            0            0           0           0        1          0        0        1            1   58000.0         12           1           0        0           1           1          1 -0.486928 -0.273132  0.187624 -0.522040 -1.000531 -0.769077
    8            0            1           0           0        1          0        0        1            0   27000.0          9           1           0        0           1           1          1 -0.090733 -0.654601 -1.327527 -0.522040 -1.000531 -0.769077
    9            0            0           0           0        1          0        0        1            1   80000.0          7           1           0        0           1           1          1 -0.883122  3.127515  1.702775  1.626355  0.579644  0.516724
    428 rows X 23 columns
    2025-11-04 01:34:11,006 | INFO     | Total time taken by feature scaling: 4.91 sec
    2025-11-04 01:34:11,006 | INFO     | Dimension Reduction using pca ...
    2025-11-04 01:34:11,661 | INFO     | PCA columns:
    ['col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_5', 'col_6', 'col_7', 'col_8', 'col_9', 'col_10']
    2025-11-04 01:34:11,662 | INFO     | Total time taken by PCA: 0.66 sec
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:34:12,067 | INFO     | Model Training started ...
    2025-11-04 01:34:12,110 | INFO     | Hyperparameters used for model training:
    2025-11-04 01:34:12,110 | INFO     | Model: decision_forest
    2025-11-04 01:34:12,110 | INFO     | Hyperparameters: {'response_column': 'price', 'name': 'decision_forest', 'tree_type': 'Regression', 'min_impurity': (0.0, 0.1, 0.2, 0.3), 'max_depth': (5, 3, 4, 7, 8), 'min_node_size': (1, 2, 3, 4), 'num_trees': (-1,), 'seed': 42}
    2025-11-04 01:34:12,110 | INFO     | Total number of models for decision_forest: 80
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 01:34:12,110 | INFO     | Model: xgboost
    2025-11-04 01:34:12,111 | INFO     | Hyperparameters: {'response_column': 'price', 'name': 'xgboost', 'model_type': 'Regression', 'column_sampling': (1, 0.6), 'min_impurity': (0.0, 0.1, 0.2, 0.3), 'lambda1': (1.0, 1.0, 10.0, 100.0), 'shrinkage_factor': (0.5, 0.01, 0.05, 0.1), 'max_depth': (5, 3, 4, 7, 8), 'min_node_size': (1, 2, 3, 4), 'iter_num': (10, 20, 30, 40), 'num_boosted_trees': (-1, 20, 50, 100), 'seed': 42}
    2025-11-04 01:34:12,124 | INFO     | Total number of models for xgboost: 40960
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 01:34:12,124 | INFO     | Performing hyperparameter tuning ...
    2025-11-04 01:34:13,215 | INFO     | Model training for decision_forest
    2025-11-04 01:34:38,983 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 01:34:38,983 | INFO     | Model training for xgboost
    2025-11-04 01:35:04,641 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 01:35:04,644 | INFO     | Leaderboard
        RANK          MODEL_ID FEATURE_SELECTION           MAE           MSE      MSLE  ...             ME        R2        EV          MPD       MGD  ADJUSTED_R2
    0      1  DECISIONFOREST_0               rfe   9531.052379  1.521203e+08  0.034230  ...   37142.567568  0.683602  0.684739  2201.511486  0.034267     0.674453
    1      2         XGBOOST_0               rfe   7822.466077  1.521448e+08  0.024162  ...   74337.127715  0.683551  0.688615  1818.809847  0.025643     0.674400
    2      3  DECISIONFOREST_2               rfe   9593.067883  1.557999e+08  0.034362  ...   37142.567568  0.675949  0.676903  2224.474645  0.034410     0.666578
    3      4  DECISIONFOREST_4               rfe   9593.067883  1.557999e+08  0.034362  ...   37142.567568  0.675949  0.676903  2224.474645  0.034410     0.666578
    4      5  DECISIONFOREST_5               pca  10263.676061  1.800835e+08  0.038719  ...   40566.666667  0.625441  0.630034  2554.009956  0.039661     0.615536
    5      6         XGBOOST_2               rfe   8126.079052  1.901138e+08  0.025395  ...   95704.674365  0.604578  0.625691  2127.215276  0.027426     0.593145
    6      7  DECISIONFOREST_3               pca  10722.659249  2.004692e+08  0.041859  ...   40566.666667  0.583040  0.584328  2786.784228  0.042189     0.572014
    7      8  DECISIONFOREST_1               pca  10669.171897  2.102944e+08  0.043649  ...   51700.000000  0.562604  0.565019  2918.761297  0.044867     0.551039
    8      9         XGBOOST_1               pca   9304.694891  2.102998e+08  0.033842  ...   88307.046227  0.562593  0.562641  2485.090162  0.033954     0.551027
    9     10         XGBOOST_3               pca   9578.433215  2.361295e+08  0.036369  ...   99170.952412  0.508869  0.515650  2800.835696  0.037728     0.495883
    10    11         XGBOOST_5               pca   9461.591069  2.401957e+08  0.037042  ...  106111.874202  0.500412  0.515578  2894.032774  0.039889     0.487202
    11    12         XGBOOST_7               pca  10067.123344  2.698733e+08  0.042398  ...  111924.901906  0.438685  0.468630  3326.063729  0.046489     0.423843
    12    13         XGBOOST_6               rfe  12950.097039  3.982805e+08  0.065653  ...  115851.893132  0.171608  0.333218  5232.883191  0.074837     0.147655
    13    14         XGBOOST_4               rfe  15210.126908  4.912907e+08  0.088076  ...  128690.810653 -0.021846  0.147882  6790.938290  0.101301    -0.051393
    [14 rows x 16 columns]
    14 rows X 16 columns
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    >>> Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 13/13
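
The "Total number of models" figures in the fit log (80 for decision_forest, 40960 for xgboost) match the size of a full grid over the logged hyperparameter tuples. The sketch below reproduces that arithmetic in plain Python. The tuples are copied from the log, the helper `grid_size` is hypothetical, and the full-grid reading is an inference from the numbers rather than a documented statement of how teradataml counts models.

```python
from math import prod

# Search spaces copied from the fit log; each tuple holds candidate values.
decision_forest_space = {
    "min_impurity": (0.0, 0.1, 0.2, 0.3),
    "max_depth": (5, 3, 4, 7, 8),
    "min_node_size": (1, 2, 3, 4),
    "num_trees": (-1,),
}
xgboost_space = {
    "column_sampling": (1, 0.6),
    "min_impurity": (0.0, 0.1, 0.2, 0.3),
    "lambda1": (1.0, 1.0, 10.0, 100.0),
    "shrinkage_factor": (0.5, 0.01, 0.05, 0.1),
    "max_depth": (5, 3, 4, 7, 8),
    "min_node_size": (1, 2, 3, 4),
    "iter_num": (10, 20, 30, 40),
    "num_boosted_trees": (-1, 20, 50, 100),
}


def grid_size(space):
    """Number of combinations in a full grid: the product of the tuple lengths."""
    return prod(len(values) for values in space.values())


print(grid_size(decision_forest_space))  # 80
print(grid_size(xgboost_space))          # 40960
```

The count uses tuple lengths, so the repeated `1.0` in `lambda1` is counted twice. The leaderboard for this run lists 14 models, which suggests that only a small part of these grids was evaluated.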
  4. Display the model leaderboard.
    >>> aml.leaderboard()
        RANK          MODEL_ID FEATURE_SELECTION           MAE           MSE      MSLE  ...             ME        R2        EV          MPD       MGD  ADJUSTED_R2
    0      1  DECISIONFOREST_0               rfe   9531.052379  1.521203e+08  0.034230  ...   37142.567568  0.683602  0.684739  2201.511486  0.034267     0.674453
    1      2         XGBOOST_0               rfe   7822.466077  1.521448e+08  0.024162  ...   74337.127715  0.683551  0.688615  1818.809847  0.025643     0.674400
    2      3  DECISIONFOREST_2               rfe   9593.067883  1.557999e+08  0.034362  ...   37142.567568  0.675949  0.676903  2224.474645  0.034410     0.666578
    3      4  DECISIONFOREST_4               rfe   9593.067883  1.557999e+08  0.034362  ...   37142.567568  0.675949  0.676903  2224.474645  0.034410     0.666578
    4      5  DECISIONFOREST_5               pca  10263.676061  1.800835e+08  0.038719  ...   40566.666667  0.625441  0.630034  2554.009956  0.039661     0.615536
    5      6         XGBOOST_2               rfe   8126.079052  1.901138e+08  0.025395  ...   95704.674365  0.604578  0.625691  2127.215276  0.027426     0.593145
    6      7  DECISIONFOREST_3               pca  10722.659249  2.004692e+08  0.041859  ...   40566.666667  0.583040  0.584328  2786.784228  0.042189     0.572014
    7      8  DECISIONFOREST_1               pca  10669.171897  2.102944e+08  0.043649  ...   51700.000000  0.562604  0.565019  2918.761297  0.044867     0.551039
    8      9         XGBOOST_1               pca   9304.694891  2.102998e+08  0.033842  ...   88307.046227  0.562593  0.562641  2485.090162  0.033954     0.551027
    9     10         XGBOOST_3               pca   9578.433215  2.361295e+08  0.036369  ...   99170.952412  0.508869  0.515650  2800.835696  0.037728     0.495883
    10    11         XGBOOST_5               pca   9461.591069  2.401957e+08  0.037042  ...  106111.874202  0.500412  0.515578  2894.032774  0.039889     0.487202
    11    12         XGBOOST_7               pca  10067.123344  2.698733e+08  0.042398  ...  111924.901906  0.438685  0.468630  3326.063729  0.046489     0.423843
    12    13         XGBOOST_6               rfe  12950.097039  3.982805e+08  0.065653  ...  115851.893132  0.171608  0.333218  5232.883191  0.074837     0.147655
    13    14         XGBOOST_4               rfe  15210.126908  4.912907e+08  0.088076  ...  128690.810653 -0.021846  0.147882  6790.938290  0.101301    -0.051393
    [14 rows x 16 columns]
  5. Display the best performing model.
    >>> aml.leader()
       RANK          MODEL_ID FEATURE_SELECTION          MAE           MSE     MSLE  ...            ME        R2        EV          MPD       MGD  ADJUSTED_R2
    0     1  DECISIONFOREST_0               rfe  9531.052379  1.521203e+08  0.03423  ...  37142.567568  0.683602  0.684739  2201.511486  0.034267     0.674453
    [1 rows x 16 columns]
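
The R2 and ADJUSTED_R2 columns can be related through the standard adjusted R2 formula, 1 - (1 - R2)(n - 1)/(n - p - 1). The sketch below applies it to two leaderboard rows. The sample count n = 428 (rows left after outlier removal) and the feature counts p = 12 (RFE-selected features) and p = 11 (PCA columns) are assumptions read from the fit log. The source does not state how teradataml computes ADJUSTED_R2, and `adjusted_r2` is a hypothetical helper.

```python
def adjusted_r2(r2, n_samples, n_features):
    """Standard adjusted R2: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# R2 values are taken from the leaderboard; n and p are assumptions (see above).
leader_adj = adjusted_r2(0.683602, n_samples=428, n_features=12)  # DECISIONFOREST_0, rfe
pca_adj = adjusted_r2(0.625441, n_samples=428, n_features=11)     # DECISIONFOREST_5, pca

print(round(leader_adj, 4))
print(round(pca_adj, 4))
```

Under these assumptions the results agree with the reported ADJUSTED_R2 values (0.674453 and 0.615536) to within rounding of the displayed R2.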
    
  6. Display hyperparameters for trained models.
    1. Display model hyperparameters for rank 2.
      >>> aml.model_hyperparameters(rank=2)
      {'response_column': 'price', 
        'name': 'xgboost', 
        'model_type': 'Regression', 
        'column_sampling': 1, 
        'min_impurity': 0.0, 
        'lambda1': 1.0, 
        'shrinkage_factor': 0.5, 
        'max_depth': 5, 
        'min_node_size': 1, 
        'iter_num': 10, 
        'num_boosted_trees': -1, 
        'seed': 42, 
        'persist': False}
      
    2. Display model hyperparameters for rank 6.
      >>> aml.model_hyperparameters(rank=6)
      {'response_column': 'price', 
        'name': 'xgboost', 
        'model_type': 'Regression', 
        'column_sampling': 1, 
        'min_impurity': 0.0, 
        'lambda1': 1.0, 
        'shrinkage_factor': 0.5, 
        'max_depth': 5, 
        'min_node_size': 1, 
        'iter_num': 10, 
        'num_boosted_trees': 20, 
        'seed': 42, 
        'persist': False}
      
  7. Generate predictions on the test dataset using the best performing model.
    >>> prediction = aml.predict(housing_test)
    2025-11-04 01:38:20,081 | INFO     | Data Transformation started ...
    2025-11-04 01:38:20,081 | INFO     | Performing transformation carried out in feature engineering phase ...
    2025-11-04 01:38:23,789 | INFO     | Updated dataset after performing categorical encoding :
           price  lotsize  bedrooms  bathrms  stories  driveway_0  driveway_1  recroom_0  recroom_1  fullbase_0  fullbase_1  gashw_0  gashw_1  airco_0  airco_1  garagepl  prefarea_0  prefarea_1  homestyle_0  homestyle_1  homestyle_2  automl_id
    sn                                                                                                                                                           
    459  44555.0   2398.0         3        1        1           0           1          1          0           1           0        1        0        1        0         0           0           1            0            1            0         13
    38   67000.0   5170.0         3        1        4           0           1          1          0           1           0        1        0        0        1         0           1           0            0            0            1          8
    364  72000.0  10700.0         3        1        2           0           1          0          1           0           1        1        0        1        0         0           1           0            0            0            1         12
    177  70000.0   5400.0         4        1        2           0           1          1          0           1           0        1        0        1        0         0           1           0            0            0            1          7
    440  69000.0   6862.0         3        1        2           0           1          1          0           1           0        1        0        0        1         2           0           1            0            0            1         15
    463  49000.0   2610.0         3        1        2           0           1          1          0           0           1        1        0        1        0         0           0           1            0            1            0          6
    255  61000.0   4360.0         4        1        2           0           1          1          0           1           0        1        0        1        0         0           1           0            0            0            1         10
    260  41000.0   6000.0         2        1        1           0           1          1          0           1           0        1        0        1        0         0           1           0            0            1            0         14
    53   68000.0   9166.0         2        1        1           0           1          1          0           0           1        1        0        0        1         2           1           0            0            0            1         11
    469  55000.0   2176.0         2        1        2           0           1          0          1           1           0        1        0        1        0         0           0           1            0            0            1          4
    46 rows X 23 columns
    2025-11-04 01:38:24,308 | INFO     | Performing transformation carried out in data preparation phase ...
    2025-11-04 01:38:25,724 | INFO     | Updated dataset after performing RFE feature selection:
                 automl_id  stories  bathrms  homestyle_2  prefarea_0  bedrooms  homestyle_1  airco_0  garagepl  fullbase_0   sn  lotsize    price
    homestyle_0
    0                   25        1        1            1           0         3            0        1         1           0  411   9000.0  90000.0
    0                   37        1        1            0           1         2            1        0         0           1   16   3185.0  37900.0
    0                   41        2        2            1           1         3            0        1         2           1  176   3630.0  57500.0
    0                   45        1        1            1           0         3            0        1         2           1  441   3520.0  51900.0
    0                   53        3        1            1           0         3            0        1         0           0  408   6420.0  87500.0
    0                    7        2        1            1           1         4            0        1         0           1  177   5400.0  70000.0
    0                   49        1        1            1           1         3            0        1         2           1  353   7980.0  78500.0
    0                   29        2        1            0           1         3            1        1         1           0  142   2650.0  40000.0
    0                   21        1        1            0           1         2            1        1         0           1  249   3500.0  44500.0
    0                   13        1        1            0           0         3            1        1         0           1  459   2398.0  44555.0
    46 rows X 14 columns
    2025-11-04 01:38:26,901 | INFO     | Updated dataset after performing scaling on RFE selected features :
       r_homestyle_2  r_homestyle_0  automl_id    price  r_homestyle_1  r_fullbase_0  r_airco_0  r_prefarea_0  r_stories  r_bathrms  r_bedrooms  r_garagepl      r_sn  r_lotsize
    0              1              0         25  90000.0              0             0          1             0  -1.000531  -0.522040    0.187624    0.516724  0.993932   2.299436
    1              0              0         37  37900.0              1             1          0             1  -1.000531  -0.522040   -1.327527   -0.769077 -1.571592  -0.910754
    2              1              0         41  57500.0              0             1          1             1   0.579644   1.626355    0.187624    1.802525 -0.532393  -0.665090
    3              1              0         45  51900.0              0             1          1             0  -1.000531  -0.522040    0.187624    1.802525  1.188782  -0.725816
    4              1              0         53  87500.0              0             0          1             0   2.159819  -0.522040    0.187624   -0.769077  0.974447   0.875138
    5              1              0          7  70000.0              0             1          1             1   0.579644  -0.522040    1.702775   -0.769077 -0.525898   0.312044
    6              1              0         49  78500.0              0             1          1             1  -1.000531  -0.522040    0.187624    1.802525  0.617222   1.736341
    7              0              0         29  40000.0              1             0          1             1   0.579644  -0.522040    0.187624    0.516724 -0.753222  -1.206102
    8              0              0         21  44500.0              1             1          1             1  -1.000531  -0.522040   -1.327527   -0.769077 -0.058258  -0.736857
    9              0              0         13  44555.0              1             1          1             0  -1.000531  -0.522040    0.187624   -0.769077  1.305692  -1.345219
    46 rows X 14 columns
    2025-11-04 01:38:28,789 | INFO     | Updated dataset after performing scaling for PCA feature selection :
       homestyle_0  homestyle_1  fullbase_1  driveway_0  airco_0  recroom_1  airco_1  gashw_0  homestyle_2    price  automl_id  prefarea_0  prefarea_1  gashw_1  fullbase_0  driveway_1  recroom_0        sn   lotsize  bedrooms   bathrms   stories  garagepl
    0            0            0           1           0        1          0        0        1            1  90000.0         25           0           1        0           0           1          1  0.993932  2.299436  0.187624 -0.522040 -1.000531  0.516724
    1            0            1           0           0        0          0        1        1            0  37900.0         37           1           0        0           1           1          1 -1.571592 -0.910754 -1.327527 -0.522040 -1.000531 -0.769077
    2            0            0           0           0        1          0        0        0            1  57500.0         41           1           0        1           1           1          1 -0.532393 -0.665090  0.187624  1.626355  0.579644  1.802525
    3            0            0           0           0        1          0        0        1            1  51900.0         45           0           1        0           1           1          1  1.188782 -0.725816  0.187624 -0.522040 -1.000531  1.802525
    4            0            0           1           0        1          0        0        1            1  87500.0         53           0           1        0           0           1          1  0.974447  0.875138  0.187624 -0.522040  2.159819 -0.769077
    5            0            0           0           0        1          0        0        1            1  70000.0          7           1           0        0           1           1          1 -0.525898  0.312044  1.702775 -0.522040  0.579644 -0.769077
    6            0            0           0           0        1          0        0        1            1  78500.0         49           1           0        0           1           1          1  0.617222  1.736341  0.187624 -0.522040 -1.000531  1.802525
    7            0            1           1           0        1          0        0        1            0  40000.0         29           1           0        0           0           1          1 -0.753222 -1.206102  0.187624 -0.522040  0.579644  0.516724
    8            0            1           0           0        1          0        0        1            0  44500.0         21           1           0        0           1           1          1 -0.058258 -0.736857 -1.327527 -0.522040 -1.000531 -0.769077
    9            0            1           0           0        1          0        0        1            0  44555.0         13           0           1        0           1           1          1  1.305692 -1.345219  0.187624 -0.522040 -1.000531 -0.769077
    46 rows X 23 columns
    2025-11-04 01:38:29,784 | INFO     | Updated dataset after performing PCA feature selection :
       automl_id     col_0     col_1     col_2     col_3     col_4     col_5     col_6     col_7     col_8     col_9    col_10    price
    0         13 -0.791863 -0.311118  1.398681 -0.572869 -1.260209 -0.080711  1.511712 -0.431964 -0.520080  0.552996 -0.024807  44555.0
    1         21 -2.002198 -0.719918  0.326292 -0.403918 -0.307726  0.614137  0.245059 -0.045012 -0.229618  0.459610 -0.263007  44500.0
    2         25  1.549625 -2.223951  0.393579 -0.221676  0.776631 -0.670459  0.503664  0.628208 -0.004174 -0.014646  1.054372  90000.0
    3         29 -0.863769  0.917218 -0.489754  0.732473 -0.710596 -0.758062 -0.093896  0.439251 -0.717270  0.589935  0.259816  40000.0
    4         37 -2.472151 -0.173876 -0.599225 -0.153213  0.440801  0.320591 -0.467513 -1.236014 -0.304159  0.693641 -0.041634  37900.0
    5         41  0.863854  0.783032 -1.724096  0.706254 -1.317798  0.922764 -0.239666  0.427848  0.740829  0.006107 -0.146211  57500.0
    6         45  0.582348 -1.337774  0.082277  0.981298 -1.841966 -0.377592  1.015298 -0.239363  0.387912  0.011629  0.147673  51900.0
    7         49  1.116200 -2.040679 -0.596862  1.227361  0.435827  0.177842  0.797351  0.451317  0.676395 -0.122858 -0.105491  78500.0
    8         53  1.323323  0.587434  2.050033  0.108656  0.266096 -0.296743 -0.960196  0.961279 -0.394448 -0.133120  1.123464  87500.0
    9          7  0.252334  1.255451  0.516357  0.487479  1.169412 -0.411108  0.761633  0.489646  0.825355  0.265495 -0.022434  70000.0
    10 rows X 13 columns
    2025-11-04 01:38:30,161 | INFO     | Data Transformation completed.
    2025-11-04 01:38:31,071 | INFO     | Following model is being picked for evaluation:
    2025-11-04 01:38:31,071 | INFO     | Model ID : DECISIONFOREST_0
    2025-11-04 01:38:31,071 | INFO     | Feature Selection Method : rfe
    2025-11-04 01:38:32,630 | INFO     | Applying SHAP for Model Interpretation...
    2025-11-04 01:38:35,656 | INFO     | SHAP Analysis Completed. Feature Importance Available.
    /root/automl_testing/pyTeradata/teradataml/automl/model_evaluation.py:380: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
      plt.show()
    2025-11-04 01:38:35,766 | INFO     | Prediction :
       automl_id    prediction  confidence_lower  confidence_upper    price
    0         25  83147.500000      83147.500000      83147.500000  90000.0
    1         37  46733.333333      46733.333333      46733.333333  37900.0
    2         41  59857.432432      59857.432432      59857.432432  57500.0
    3         45  59857.432432      59857.432432      59857.432432  51900.0
    4         53  83147.500000      83147.500000      83147.500000  87500.0
    5          7  61368.750000      61368.750000      61368.750000  70000.0
    6         49  83147.500000      83147.500000      83147.500000  78500.0
    7         29  33735.294118      33735.294118      33735.294118  40000.0
    8         21  40858.333333      40858.333333      40858.333333  44500.0
    9         13  47300.000000      47300.000000      47300.000000  44555.0
  8. Generate evaluation metrics on the test dataset using the best-performing model.
    >>> performance_metrics = aml.evaluate(housing_test)
    2025-11-04 01:39:06,226 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 01:39:06,794 | INFO     | Following model is being picked for evaluation:
    2025-11-04 01:39:06,795 | INFO     | Model ID : DECISIONFOREST_0
    2025-11-04 01:39:06,795 | INFO     | Feature Selection Method : rfe
    2025-11-04 01:39:10,042 | INFO     | Performance Metrics :
               MAE           MSE      MSLE       MAPE       MPE         RMSE    RMSLE       ME        R2        EV         MPD      MGD
    0  5911.315908  5.049682e+07  0.016968  11.020456 -1.777284  7106.111196  0.13026  19400.0  0.843171  0.843329  865.647374  0.01655
    >>> performance_metrics
               MAE           MSE      MSLE       MAPE       MPE         RMSE    RMSLE       ME        R2        EV         MPD      MGD
    0  5911.315908  5.049682e+07  0.016968  11.020456 -1.777284  7106.111196  0.13026  19400.0  0.843171  0.843329  865.647374  0.01655
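As a rough cross-check of what these metrics measure, the sketch below recomputes MAE, RMSE, and R2 in plain Python from the ten rows of `prediction` displayed in step 7. It uses only those ten displayed rows rather than all 46 test rows, so its results are not expected to match the figures reported by `aml.evaluate()`. The helper name `regression_metrics` is illustrative and is not part of teradataml.

```python
import math

# (prediction, actual price) pairs copied from the ten rows shown in step 7.
rows = [
    (83147.500000, 90000.0),
    (46733.333333, 37900.0),
    (59857.432432, 57500.0),
    (59857.432432, 51900.0),
    (83147.500000, 87500.0),
    (61368.750000, 70000.0),
    (83147.500000, 78500.0),
    (33735.294118, 40000.0),
    (40858.333333, 44500.0),
    (47300.000000, 44555.0),
]


def regression_metrics(pairs):
    """Return MAE, RMSE, and R2 for (predicted, actual) pairs."""
    n = len(pairs)
    errors = [actual - predicted for predicted, actual in pairs]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual for _, actual in pairs) / n
    # Total sum of squares around the mean of the actual values.
    ss_tot = sum((actual - mean_actual) ** 2 for _, actual in pairs)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"MAE": mae, "RMSE": math.sqrt(mse), "R2": r2}


metrics = regression_metrics(rows)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

RMSE is never smaller than MAE, and the gap between them grows when a few rows have much larger errors than the rest.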
  9. Generate predictions on the test dataset using the second-best-performing model.
    >>> prediction = aml.predict(housing_test, 2)
    2025-11-04 01:40:54,586 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 01:40:55,133 | INFO     | Following model is being picked for evaluation:
    2025-11-04 01:40:55,133 | INFO     | Model ID : XGBOOST_0
    2025-11-04 01:40:55,133 | INFO     | Feature Selection Method : rfe
    2025-11-04 01:40:55,904 | INFO     | Applying SHAP for Model Interpretation...
    2025-11-04 01:40:58,006 | INFO     | SHAP Analysis Completed. Feature Importance Available.
    /root/automl_testing/pyTeradata/teradataml/automl/model_evaluation.py:380: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
      plt.show()
    2025-11-04 01:40:58,099 | INFO     | Prediction :
       automl_id    Prediction  Confidence_Lower  Confidence_upper    price
    0         25  79763.286423      79763.286423      79763.286423  90000.0
    1         37  45663.242276      45663.242276      45663.242276  37900.0
    2         41  59266.237561      59266.237561      59266.237561  57500.0
    3         45  61778.865480      61778.865480      61778.865480  51900.0
    4         53  76168.669602      76168.669602      76168.669602  87500.0
    5          7  70216.282691      70216.282691      70216.282691  70000.0
    6         49  70612.043027      70612.043027      70612.043027  78500.0
    7         29  37348.314071      37348.314071      37348.314071  40000.0
    8         21  37390.393707      37390.393707      37390.393707  44500.0
    9         13  49388.806579      49388.806579      49388.806579  44555.0
    >>> prediction.head()
       automl_id    Prediction  Confidence_Lower  Confidence_upper    price
    0         25  79763.286423      79763.286423      79763.286423  90000.0
    1         37  45663.242276      45663.242276      45663.242276  37900.0
    2         41  59266.237561      59266.237561      59266.237561  57500.0
    3         45  61778.865480      61778.865480      61778.865480  51900.0
    4         53  76168.669602      76168.669602      76168.669602  87500.0
    5          7  70216.282691      70216.282691      70216.282691  70000.0
    6         49  70612.043027      70612.043027      70612.043027  78500.0
    7         29  37348.314071      37348.314071      37348.314071  40000.0
    8         21  37390.393707      37390.393707      37390.393707  44500.0
    9         13  49388.806579      49388.806579      49388.806579  44555.0
  10. Generate evaluation metrics on the test dataset using the second-best-performing model.
    >>> performance_metrics = aml.evaluate(housing_test, 2)
    2025-11-04 01:42:34,306 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 01:42:34,874 | INFO     | Following model is being picked for evaluation:
    2025-11-04 01:42:34,875 | INFO     | Model ID : XGBOOST_0
    2025-11-04 01:42:34,875 | INFO     | Feature Selection Method : rfe
    2025-11-04 01:42:36,338 | INFO     | Performance Metrics :
               MAE           MSE      MSLE      MAPE      MPE         RMSE    RMSLE            ME        R2        EV         MPD       MGD
    0  5540.218852  5.679481e+07  0.015997  9.576885  0.12904  7536.233439  0.12648  23471.971864  0.823611  0.829319  912.853432  0.016114
    >>> performance_metrics
               MAE           MSE      MSLE      MAPE      MPE         RMSE    RMSLE            ME        R2        EV         MPD       MGD
    0  5540.218852  5.679481e+07  0.015997  9.576885  0.12904  7536.233439  0.12648  23471.971864  0.823611  0.829319  912.853432  0.016114
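The two evaluations can be compared metric by metric. The sketch below copies three of the reported values for each model into a plain dictionary and prints which model is better on each one. The names `reported` and `higher_is_better` are illustrative and are not teradataml objects.

```python
# Values copied from the evaluate() output for rank 1 and rank 2 above.
reported = {
    "DECISIONFOREST_0": {"MAE": 5911.315908, "RMSE": 7106.111196, "R2": 0.843171},
    "XGBOOST_0": {"MAE": 5540.218852, "RMSE": 7536.233439, "R2": 0.823611},
}

# Error metrics are better when lower; R2 is better when higher.
higher_is_better = {"MAE": False, "RMSE": False, "R2": True}

best = {}
for metric, prefer_high in higher_is_better.items():
    pick = max if prefer_high else min
    best[metric] = pick(reported, key=lambda model: reported[model][metric])
    print(f"{metric}: {best[metric]}")
```

In this run, XGBOOST_0 has the lower MAE while DECISIONFOREST_0 has the lower RMSE and the higher R2, so which model looks better depends on the metric used for comparison.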