Example 2: Run AutoRegressor for Regression Problem with Early Stopping Condition and Customization

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last Edition: 2026-01-07
Product Category: Teradata Vantage

This example predicts the price of houses based on different factors.

Run AutoRegressor to get the best-performing model with the following specifications:

  • Set the early stopping criteria: a time limit of 200 seconds and a threshold of 0.6 for the R2 performance metric.
  • Exclude the 'glm', 'svm', and 'knn' models from the default model training list.
  • Set the verbose level to 2 to get detailed logging.
  • Use custom_config_file to customize specific processes in the AutoML flow.
  1. Load the example dataset.
    >>> from teradataml import AutoRegressor, DataFrame, load_example_data
    >>> load_example_data("decisionforestpredict", ["housing_train", "housing_test"])
    >>> housing_train = DataFrame.from_table("housing_train")
    >>> housing_test = DataFrame.from_table("housing_test")
  2. Generate custom config JSON file.
    >>> AutoRegressor.generate_custom_config("custom_housing")
    Generating custom config JSON for AutoML ...
    
    Available main options for customization with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Feature Engineering Phase
    
    Index 2: Customize Data Preparation Phase
    
    Index 3: Customize Model Training Phase
    
    Index 4: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the index you want to customize:  1
    
    Customizing Feature Engineering Phase ...
    
    Available options for customization of feature engineering phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Missing Value Handling
    
    Index 2: Customize Bincode Encoding
    
    Index 3: Customize String Manipulation
    
    Index 4: Customize Categorical Encoding
    
    Index 5: Customize Mathematical Transformation
    
    Index 6: Customize Nonlinear Transformation
    
    Index 7: Customize Antiselect Features
    
    Index 8: Back to main menu
    
    Index 9: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in feature engineering phase:  2,4,7,8
    
    Customizing Bincode Encoding ...
    
    Provide the following details to customize binning and coding encoding:
    
    Available binning methods with corresponding indices:
    Index 1: Equal-Width
    Index 2: Variable-Width
    
    Enter the feature or list of features for binning:  bedrooms
    
    Enter the index of corresponding binning method for feature bedrooms:  2
    
    Enter the number of bins for feature bedrooms:  2
    
    Available value type of feature for variable binning with corresponding indices:
    Index 1: int
    Index 2: float
    
    Provide the range for bin 1 of feature bedrooms: 
    
    Enter the index of corresponding value type of feature bedrooms:  1
    
    Enter the minimum value for bin 1 of feature bedrooms:  0
    
    Enter the maximum value for bin 1 of feature bedrooms:  2
    
    Enter the label for bin 1 of feature bedrooms:  small_house
    
    Provide the range for bin 2 of feature bedrooms: 
    
    Enter the index of corresponding value type of feature bedrooms:  1
    
    Enter the minimum value for bin 2 of feature bedrooms:  3
    
    Enter the maximum value for bin 2 of feature bedrooms:  6
    
    Enter the label for bin 2 of feature bedrooms:  big_house
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of bincode encoding has been completed successfully.
    
    Customizing Categorical Encoding ...
    
    Provide the following details to customize categorical encoding:
    
    Available categorical encoding methods with corresponding indices:
    Index 1: OneHotEncoding
    Index 2: OrdinalEncoding
    Index 3: TargetEncoding
    
    Enter the list of corresponding index categorical encoding methods you want to use:  2,3
    
    Enter the feature or list of features for OrdinalEncoding:  homestyle
    
    Enter the feature or list of features for TargetEncoding:  prefarea
    
    Available target encoding methods with corresponding indices:
    Index 1: CBM_BETA
    Index 2: CBM_DIRICHLET
    Index 3: CBM_GAUSSIAN_INVERSE_GAMMA
    
    Enter the index of target encoding method for feature prefarea:  3
    
    Enter the response column for target encoding method for feature prefarea:  price
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of categorical encoding has been completed successfully.
    
    Customizing Antiselect Features ...
    
    Enter the feature or list of features for antiselect:  sn
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of antiselect features has been completed successfully.
    
    Customization of feature engineering phase has been completed successfully.
    
    Available main options for customization with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Feature Engineering Phase
    
    Index 2: Customize Data Preparation Phase
    
    Index 3: Customize Model Training Phase
    
    Index 4: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the index you want to customize:  2
    
    Customizing Data Preparation Phase ...
    
    Available options for customization of data preparation phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Data Imbalance Handling
    
    Index 2: Customize Outlier Handling
    
    Index 3: Customize Feature Scaling
    
    Index 4: Back to main menu
    
    Index 5: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in data preparation phase:  1,2
    
    Customizing Data Imbalance Handling ...
    
    Available data sampling methods with corresponding indices:
    Index 1: SMOTE
    Index 2: NearMiss
    
    Enter the corresponding index data imbalance handling method:  1
    
    Customization of data imbalance handling has been completed successfully.
    
    Customizing Outlier Handling ...
    
    Available outlier detection methods with corresponding indices:
    Index 1: percentile
    Index 2: tukey
    Index 3: carling
    
    Enter the corresponding index oulier handling method:  1
    
    Enter the lower percentile value for outlier handling:  0.15
    
    Enter the upper percentile value for outlier handling:  0.85
    
    Enter the feature or list of features for outlier handling:  bathrms
    
    Available outlier replacement methods with corresponding indices:
    Index 1: delete
    Index 2: median
    Index 3: Any Numeric Value
    
    Enter the index of corresponding replacement method for feature bathrms:  1
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of outlier handling has been completed successfully.
    
    Available options for customization of data preparation phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Data Imbalance Handling
    
    Index 2: Customize Outlier Handling
    
    Index 3: Customize Feature Scaling
    
    Index 4: Back to main menu
    
    Index 5: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in data preparation phase:  5
    
    Customization of data preparation phase has been completed successfully.
    
    Process of generating custom config file for AutoML has been completed successfully.
    
    'custom_housing.json' file is generated successfully under the current working directory.
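
Before training, it can help to review what the generated file actually enables. The following is a minimal sketch using only the Python standard library. It parses an excerpt of the customization JSON that fit() echoes in this example, lists the switched-on "...Indicator" keys, and looks up the label a bedrooms value would fall under. The helpers enabled_steps and bin_label are hypothetical names introduced for illustration, and treating the bin bounds as inclusive is an assumption, not documented teradataml behavior.

```python
import json

# Excerpt of the customization JSON shown in this example's fit() log.
CONFIG_TEXT = """
{
    "BincodeIndicator": true,
    "BincodeParam": {
        "bedrooms": {
            "Type": "Variable-Width",
            "NumOfBins": 2,
            "Bin_1": {"min_value": 0, "max_value": 2, "label": "small_house"},
            "Bin_2": {"min_value": 3, "max_value": 6, "label": "big_house"}
        }
    },
    "AntiselectIndicator": true,
    "AntiselectParam": {"excluded_columns": ["sn"]},
    "OutlierFilterIndicator": true,
    "OutlierFilterMethod": "percentile"
}
"""


def enabled_steps(config):
    """Return the sorted '...Indicator' keys set to true (hypothetical helper)."""
    return sorted(
        key for key, value in config.items()
        if key.endswith("Indicator") and value is True
    )


def bin_label(config, feature, value):
    """Return the label of the bin containing value, or None if no bin matches.

    Assumption: min_value and max_value are both inclusive.
    """
    spec = config["BincodeParam"][feature]
    for index in range(1, spec["NumOfBins"] + 1):
        bounds = spec[f"Bin_{index}"]
        if bounds["min_value"] <= value <= bounds["max_value"]:
            return bounds["label"]
    return None


config = json.loads(CONFIG_TEXT)
print(enabled_steps(config))
print(bin_label(config, "bedrooms", 2))
print(bin_label(config, "bedrooms", 4))
```

To inspect the real file instead of the inline excerpt, replace json.loads(CONFIG_TEXT) with json.load() on an open handle to 'custom_housing.json' in the current working directory.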
  3. Create an AutoRegressor instance.
    >>> aml = AutoRegressor(exclude=['glm','svm','knn'],
                            verbose=2,
                            max_runtime_secs=200,
                            stopping_metric='R2',
                            stopping_tolerance=0.6,
                            custom_config_file='custom_housing.json')
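
The early stopping arguments above pair a time limit (max_runtime_secs=200) with an R2 threshold (stopping_tolerance=0.6). The following plain-Python sketch illustrates one way such a rule could be read; it is not teradataml's implementation. The functions r2_score and should_stop are hypothetical names, and the assumption that training stops as soon as either condition is met is an illustration only.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def should_stop(elapsed_secs, best_r2, max_runtime_secs=200, stopping_tolerance=0.6):
    """Assumed rule: stop when the time budget is used up or the R2 target is reached."""
    return elapsed_secs >= max_runtime_secs or best_r2 >= stopping_tolerance


# Toy values for illustration only; they are not taken from the housing data.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

score = r2_score(actual, predicted)
print(should_stop(elapsed_secs=10, best_r2=score))   # R2 target reached early
print(should_stop(elapsed_secs=10, best_r2=0.3))     # neither condition met
print(should_stop(elapsed_secs=200, best_r2=0.3))    # time budget exhausted
```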
  4. Fit the data.
    >>> aml.fit(housing_train, housing_train.price)
    2025-11-04 01:22:22,305 | INFO     | Received below input for customization :
    {
        "BincodeIndicator": true,
        "BincodeParam": {
            "bedrooms": {
                "Type": "Variable-Width",
                "NumOfBins": 2,
                "Bin_1": {
                    "min_value": 0,
                    "max_value": 2,
                    "label": "small_house"
                },
                "Bin_2": {
                    "min_value": 3,
                    "max_value": 6,
                    "label": "big_house"
                }
            }
        },
        "CategoricalEncodingIndicator": true,
        "CategoricalEncodingParam": {
            "OrdinalEncodingIndicator": true,
            "OrdinalEncodingList": [
                "homestyle"
            ],
            "TargetEncodingIndicator": true,
            "TargetEncodingList": {
                "prefarea": {
                    "encoder_method": "CBM_GAUSSIAN_INVERSE_GAMMA",
                    "response_column": "price"
                }
            }
        },
        "AntiselectIndicator": true,
        "AntiselectParam": {
            "excluded_columns": [
                "sn"
            ]
        },
        "DataImbalanceIndicator": true,
        "DataImbalanceMethod": "SMOTE",
        "OutlierFilterIndicator": true,
        "OutlierFilterMethod": "percentile",
        "OutlierLowerPercentile": 0.15,
        "OutlierUpperPercentile": 0.85,
        "OutlierFilterParam": {
            "bathrms": {
                "replacement_value": "delete"
            }
        }
    }
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:22:22,306 | INFO     | Feature Exploration started
    2025-11-04 01:22:22,306 | INFO     | Data Overview:
    2025-11-04 01:22:22,327 | INFO     | Total Rows in the data: 492
    2025-11-04 01:22:22,349 | INFO     | Total Columns in the data: 14
    2025-11-04 01:22:22,944 | INFO     | Column Summary:
       ColumnName                         Datatype  NonNullCount  NullCount  BlankCount  ZeroCount  PositiveCount  NegativeCount  NullPercentage  NonNullPercentage
    0    bedrooms                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    1    fullbase  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    2       gashw  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    3       airco  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    4   homestyle  VARCHAR(20) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    5          sn                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    6    driveway  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    7     bathrms                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    8     recroom  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    9    prefarea  VARCHAR(10) CHARACTER SET LATIN           492          0         0.0        NaN            NaN            NaN             0.0              100.0
    10   garagepl                          INTEGER           492          0         NaN      270.0          222.0            0.0             0.0              100.0
    11    stories                          INTEGER           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    12    lotsize                            FLOAT           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    13      price                            FLOAT           492          0         NaN        0.0          492.0            0.0             0.0              100.0
    2025-11-04 01:22:24,406 | INFO     | Statistics of Data:
      ATTRIBUTE StatName  StatValue
    0  garagepl  MAXIMUM        3.0
    1  bedrooms  MINIMUM        1.0
    2  bedrooms  MAXIMUM        6.0
    3        sn    COUNT      492.0
    4        sn  MAXIMUM      546.0
    5   bathrms    COUNT      492.0
    6   bathrms  MINIMUM        1.0
    7   bathrms  MAXIMUM        4.0
    8        sn  MINIMUM        1.0
    9  bedrooms    COUNT      492.0
    2025-11-04 01:22:24,555 | INFO     | Categorical Columns with their Distinct values:
    ColumnName                DistinctValueCount
    driveway                  2
    recroom                   2
    fullbase                  2
    gashw                     2
    airco                     2
    prefarea                  2
    homestyle                 3
    2025-11-04 01:22:26,898 | INFO     | No Futile columns found.
    2025-11-04 01:22:29,238 | INFO     | Columns with outlier percentage :-
      ColumnName  OutlierPercentage
    0   bedrooms           2.235772
    1   garagepl           2.235772
    2    bathrms           0.203252
    3    lotsize           2.235772
    4    stories           7.113821
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:22:29,466 | INFO     | Feature Engineering started ...
    2025-11-04 01:22:29,466 | INFO     | Handling duplicate records present in dataset ...
    2025-11-04 01:22:29,583 | INFO     | Analysis completed. No action taken.
    2025-11-04 01:22:29,583 | INFO     | Total time to handle duplicate records: 0.12 sec
    2025-11-04 01:22:29,584 | INFO     | Starting customized anti-select columns ...
    2025-11-04 01:22:29,950 | INFO     | Updated dataset sample after performing anti-select columns:
          price  lotsize  bedrooms  bathrms  stories driveway recroom fullbase gashw airco  garagepl prefarea homestyle
    0  120000.0   5500.0         4        2        2      yes      no      yes    no   yes         1      yes  bungalow
    1   99000.0   8880.0         3        2        2      yes      no      yes    no   yes         1       no  Eclectic
    2   58000.0   4340.0         3        1        1      yes      no       no    no    no         0       no  Eclectic
    3   50000.0   3640.0         2        1        1      yes      no       no    no    no         1       no   Classic
    4   60000.0   5800.0         3        1        1      yes      no       no   yes    no         2       no  Eclectic
    5   54500.0   3150.0         2        2        1       no      no      yes    no    no         0       no  Eclectic
    6   70100.0   4200.0         3        1        2      yes      no       no    no    no         1       no  Eclectic
    7   44100.0   8100.0         2        1        1      yes      no       no    no    no         1       no   Classic
    8   27000.0   3649.0         2        1        1      yes      no       no    no    no         0       no   Classic
    9   48000.0   4120.0         2        1        2      yes      no       no    no    no         0       no   Classic
    492 rows X 13 columns
    2025-11-04 01:22:30,305 | INFO     | Handling less significant features from data ...
    2025-11-04 01:22:34,541 | INFO     | Analysis indicates all categorical columns are significant. No action Needed.
    2025-11-04 01:22:34,541 | INFO     | Total time to handle less significant features: 4.24 sec
    2025-11-04 01:22:34,541 | INFO     | Handling Date Features ...
    2025-11-04 01:22:34,541 | INFO     | Analysis Completed. Dataset does not contain any feature related to dates. No action needed.
    2025-11-04 01:22:34,541 | INFO     | Total time to handle date features: 0.00 sec
    2025-11-04 01:22:34,541 | INFO     | Proceeding with default option for missing value imputation.
    2025-11-04 01:22:34,541 | INFO     | Proceeding with default option for handling remaining missing values.
    2025-11-04 01:22:34,541 | INFO     | Checking Missing values in dataset ...
    2025-11-04 01:22:36,626 | INFO     | Analysis Completed. No Missing Values Detected.
    2025-11-04 01:22:36,626 | INFO     | Total time to find missing values in data: 2.09 sec
    2025-11-04 01:22:36,627 | INFO     | Imputing Missing Values ...
    2025-11-04 01:22:36,627 | INFO     | Analysis completed. No imputation required.
    2025-11-04 01:22:36,627 | INFO     | Time taken to perform imputation: 0.00 sec
    2025-11-04 01:22:36,627 | INFO     | No information provided for Equal-Width Transformation.
    2025-11-04 01:22:36,628 | INFO     | Variable-Width binning information:-
      ColumnName  MinValue  MaxValue        Label
    0   bedrooms         0         2  small_house
    1   bedrooms         3         6    big_house
    2 rows X 4 columns
    2025-11-04 01:22:39,422 | INFO     | Updated dataset sample after performing Variable-Width binning:
            driveway airco  garagepl  bathrms homestyle  lotsize prefarea     price  automl_id gashw fullbase recroom     bedrooms
    stories
    3            yes   yes         2        1  Eclectic   3000.0       no   58000.0        290    no      yes      no  big_house
    3            yes   yes         2        2  bungalow   6360.0      yes  100500.0        179    no       no      no  big_house
    3            yes   yes         0        2  Eclectic   6600.0      yes   89900.0        470    no       no      no  big_house
    3            yes   yes         2        2  bungalow   7420.0      yes  190000.0        376    no       no      no  big_house
    3            yes    no         1        1  Eclectic   5500.0      yes   89000.0        119    no       no      no  big_house
    3            yes    no         0        1  Eclectic   4079.0       no   60000.0        455    no       no      no  big_house
    3            yes   yes         0        1  Eclectic   5000.0       no   82000.0        199    no       no      no  big_house
    3            yes   yes         0        1  Eclectic   2275.0      yes   52000.0        273   yes       no      no  big_house
    3            yes    no         0        1   Classic   5200.0       no   40750.0        272    no       no      no  big_house
    3            yes    no         0        1   Classic   2145.0      yes   49500.0         55    no       no      no  big_house
    492 rows X 14 columns
    2025-11-04 01:22:39,537 | INFO     | Skipping customized string manipulation.
    2025-11-04 01:22:39,537 | INFO     | Starting Customized Categorical Feature Encoding ...
    2025-11-04 01:22:41,649 | INFO     | Updated dataset sample after performing ordinal encoding:
           stories driveway  garagepl  bathrms  lotsize prefarea     price  automl_id gashw     bedrooms fullbase recroom  homestyle
    airco
    yes          2      yes         1        1   4785.0       no   48500.0        475    no  big_house      yes     yes          1
    yes          3      yes         0        2   6350.0       no   88500.0         53    no  big_house       no     yes          2
    yes          1      yes         0        1   5800.0       no   70000.0        439    no  small_house      yes     yes          2
    yes          3      yes         0        1   5200.0       no   83000.0         69    no  big_house       no      no          2
    yes          2      yes         2        1   3500.0       no   48000.0         93    no  big_house       no      no          1
    yes          4      yes         0        1   4500.0       no   88000.0         22    no  big_house       no      no          2
    yes          4      yes         2        2   7475.0       no  120000.0         56    no  big_house       no      no          0
    yes          3      yes         2        2   7420.0      yes  190000.0        376    no  big_house       no      no          0
    yes          3      yes         2        2   4100.0       no   90000.0        294    no  big_house       no      no          2
    yes          4      yes         3        1   6600.0      yes  107000.0         90    no  big_house       no      no          0
    492 rows X 14 columns
    2025-11-04 01:22:44,741 | INFO     | Updated dataset sample after performing target encoding:
                 airco driveway  stories  garagepl  bathrms  homestyle  lotsize     price  automl_id gashw     bedrooms fullbase recroom
    prefarea
    62906.335979   yes      yes        4         2        2          0   7475.0  120000.0         56    no  big_house       no      no
    62906.335979   yes      yes        4         1        2          0   6000.0  107500.0        268    no  big_house       no      no
    62906.335979   yes      yes        2         0        1          2   5040.0   60000.0        418    no  big_house      yes      no
    62906.335979   yes      yes        2         0        1          2   3000.0   52000.0         36    no  small_house       no      no
    62906.335979   yes       no        2         0        1          2   4095.0   70000.0        143    no  big_house      yes     yes
    62906.335979   yes      yes        1         0        1          1   4960.0   44000.0        145    no  small_house      yes      no
    83851.724138   yes      yes        2         0        1          0   6550.0  112500.0        322    no  big_house      yes      no
    83851.724138   yes      yes        1         0        1          2   3600.0   58550.0        213    no  big_house      yes      no
    83851.724138   yes      yes        2         0        1          2   5136.0   80000.0        309    no  big_house      yes     yes
    83851.724138   yes      yes        2         1        2          0   6600.0  130000.0        328    no  big_house      yes     yes
    492 rows X 14 columns
    2025-11-04 01:22:44,916 | INFO     | Performing encoding for categorical columns ...
    2025-11-04 01:22:47,436 | INFO     | ONE HOT Encoding these Columns:
    ['airco', 'driveway', 'gashw', 'bedrooms', 'fullbase', 'recroom']
    2025-11-04 01:22:47,437 | INFO     | Sample of dataset after performing one hot encoding:
                  airco_0  airco_1  driveway_0  driveway_1  stories  garagepl  bathrms  homestyle  lotsize     price  automl_id  gashw_0  gashw_1  bedrooms_0  bedrooms_1  fullbase_0  fullbase_1  recroom_0  recroom_1
    prefarea                                                                                                                                                     
    62906.335979        0        1           0           1        2         0        1          2   5040.0   60000.0        418        1        0           1           0           0           1          1          0
    62906.335979        0        1           1           0        2         0        1          2   4095.0   70000.0        143        1        0           1           0           0           1          0          1
    62906.335979        0        1           0           1        1         0        1          1   4960.0   44000.0        145        1        0           0           1           0           1          1          0
    62906.335979        0        1           0           1        2         0        1          2   7152.0   55000.0        243        1        0           1           0           1           0          1          0
    62906.335979        0        1           0           1        2         2        2          0   4600.0  127000.0        416        1        0           1           0           1           0          0          1
    62906.335979        0        1           0           1        2         1        2          2   8880.0   99000.0          8        1        0           1           0           0           1          1          0
    83851.724138        0        1           0           1        2         0        1          2   5136.0   80000.0        309        1        0           1           0           0           1          0          1
    83851.724138        0        1           0           1        3         0        2          2   6500.0   95000.0        435        1        0           1           0           1           0          1          0
    83851.724138        0        1           0           1        2         3        2          0   7500.0  174500.0        116        1        0           1           0           0           1          1          0
    83851.724138        0        1           0           1        1         0        1          2   6420.0   85000.0        427        1        0           1           0           0           1          1          0
    492 rows X 20 columns
    2025-11-04 01:22:47,530 | INFO     | Time taken to encode the columns: 2.61 sec
    2025-11-04 01:22:47,530 | INFO     | Starting customized mathematical transformation ...
    2025-11-04 01:22:47,530 | INFO     | Skipping customized mathematical transformation.
    2025-11-04 01:22:47,530 | INFO     | Starting customized non-linear transformation ...
    2025-11-04 01:22:47,531 | INFO     | Skipping customized non-linear transformation.
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:22:47,531 | INFO     | Data preparation started ...
    2025-11-04 01:22:47,531 | INFO     | No information provided for performing customized feature scaling. Proceeding with default option.
    2025-11-04 01:22:47,532 | INFO     | Starting customized outlier processing ...
    2025-11-04 01:22:50,825 | INFO     | Columns with outlier percentage :-
      ColumnName  OutlierPercentage
    0    lotsize           9.552846
    1  automl_id           9.756098
    2    bathrms           2.235772
    3   garagepl           2.235772
    2025-11-04 01:22:53,373 | INFO     | Sample of dataset after performing custom outlier filtering
                  airco_0  airco_1  driveway_0  driveway_1  stories  garagepl  bathrms  homestyle  lotsize     price  automl_id  gashw_0  gashw_1  bedrooms_0  bedrooms_1  fullbase_0  fullbase_1  recroom_0  recroom_1
    prefarea                                                                                                                                                     
    62906.335979        0        1           0           1        1         0        1          1   4960.0   44000.0        145        1        0           0           1           0           1          1          0
    62906.335979        0        1           0           1        2         2        2          0   4600.0  127000.0        416        1        0           1           0           1           0          0          1
    62906.335979        0        1           0           1        2         1        2          2   8880.0   99000.0          8        1        0           1           0           0           1          1          0
    62906.335979        0        1           0           1        2         1        2          0   4560.0  123500.0        171        1        0           1           0           0           1          0          1
    62906.335979        0        1           0           1        1         1        1          2   6000.0   98000.0         31        1        0           1           0           1           0          1          0
    62906.335979        0        1           0           1        1         1        1          1   2684.0   46000.0        183        1        0           0           1           1           0          1          0
    83851.724138        0        1           0           1        2         3        2          0   7500.0  174500.0        116        1        0           1           0           0           1          1          0
    83851.724138        0        1           0           1        4         1        2          0   9000.0  103500.0         88        1        0           1           0           1           0          0          1
    83851.724138        0        1           0           1        1         0        1          2   4815.0   69000.0        450        1        0           0           1           1           0          1          0
    83851.724138        0        1           0           1        1         1        2          2  10500.0   94500.0        189        1        0           1           0           0           1          1          0
    481 rows X 20 columns
    2025-11-04 01:22:54,307 | INFO     | Feature selection using rfe ...
    2025-11-04 01:23:29,352 | INFO     | feature selected by RFE:
    ['stories', 'bathrms', 'bedrooms_0', 'airco_0', 'garagepl', 'homestyle', 'fullbase_0', 'prefarea', 'lotsize']
    2025-11-04 01:23:29,354 | INFO     | Total time taken by feature selection: 35.01 sec
    2025-11-04 01:23:29,662 | INFO     | Scaling Features of rfe data ...
    2025-11-04 01:23:31,115 | INFO     | columns that will be scaled:
    ['r_stories', 'r_bathrms', 'r_garagepl', 'r_homestyle', 'r_prefarea', 'r_lotsize']
    2025-11-04 01:23:33,186 | INFO     | Dataset sample after scaling:
       r_bedrooms_0  automl_id    price  r_fullbase_0  r_airco_0  r_stories  r_bathrms  r_garagepl  r_homestyle  r_prefarea  r_lotsize
    0             0          6  54500.0             0          1  -0.924349   1.724879   -0.796258     0.734297   -0.554135  -0.940560
    1             1          8  99000.0             0          0   0.242611   1.724879    0.389501     0.734297   -0.554135   1.763806
    2             0          9  27000.0             1          1  -0.924349  -0.579751   -0.796258    -0.743515   -0.554135  -0.705049
    3             1         10  70100.0             1          1   0.242611  -0.579751    0.389501     0.734297   -0.554135  -0.444995
    4             1         12  58000.0             1          1  -0.924349  -0.579751   -0.796258     0.734297   -0.554135  -0.378920
    5             1         13  60000.0             1          1  -0.924349  -0.579751    1.575259     0.734297   -0.554135   0.310150
    6             1         11  83900.0             1          1   1.409572  -0.579751    1.575259     0.734297    1.804616   2.981478
    7             1          7  80000.0             1          1   0.242611   1.724879    0.389501     0.734297   -0.554135   2.528391
    8             0          5  50000.0             1          1  -0.924349  -0.579751    0.389501    -0.743515   -0.554135  -0.709296
    9             0          4  48000.0             1          1   0.242611  -0.579751   -0.796258    -0.743515   -0.554135  -0.482753
    481 rows X 11 columns
    2025-11-04 01:23:33,772 | INFO     | Total time taken by feature scaling: 4.11 sec
    2025-11-04 01:23:33,773 | INFO     | Scaling Features of pca data ...
    2025-11-04 01:23:35,626 | INFO     | columns that will be scaled:
    ['prefarea', 'stories', 'garagepl', 'bathrms', 'homestyle', 'lotsize']
    2025-11-04 01:23:37,813 | INFO     | Dataset sample after scaling:
       fullbase_1  bedrooms_1  driveway_0  airco_0  recroom_1  airco_1  gashw_0     price  automl_id  gashw_1  fullbase_0  driveway_1  recroom_0  bedrooms_0  prefarea   stories  garagepl   bathrms  homestyle   lotsize
    0           1           1           0        0          0        1        1   44000.0        145        0           0           1          1           0 -0.554135 -0.924349 -0.796258 -0.579751  -0.743515 -0.086301
    1           0           0           0        0          1        1        1  127000.0        416        0           1           1          0           1 -0.554135  0.242611  1.575259  1.724879  -2.221327 -0.256209
    2           1           0           0        0          0        1        1   99000.0          8        0           0           1          1           1 -0.554135  0.242611  0.389501  1.724879   0.734297  1.763806
    3           1           0           0        0          1        1        1  123500.0        171        0           0           1          0           1 -0.554135  0.242611  0.389501  1.724879  -2.221327 -0.275087
    4           0           0           0        0          0        1        1   98000.0         31        0           1           1          1           1 -0.554135 -0.924349  0.389501 -0.579751   0.734297  0.404544
    5           0           1           0        0          0        1        1   46000.0        183        0           1           1          1           0 -0.554135 -0.924349  0.389501 -0.579751  -0.743515 -1.160496
    6           1           0           0        0          0        1        1  174500.0        116        0           0           1          1           1  1.804616  0.242611  2.761017  1.724879  -2.221327  1.112493
    7           0           0           0        0          1        1        1  103500.0         88        0           1           1          0           1  1.804616  2.576532  0.389501  1.724879  -2.221327  1.820442
    8           0           1           0        0          0        1        1   69000.0        450        0           1           1          1           0  1.804616 -0.924349 -0.796258 -0.579751   0.734297 -0.154736
    9           1           0           0        0          0        1        1   94500.0        189        0           0           1          1           1  1.804616 -0.924349  0.389501  1.724879   0.734297  2.528391
    481 rows X 20 columns
    2025-11-04 01:23:38,447 | INFO     | Total time taken by feature scaling: 4.67 sec
    2025-11-04 01:23:38,448 | INFO     | Dimension Reduction using pca ...
    2025-11-04 01:23:39,107 | INFO     | PCA columns:
    ['col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_5', 'col_6', 'col_7', 'col_8', 'col_9']
    2025-11-04 01:23:39,108 | INFO     | Total time taken by PCA: 0.66 sec
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:23:39,514 | INFO     | Model Training started ...
    2025-11-04 01:23:39,558 | INFO     | Starting customized hyperparameter update ...
    2025-11-04 01:23:39,558 | INFO     | Skipping customized hyperparameter tuning
    2025-11-04 01:23:39,559 | INFO     | Hyperparameters used for model training:
    2025-11-04 01:23:39,559 | INFO     | Model: decision_forest
    2025-11-04 01:23:39,559 | INFO     | Hyperparameters: {'response_column': 'price', 'name': 'decision_forest', 'tree_type': 'Regression', 'min_impurity': (0.0, 0.1, 0.2, 0.3), 'max_depth': (5, 3, 4, 7, 8), 'min_node_size': (1, 2, 3, 4), 'num_trees': (-1,), 'seed': 42}
    2025-11-04 01:23:39,559 | INFO     | Total number of models for decision_forest: 80
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 01:23:39,560 | INFO     | Model: xgboost
    2025-11-04 01:23:39,560 | INFO     | Hyperparameters: {'response_column': 'price', 'name': 'xgboost', 'model_type': 'Regression', 'column_sampling': (1, 0.6), 'min_impurity': (0.0, 0.1, 0.2, 0.3), 'lambda1': (1.0, 1.0, 10.0, 100.0), 'shrinkage_factor': (0.5, 0.01, 0.05, 0.1), 'max_depth': (5, 3, 4, 7, 8), 'min_node_size': (1, 2, 3, 4), 'iter_num': (10, 20, 30, 40), 'num_boosted_trees': (-1, 20, 50, 100), 'seed': 42}
    2025-11-04 01:23:39,720 | INFO     | Total number of models for xgboost: 40960
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 01:23:39,721 | INFO     | Performing hyperparameter tuning ...
    2025-11-04 01:23:40,813 | INFO     | Model training for decision_forest
    2025-11-04 01:23:57,962 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 01:23:57,962 | INFO     | Model training for xgboost
    2025-11-04 01:24:17,095 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 01:24:17,097 | INFO     | Leaderboard
       RANK          MODEL_ID FEATURE_SELECTION           MAE           MSE      MSLE  ...            ME        R2        EV          MPD       MGD  ADJUSTED_R2
    0     1         XGBOOST_0               rfe   7711.646791  1.006034e+08  0.024802  ...  32571.392060  0.868742  0.871048  1386.666530  0.023267     0.866233
    1     2  DECISIONFOREST_2               rfe   8570.760495  1.165783e+08  0.025961  ...  34000.000000  0.847899  0.848303  1556.243777  0.025133     0.844992
    2     3         XGBOOST_1               pca   8397.288359  1.461547e+08  0.033499  ...  53787.874836  0.809310  0.812617  1957.464971  0.030490     0.805253
    3     4  DECISIONFOREST_0               rfe   9012.190907  1.518862e+08  0.027755  ...  55000.000000  0.801832  0.802970  1807.795304  0.026938     0.798046
    4     5         XGBOOST_2               rfe  10662.822187  2.708815e+08  0.040123  ...  82415.914398  0.646578  0.652841  3060.543638  0.040770     0.639824
    5     6  DECISIONFOREST_3               pca  10714.325844  2.792791e+08  0.048732  ...  72750.000000  0.635621  0.636265  3329.871923  0.045563     0.627868
    6     7         XGBOOST_3               pca  11629.208492  3.193868e+08  0.047100  ...  81429.175364  0.583292  0.600407  3646.398180  0.048086     0.574426
    7     8  DECISIONFOREST_1               pca  12298.340849  3.915081e+08  0.064405  ...  72750.000000  0.489194  0.494876  4503.740377  0.058413     0.478326
    8     9         XGBOOST_4               rfe  14526.292231  4.623473e+08  0.077264  ...  94901.156086  0.396770  0.401454  5659.836390  0.078339     0.385243
    [9 rows x 16 columns]
    9 rows X 16 columns
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    >>> Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 18/18
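The model counts reported in the training log above (80 for decision_forest, 40960 for xgboost) equal the product of the number of candidate values per hyperparameter. The following is a minimal sketch in plain Python, not a teradataml API; the helper name `grid_size` is hypothetical, and the tuples are copied from the logged "Hyperparameters:" lines.

```python
from math import prod

# Search spaces copied from the "Hyperparameters:" log lines.
# Only the tuple-valued (tunable) entries are kept; scalars such as
# 'seed' or 'response_column' do not add combinations.
decision_forest_space = {
    "min_impurity": (0.0, 0.1, 0.2, 0.3),
    "max_depth": (5, 3, 4, 7, 8),
    "min_node_size": (1, 2, 3, 4),
    "num_trees": (-1,),
}

xgboost_space = {
    "column_sampling": (1, 0.6),
    "min_impurity": (0.0, 0.1, 0.2, 0.3),
    "lambda1": (1.0, 1.0, 10.0, 100.0),
    "shrinkage_factor": (0.5, 0.01, 0.05, 0.1),
    "max_depth": (5, 3, 4, 7, 8),
    "min_node_size": (1, 2, 3, 4),
    "iter_num": (10, 20, 30, 40),
    "num_boosted_trees": (-1, 20, 50, 100),
}


def grid_size(space):
    """Hypothetical helper: number of combinations in a full grid."""
    return prod(len(values) for values in space.values())


df_models = grid_size(decision_forest_space)
xgb_models = grid_size(xgboost_space)
print(df_models, xgb_models)  # 80 40960
```

The xgboost total is reproduced only when the repeated `1.0` in `lambda1` is counted as two entries. The leaderboard lists 9 trained models, far fewer than either grid size.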
  5. Display model leaderboard.
    >>> aml.leaderboard()
       RANK          MODEL_ID FEATURE_SELECTION           MAE           MSE      MSLE  ...            ME        R2        EV          MPD       MGD  ADJUSTED_R2
    0     1         XGBOOST_0               rfe   7711.646791  1.006034e+08  0.024802  ...  32571.392060  0.868742  0.871048  1386.666530  0.023267     0.866233
    1     2  DECISIONFOREST_2               rfe   8570.760495  1.165783e+08  0.025961  ...  34000.000000  0.847899  0.848303  1556.243777  0.025133     0.844992
    2     3         XGBOOST_1               pca   8397.288359  1.461547e+08  0.033499  ...  53787.874836  0.809310  0.812617  1957.464971  0.030490     0.805253
    3     4  DECISIONFOREST_0               rfe   9012.190907  1.518862e+08  0.027755  ...  55000.000000  0.801832  0.802970  1807.795304  0.026938     0.798046
    4     5         XGBOOST_2               rfe  10662.822187  2.708815e+08  0.040123  ...  82415.914398  0.646578  0.652841  3060.543638  0.040770     0.639824
    5     6  DECISIONFOREST_3               pca  10714.325844  2.792791e+08  0.048732  ...  72750.000000  0.635621  0.636265  3329.871923  0.045563     0.627868
    6     7         XGBOOST_3               pca  11629.208492  3.193868e+08  0.047100  ...  81429.175364  0.583292  0.600407  3646.398180  0.048086     0.574426
    7     8  DECISIONFOREST_1               pca  12298.340849  3.915081e+08  0.064405  ...  72750.000000  0.489194  0.494876  4503.740377  0.058413     0.478326
    8     9         XGBOOST_4               rfe  14526.292231  4.623473e+08  0.077264  ...  94901.156086  0.396770  0.401454  5659.836390  0.078339     0.385243
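The ADJUSTED_R2 column can be cross-checked against R2 with the standard adjusted R-squared formula. The sketch below assumes that the metric is computed over the 481 training rows with 9 predictors for rfe models (the RFE-selected list) and 10 predictors for pca models (col_0 through col_9); those counts are read from the training log, and whether teradataml uses exactly this formula is an assumption. The function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2, n_rows, n_features):
    """Standard adjusted R-squared: penalizes R2 by the number of predictors."""
    return 1.0 - (1.0 - r2) * (n_rows - 1) / (n_rows - n_features - 1)


N_ROWS = 481        # "481 rows X ..." in the training output
RFE_FEATURES = 9    # length of the RFE-selected feature list
PCA_FEATURES = 10   # col_0 ... col_9

# R2 values copied from the leaderboard.
xgboost_0 = adjusted_r2(0.868742, N_ROWS, RFE_FEATURES)   # rank 1, rfe
xgboost_1 = adjusted_r2(0.809310, N_ROWS, PCA_FEATURES)   # rank 3, pca

print(f"XGBOOST_0: {xgboost_0:.6f}")
print(f"XGBOOST_1: {xgboost_1:.6f}")
```

Under these assumptions the results agree with the leaderboard values 0.866233 and 0.805253 to within rounding of the displayed R2.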
    
  6. Display the best performing model.
    >>> aml.leader()
       RANK   MODEL_ID FEATURE_SELECTION          MAE           MSE      MSLE  ...           ME        R2        EV         MPD       MGD  ADJUSTED_R2
    0     1  XGBOOST_0               rfe  7711.646791  1.006034e+08  0.024802  ...  32571.39206  0.868742  0.871048  1386.66653  0.023267     0.866233
  7. Display hyperparameters of trained models.
    1. Display model hyperparameters for rank 2.
      >>> aml.model_hyperparameters(rank=2)
      {'response_column': 'price', 
        'name': 'decision_forest', 
        'tree_type': 'Regression', 
        'min_impurity': 0.0, 
        'max_depth': 5, 
        'min_node_size': 2, 
        'num_trees': -1, 
        'seed': 42, 
        'persist': False}
      
    2. Display model hyperparameters for rank 3.
      >>> aml.model_hyperparameters(rank=3)
      {'response_column': 'price', 
        'name': 'xgboost', 
        'model_type': 'Regression', 
        'column_sampling': 1, 
        'min_impurity': 0.0, 
        'lambda1': 1.0, 
        'shrinkage_factor': 0.5, 
        'max_depth': 5, 
        'min_node_size': 1, 
        'iter_num': 10, 
        'num_boosted_trees': -1, 
        'seed': 42, 
        'persist': False}
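Each dictionary returned above is one point from the search space logged at the start of model training. As a sanity check, the sketch below verifies in plain Python that every tuned value for ranks 2 and 3 is one of the logged candidates. The helper name `outside_space` is hypothetical and not part of teradataml; the values are copied from the outputs in this example.

```python
# Candidate values copied from the "Hyperparameters:" lines of the training log.
decision_forest_space = {
    "min_impurity": (0.0, 0.1, 0.2, 0.3),
    "max_depth": (5, 3, 4, 7, 8),
    "min_node_size": (1, 2, 3, 4),
    "num_trees": (-1,),
}
xgboost_space = {
    "column_sampling": (1, 0.6),
    "min_impurity": (0.0, 0.1, 0.2, 0.3),
    "lambda1": (1.0, 1.0, 10.0, 100.0),
    "shrinkage_factor": (0.5, 0.01, 0.05, 0.1),
    "max_depth": (5, 3, 4, 7, 8),
    "min_node_size": (1, 2, 3, 4),
    "iter_num": (10, 20, 30, 40),
    "num_boosted_trees": (-1, 20, 50, 100),
}

# Tuned values copied from aml.model_hyperparameters(rank=2) and (rank=3).
rank2_params = {"min_impurity": 0.0, "max_depth": 5,
                "min_node_size": 2, "num_trees": -1}
rank3_params = {"column_sampling": 1, "min_impurity": 0.0, "lambda1": 1.0,
                "shrinkage_factor": 0.5, "max_depth": 5, "min_node_size": 1,
                "iter_num": 10, "num_boosted_trees": -1}


def outside_space(params, space):
    """Hypothetical helper: names whose value is missing or not a logged candidate."""
    return [name for name, candidates in space.items()
            if params.get(name) not in candidates]


print(outside_space(rank2_params, decision_forest_space))  # []
print(outside_space(rank3_params, xgboost_space))          # []
```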
      
  8. Generate predictions on the test dataset using the best performing model.
    >>> prediction = aml.predict(housing_test)
    2025-11-04 01:28:14,601 | INFO     | Data Transformation started ...
    2025-11-04 01:28:14,601 | INFO     | Performing transformation carried out in feature engineering phase ...
    2025-11-04 01:28:16,160 | INFO     | Updated dataset after performing customized variable width bin-code transformation :
            driveway airco  garagepl  bathrms homestyle  lotsize prefarea    price  automl_id gashw fullbase recroom   sn     bedrooms
    stories
    3            yes    no         0        1  Eclectic   6420.0      yes  87500.0         53    no      yes      no  408  big_house
    1            yes   yes         1        1  Eclectic   5885.0       no  64000.0         28    no       no      no  306  small_house
    1            yes   yes         0        1  Eclectic   6825.0      yes  77500.0         32    no      yes     yes  403  big_house
    1            yes    no         0        2  Eclectic   4100.0       no  64900.0         24    no      yes     yes  274  small_house
    1            yes    no         2        1   Classic   3450.0       no  48500.0         17    no      yes      no  251  big_house
    1            yes    no         2        1  Eclectic   3520.0      yes  51900.0         45    no       no      no  441  big_house
    1            yes    no         0        1   Classic   6000.0       no  41000.0         14    no       no      no  260  small_house
    1            yes    no         2        1  Eclectic   7980.0       no  78500.0         49    no       no      no  353  big_house
    1            yes    no         0        1  Eclectic   2787.0      yes  60500.0         19    no      yes      no  472  big_house
    1            yes    no         1        1  Eclectic   9000.0      yes  90000.0         25    no      yes      no  411  big_house
    46 rows X 15 columns
    2025-11-04 01:28:17,829 | INFO     | Updated dataset after performing customized categorical encoding :
                 airco driveway  stories  garagepl  bathrms  homestyle  lotsize    price  automl_id gashw     bedrooms fullbase recroom   sn
    prefarea
    62906.335979   yes      yes        2         1        1          2   3162.0  63900.0         33    no  big_house       no      no  161
    62906.335979    no      yes        1         0        1          2   4080.0  55000.0          5    no  small_house       no      no  301
    62906.335979    no      yes        1         2        1          1   3450.0  48500.0         17    no  big_house      yes      no  251
    62906.335979    no      yes        1         2        1          1   3000.0  26000.0         39    no  small_house      yes      no  239
    62906.335979    no       no        1         0        1          1   5076.0  43000.0         40    no  big_house       no      no  111
    62906.335979    no       no        1         0        1          1   3970.0  32500.0         20    no  small_house       no      no  234
    83851.724138   yes      yes        1         0        1          2   6825.0  77500.0         32    no  big_house      yes     yes  403
    83851.724138    no      yes        2         0        1          2   2176.0  55000.0          4    no  small_house       no     yes  469
    83851.724138    no      yes        2         0        1          1   2610.0  49000.0          6    no  big_house      yes      no  463
    83851.724138    no      yes        1         0        1          1   2398.0  44555.0         13    no  big_house       no      no  459
    46 rows X 15 columns
    2025-11-04 01:28:19,042 | INFO     | Updated dataset after performing categorical encoding :
                  airco_0  airco_1  driveway_0  driveway_1  stories  garagepl  bathrms  homestyle  lotsize    price  automl_id  gashw_0  gashw_1  bedrooms_0  bedrooms_1  fullbase_0  fullbase_1  recroom_0  recroom_1   sn
    prefarea                                                                                                                                                     
    62906.335979        1        0           0           1        1         2        1          1   3450.0  48500.0         17        1        0           1           0           0           1          1          0  251
    62906.335979        1        0           1           0        1         0        1          1   5076.0  43000.0         40        1        0           1           0           1           0          1          0  111
    62906.335979        1        0           1           0        1         0        1          1   3970.0  32500.0         20        1        0           0           1           1           0          1          0  234
    62906.335979        1        0           0           1        4         0        1          2   5000.0  80000.0         51        1        0           1           0           1           0          1          0  317
    62906.335979        1        0           0           1        2         1        1          1   2650.0  40000.0         29        1        0           1           0           0           1          1          0  142
    62906.335979        1        0           0           1        2         0        1          2   4360.0  61000.0         10        1        0           1           0           1           0          1          0  255
    83851.724138        1        0           0           1        2         0        1          1   2610.0  49000.0          6        1        0           1           0           0           1          1          0  463
    83851.724138        1        0           0           1        1         2        1          2   3520.0  51900.0         45        1        0           1           0           1           0          1          0  441
    83851.724138        1        0           0           1        1         1        1          2   9000.0  90000.0         25        1        0           1           0           0           1          1          0  411
    83851.724138        1        0           0           1        3         0        1          2   6420.0  87500.0         53        1        0           1           0           0           1          1          0  408
    46 rows X 21 columns
    2025-11-04 01:28:19,809 | INFO     | Updated dataset after performing customized anti-selection :
           prefarea  airco_0  airco_1  driveway_0  driveway_1  stories  garagepl  bathrms  homestyle  lotsize    price  automl_id  gashw_0  gashw_1  bedrooms_0  bedrooms_1  fullbase_0  fullbase_1  recroom_0  recroom_1
    0  83851.724138        1        0           0           1        1         1        1          2   9000.0  90000.0         25        1        0           1           0           0           1          1          0
    1  83851.724138        1        0           0           1        1         0        1          2   2787.0  60500.0         19        1        0           1           0           0           1          1          0
    2  83851.724138        0        1           0           1        2         2        1          2   6862.0  69000.0         15        1        0           1           0           1           0          1          0
    3  83851.724138        0        1           0           1        1         2        1          2   7410.0  92500.0         36        1        0           1           0           0           1          0          1
    4  83851.724138        1        0           0           1        1         0        1          1   2398.0  44555.0         13        1        0           1           0           1           0          1          0
    5  83851.724138        1        0           0           1        2         0        1          2   2176.0  55000.0          4        1        0           0           1           1           0          0          1
    6  62906.335979        0        1           0           1        2         1        1          2   3162.0  63900.0         33        1        0           1           0           1           0          1          0
    7  62906.335979        1        0           0           1        1         0        1          2   4080.0  55000.0          5        1        0           0           1           1           0          1          0
    8  62906.335979        1        0           0           1        1         2        1          1   3450.0  48500.0         17        1        0           1           0           0           1          1          0
    9  62906.335979        1        0           0           1        1         2        1          1   3000.0  26000.0         39        1        0           0           1           0           1          1          0
    46 rows X 20 columns
    2025-11-04 01:28:20,104 | INFO     | Performing transformation carried out in data preparation phase ...
    2025-11-04 01:28:20,933 | INFO     | Updated dataset after performing RFE feature selection:
       automl_id  stories  bathrms  bedrooms_0  airco_0  garagepl  homestyle  fullbase_0   prefarea  lotsize    price
    0         29        2        1           1        1         1          1           0  62906.336   2650.0  40000.0
    1         23        2        2           1        1         1          2           0  62906.336   2817.0  78500.0
    2         34        2        2           1        1         0          2           1  62906.336   4300.0  86900.0
    3         27        2        1           1        1         0          1           1  62906.336   3750.0  43000.0
    4          7        2        1           1        1         0          2           1  62906.336   5400.0  70000.0
    5         12        2        1           1        1         0          2           0  62906.336  10700.0  72000.0
    6         20        1        1           0        1         0          1           1  62906.336   3970.0  32500.0
    7         24        1        2           0        1         0          2           0  62906.336   4100.0  64900.0
    8         59        1        1           0        1         0          1           1  62906.336   4040.0  47000.0
    9         21        1        1           0        1         0          1           1  62906.336   3500.0  44500.0
    46 rows X 11 columns
    2025-11-04 01:28:21,773 | INFO     | Updated dataset after performing scaling on RFE selected features :
       r_bedrooms_0  automl_id    price  r_fullbase_0  r_airco_0  r_stories  r_bathrms  r_garagepl  r_homestyle  r_prefarea  r_lotsize
    0             1         29  40000.0             0          1   0.242611  -0.579751    0.389501    -0.743515   -0.554135  -1.176543
    1             1         23  78500.0             0          1   0.242611   1.724879    0.389501     0.734297   -0.554135  -1.097724
    2             1         34  86900.0             1          1   0.242611   1.724879   -0.796258     0.734297   -0.554135  -0.397799
    3             1         27  43000.0             1          1   0.242611  -0.579751   -0.796258    -0.743515   -0.554135  -0.657380
    4             1          7  70000.0             1          1   0.242611  -0.579751   -0.796258     0.734297   -0.554135   0.121364
    5             1         12  72000.0             0          1   0.242611  -0.579751   -0.796258     0.734297   -0.554135   2.622784
    6             0         20  32500.0             1          1  -0.924349  -0.579751   -0.796258    -0.743515   -0.554135  -0.553547
    7             0         24  64900.0             0          1  -0.924349   1.724879   -0.796258     0.734297   -0.554135  -0.492192
    8             0         59  47000.0             1          1  -0.924349  -0.579751   -0.796258    -0.743515   -0.554135  -0.520510
    9             0         21  44500.0             1          1  -0.924349  -0.579751   -0.796258    -0.743515   -0.554135  -0.775371
    46 rows X 11 columns
    2025-11-04 01:28:23,133 | INFO     | Updated dataset after performing scaling for PCA feature selection :
       fullbase_1  bedrooms_1  driveway_0  airco_0  recroom_1  airco_1  gashw_0    price  automl_id  gashw_1  fullbase_0  driveway_1  recroom_0  bedrooms_0  prefarea   stories  garagepl   bathrms  homestyle   lotsize
    0           0           1           0        1          0        0        1  47000.0         59        0           1           1          1           0 -0.554135 -0.924349 -0.796258 -0.579751  -0.743515 -0.520510
    1           0           1           0        0          0        1        1  37900.0         37        0           1           1          1           0 -0.554135 -0.924349 -0.796258 -0.579751  -0.743515 -0.924041
    2           1           1           0        0          0        1        1  68000.0         11        0           0           1          1           0 -0.554135 -0.924349  1.575259 -0.579751   0.734297  1.898788
    3           0           1           0        1          0        0        1  42000.0         44        0           1           1          1           0 -0.554135 -0.924349 -0.796258 -0.579751  -0.743515 -0.086301
    4           0           1           0        1          1        0        1  55000.0          4        0           1           1          0           0  1.804616  0.242611 -0.796258 -0.579751   0.734297 -1.400254
    5           0           1           0        1          0        0        1  41000.0         14        0           1           1          1           0 -0.554135 -0.924349 -0.796258 -0.579751  -0.743515  0.404544
    6           0           0           1        1          0        0        1  43000.0         40        0           1           0          1           1 -0.554135 -0.924349 -0.796258 -0.579751  -0.743515 -0.031553
    7           0           0           0        1          0        0        1  62500.0         18        0           1           1          1           1 -0.554135  0.242611 -0.796258 -0.579751   0.734297 -0.586585
    8           1           0           0        1          0        0        1  40000.0         29        0           0           1          1           1 -0.554135  0.242611  0.389501 -0.579751  -0.743515 -1.176543
    9           0           0           0        1          0        0        1  61000.0         10        0           1           1          1           1 -0.554135  0.242611 -0.796258 -0.579751   0.734297 -0.369481
    46 rows X 20 columns
    2025-11-04 01:28:23,602 | INFO     | Updated dataset after performing PCA feature selection :
       automl_id     col_0     col_1     col_2     col_3     col_4     col_5     col_6     col_7     col_8     col_9    price
    0         20 -1.781932  0.357389 -0.804165  0.781302  0.423531  0.385514 -0.392667 -0.173088  0.116668 -0.422125  32500.0
    1         40 -1.237917  0.311520 -0.691958  0.630772  0.148798  0.582022 -0.088701  0.240978  1.212262 -0.546010  43000.0
    2         24 -0.537790 -0.121912  0.965847 -0.607395  2.236753  0.806179  0.160764  0.209506 -1.265151 -0.373994  64900.0
    3         18 -1.134034  0.386400  0.555704 -0.504103 -0.711792  0.072424 -0.284142  0.277344  0.185648  0.015771  62500.0
    4         59 -1.595203  0.216056 -0.793669  0.752214  0.263594  0.451237 -0.432608 -0.143301 -0.539837  0.138279  47000.0
    5         29 -0.620205  0.622530 -0.446559  0.092160  0.180652 -1.147594  0.789457  0.734782  0.044372  0.733643  40000.0
    6         21 -1.717824  0.310649 -0.738037  0.754847  0.325761  0.275516 -0.430316 -0.180066 -0.561117  0.133578  44500.0
    7         10 -1.029580  0.305820  0.508314 -0.506346 -0.764749  0.222112 -0.286095  0.308663  0.203776  0.019775  61000.0
    8         37 -1.491236  0.481402 -0.667688  0.699467  0.245828  0.252861  0.148402 -1.429806 -0.468139  0.210214  37900.0
    9         23  0.199440  0.655693  0.973420 -1.287750  1.822717 -0.705320  0.523943  0.663558  0.058868 -1.033697  78500.0
    10 rows X 12 columns
    2025-11-04 01:28:23,901 | INFO     | Data Transformation completed.
    2025-11-04 01:28:24,428 | INFO     | Following model is being picked for evaluation:
    2025-11-04 01:28:24,429 | INFO     | Model ID : XGBOOST_0
    2025-11-04 01:28:24,430 | INFO     | Feature Selection Method : rfe
    2025-11-04 01:28:24,960 | INFO     | Applying SHAP for Model Interpretation...
    2025-11-04 01:28:26,853 | INFO     | SHAP Analysis Completed. Feature Importance Available.
    /root/automl_testing/pyTeradata/teradataml/automl/model_evaluation.py:380: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
      plt.show()
    2025-11-04 01:28:29,243 | INFO     | Prediction :
       automl_id    Prediction  Confidence_Lower  Confidence_upper    price
    0         59  44644.431089      44644.431089      44644.431089  47000.0
    1         37  46045.859757      46045.859757      46045.859757  37900.0
    2         11  75321.783121      75321.783121      75321.783121  68000.0
    3         44  42945.364286      42945.364286      42945.364286  42000.0
    4          4  55707.416681      55707.416681      55707.416681  55000.0
    5         14  42619.512243      42619.512243      42619.512243  41000.0
    6         40  39995.145103      39995.145103      39995.145103  43000.0
    7         18  62810.532437      62810.532437      62810.532437  62500.0
    8         29  40034.885289      40034.885289      40034.885289  40000.0
    9         10  62810.532437      62810.532437      62810.532437  61000.0
    >>> prediction.head()
       automl_id    Prediction  Confidence_Lower  Confidence_upper    price
    0         29  40034.885289      40034.885289      40034.885289  40000.0
    1         23  65274.280393      65274.280393      65274.280393  78500.0
    2         34  69785.116206      69785.116206      69785.116206  86900.0
    3         27  43980.181898      43980.181898      43980.181898  43000.0
    4          7  61245.505123      61245.505123      61245.505123  70000.0
    5         12  76062.036134      76062.036134      76062.036134  72000.0
    6         20  42655.233172      42655.233172      42655.233172  32500.0
    7         24  61885.319803      61885.319803      61885.319803  64900.0
    8         59  44644.431089      44644.431089      44644.431089  47000.0
    9         21  41295.905196      41295.905196      41295.905196  44500.0
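To get a feel for the size of the errors, the mean absolute error can be computed by hand over the ten rows displayed by prediction.head(). This is an illustrative sketch in plain Python with values copied from the output above; the variable names are hypothetical. Because these ten rows are only a sample of the 46 test rows, the result need not match the MAE reported by evaluate().

```python
# (Prediction, price) pairs copied from the prediction.head() output.
rows = [
    (40034.885289, 40000.0),
    (65274.280393, 78500.0),
    (69785.116206, 86900.0),
    (43980.181898, 43000.0),
    (61245.505123, 70000.0),
    (76062.036134, 72000.0),
    (42655.233172, 32500.0),
    (61885.319803, 64900.0),
    (44644.431089, 47000.0),
    (41295.905196, 44500.0),
]

# Absolute error per row, then the mean over the displayed sample.
abs_errors = [abs(predicted - actual) for predicted, actual in rows]
sample_mae = sum(abs_errors) / len(abs_errors)

# Row with the largest absolute error in the sample.
worst_predicted, worst_actual = max(rows, key=lambda r: abs(r[0] - r[1]))

print(f"MAE over displayed rows: {sample_mae:.2f}")
print(f"Largest miss: predicted {worst_predicted:.2f}, actual {worst_actual:.2f}")
```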
  9. Generate evaluation metrics on the test dataset using the best performing model.
    >>> performance_metrics = aml.evaluate(housing_test)
    2025-11-04 01:30:44,506 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 01:30:45,050 | INFO     | Following model is being picked for evaluation:
    2025-11-04 01:30:45,051 | INFO     | Model ID : XGBOOST_0
    2025-11-04 01:30:45,051 | INFO     | Feature Selection Method : rfe
    2025-11-04 01:30:46,408 | INFO     | Performance Metrics :
               MAE           MSE      MSLE       MAPE      MPE         RMSE    RMSLE            ME       R2        EV          MPD       MGD
    0  6215.716247  7.214750e+07  0.021857  11.161124 -3.35438  8493.968396  0.14784  29444.576968  0.77593  0.776068  1173.996768  0.021215
    >>> performance_metrics
               MAE           MSE      MSLE       MAPE      MPE         RMSE    RMSLE            ME       R2        EV          MPD       MGD
    0  6215.716247  7.214750e+07  0.021857  11.161124 -3.35438  8493.968396  0.14784  29444.576968  0.77593  0.776068  1173.996768  0.021215
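Two of the reported metrics are square roots of two others: RMSE is the square root of MSE, and RMSLE is the square root of MSLE. The sketch below checks this relation on the values copied from the output above, using only the Python standard library. Small differences are expected because the displayed MSE and MSLE are rounded.

```python
import math

# Values copied from the performance_metrics output.
metrics = {
    "MSE": 7.214750e+07,
    "RMSE": 8493.968396,
    "MSLE": 0.021857,
    "RMSLE": 0.14784,
}

rmse_from_mse = math.sqrt(metrics["MSE"])
rmsle_from_msle = math.sqrt(metrics["MSLE"])

print(f"sqrt(MSE)  = {rmse_from_mse:.3f}  (reported RMSE  = {metrics['RMSE']})")
print(f"sqrt(MSLE) = {rmsle_from_msle:.5f}  (reported RMSLE = {metrics['RMSLE']})")
```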