Example 1: Run AutoML for Regression Problem with Early Stopping Timer and Metrics Threshold

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2024
Language: English (United States)
Last Update: 2024-04-09
Product Category: Teradata Vantage

This example predicts the price of a house based on different factors.

Run AutoML to get the best performing model with the following specifications:
  • Set the early stopping criteria: a time limit of 300 seconds and an R2 performance metric threshold of 0.7.
  • Exclude the ‘knn’ model from the default model training list.
  • Set the verbose level to 2 to get detailed logging.
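The two early stopping criteria can be pictured as a loop that evaluates candidate models until either the time budget is used up or a model reaches the metric threshold. The following is a minimal sketch of that idea in plain Python. It is not teradataml's internal implementation; `run_with_early_stopping`, the `clock` parameter, and the toy scores are hypothetical names and values used only for illustration.

```python
import time

def run_with_early_stopping(candidates, max_runtime_secs, threshold,
                            clock=time.monotonic):
    """Evaluate candidates in order; stop on time limit or metric threshold.

    candidates: iterable of (name, train_fn), where train_fn() returns a score
    (higher is better, as with R2).
    Returns (best_name, best_score, reason).
    """
    start = clock()
    best_name, best_score = None, float("-inf")
    for name, train_fn in candidates:
        # Criterion 1: stop before starting a new model if time has run out.
        if clock() - start >= max_runtime_secs:
            return best_name, best_score, "time limit reached"
        score = train_fn()
        if score > best_score:
            best_name, best_score = name, score
        # Criterion 2: stop as soon as the best score meets the threshold.
        if best_score >= threshold:
            return best_name, best_score, "metric threshold reached"
    return best_name, best_score, "all candidates evaluated"

# Toy candidates with made-up scores (hypothetical, for illustration only).
toy_candidates = [
    ("model_a", lambda: 0.55),
    ("model_b", lambda: 0.72),
    ("model_c", lambda: 0.90),
]
result = run_with_early_stopping(toy_candidates, max_runtime_secs=300, threshold=0.7)
print(result)
```

In this toy run the loop stops at `model_b`, because 0.72 already meets the 0.7 threshold, so `model_c` is never evaluated.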
  1. Load the example dataset.
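    >>> # Assumes a connection to Vantage is already established, for example with create_context().
    >>> from teradataml import AutoML, DataFrame, load_example_data
    >>> load_example_data("decisionforestpredict", ["housing_train", "housing_test"])
    >>> housing_train = DataFrame.from_table("housing_train")
    >>> housing_test = DataFrame.from_table("housing_test")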
  2. Create an AutoML instance.
    >>> aml = AutoML(task_type="Regression",
                     exclude=['knn'],
                     verbose=2,
                     max_runtime_secs=300,
                     stopping_metric='R2',
                     stopping_tolerance=0.7)
  3. Fit the data.
    >>> aml.fit(housing_train, housing_train.price)
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    Feature Exploration started ...
    Column Summary:
    ColumnName    Datatype    NonNullCount    NullCount    BlankCount    ZeroCount    PositiveCount    NegativeCount    NullPercentage    NonNullPercentage
    sn    INTEGER    492    0    None    0    492    0    0.0    100.0
    recroom    VARCHAR(10) CHARACTER SET LATIN    492    0    0    None    None    None    0.0    100.0
    fullbase    VARCHAR(10) CHARACTER SET LATIN    492    0    0    None    None    None    0.0    100.0
    bedrooms    INTEGER    492    0    None    0    492    0    0.0    100.0
    bathrms    INTEGER    492    0    None    0    492    0    0.0    100.0
    stories    INTEGER    492    0    None    0    492    0    0.0    100.0
    price    FLOAT    492    0    None    0    492    0    0.0    100.0
    lotsize    FLOAT    492    0    None    0    492    0    0.0    100.0
    homestyle    VARCHAR(20) CHARACTER SET LATIN    492    0    0    None    None    None    0.0    100.0
    driveway    VARCHAR(10) CHARACTER SET LATIN    492    0    0    None    None    None    0.0    100.0
    Statistics of Data:
    func    sn    price    lotsize    bedrooms    bathrms    stories    garagepl
    min    1    25000    1650    1    1    1    0
    std    159.501    26472.496    2182.443    0.731    0.51    0.861    0.854
    25%    132.5    49975    3600    2    1    1    0
    50%    274    62000    4616    3    1    2    0
    75%    413.25    82000    6370    3    2    2    1
    max    546    190000    16200    6    4    4    3
    mean    272.943    68100.396    5181.795    2.965    1.293    1.803    0.685
    count    492    492    492    492    492    492    492
    
    Categorical Columns with their Distinct values:
    ColumnName                DistinctValueCount
    driveway                  2         
    recroom                   2         
    fullbase                  2         
    gashw                     2         
    airco                     2         
    prefarea                  2         
    homestyle                 3         
    
    No Futile columns found.
    
    Target Column Distribution:
    
    Columns with outlier percentage :-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
      ColumnName  OutlierPercentage
    0   bedrooms           2.235772
    1    lotsize           2.235772
    2    stories           7.113821
    3      price           2.439024
    4   garagepl           2.235772
    5    bathrms           0.203252
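The outlier percentages in the table above are consistent with whole-row counts divided by the 492 rows reported in the column summary. The log does not show how outliers are flagged, so the sketch below only checks the arithmetic. Using 492 as the denominator is an assumption, and `implied_outlier_rows` is a hypothetical helper.

```python
# Outlier percentages as printed in the log above.
reported = {
    "bedrooms": 2.235772,
    "lotsize": 2.235772,
    "stories": 7.113821,
    "price": 2.439024,
    "garagepl": 2.235772,
    "bathrms": 0.203252,
}
TOTAL_ROWS = 492  # NonNullCount from the column summary (assumed denominator)

def implied_outlier_rows(pct, total_rows):
    """Whole-row count implied by a percentage of total_rows."""
    return round(pct * total_rows / 100)

implied = {col: implied_outlier_rows(pct, TOTAL_ROWS) for col, pct in reported.items()}

# Recompute each percentage from the implied count, rounded to 6 decimals
# like the log, to confirm that it reproduces the printed value.
recomputed = {col: round(rows / TOTAL_ROWS * 100, 6) for col, rows in implied.items()}

for col, rows in implied.items():
    print(f"{col}: {rows} of {TOTAL_ROWS} rows -> {recomputed[col]}%")
```

Under this assumption, `stories` has 35 flagged rows, `price` has 12, and `bathrms` has 1.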
                                                                                            
    
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
                                                                                            
    Feature Engineering started ...
                                                                                            
    Handling duplicate records present in dataset ...
                                                                                            
    Updated dataset after removing duplicate records:
    sn    price    lotsize    bedrooms    bathrms    stories    driveway    recroom    fullbase    gashw    airco    garagepl    prefarea    homestyle
    265    50000.0    3640.0    2    1    1    yes    no    no    no    no    1    no    Classic
    427    49500.0    5320.0    2    1    1    yes    no    no    no    no    1    yes    Classic
    223    70100.0    4200.0    3    1    2    yes    no    no    no    no    1    no    Eclectic
    122    80000.0    10500.0    4    2    2    yes    no    no    no    no    1    no    Eclectic
    80    63900.0    6360.0    2    1    1    yes    no    yes    no    yes    1    no    Eclectic
    345    88000.0    4500.0    3    1    4    yes    no    no    no    yes    0    no    Eclectic
    183    58000.0    4340.0    3    1    1    yes    no    no    no    no    0    no    Eclectic
    509    87000.0    8372.0    3    1    3    yes    no    no    no    yes    2    no    Eclectic
    326    99000.0    8880.0    3    2    2    yes    no    yes    no    yes    1    no    Eclectic
    305    60000.0    5800.0    3    1    1    yes    no    no    yes    no    2    no    Eclectic
                                                                                            
    Handling less significant features from data ...
    All categorical columns seem to be significant.                                         
                                                                                            
    Total time to handle less significant features: 14.29 sec
                                                                                             
    Handling Date Features ...
    Dataset does not contain any feature related to dates.                                   
                                                                                             
    Total time to handle date features: 0.00 sec
                                                                                             
    Checking Missing values in dataset ...
    No Missing Value Detected.                                                               
                                                                                             
    Total time to find missing values in data: 6.46 sec
                                                                                             
    Imputing Missing Values ...
    No imputation is Required.                                                               
                                                                                             
    Time taken to perform imputation: 0.01 sec
                                                                                             
    Performing encoding for categorical columns ...
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710262133653549"'
                                                                                             
    ONE HOT Encoding these Columns:
    ['driveway', 'recroom', 'fullbase', 'gashw', 'airco', 'prefarea', 'homestyle']
                                                                                             
    Time taken to encode the columns: 10.77 sec
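One-hot encoding replaces each categorical column with one 0/1 indicator column per distinct value, which is why `driveway` later appears as `driveway_0` and `driveway_1`. The following is a minimal plain-Python sketch of the idea, not the Vantage function that AutoML calls. `one_hot` is a hypothetical helper, and numbering the indicator columns by sorted category value is an assumption, although it matches the yes/no columns shown in this log.

```python
def one_hot(rows, column):
    """Replace `column` in each row dict with 0/1 indicator columns.

    Indicator columns are named <column>_<i>, where i is the position of the
    category in sorted order (an assumption made for this sketch).
    """
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for i, category in enumerate(categories):
            new_row[f"{column}_{i}"] = int(row[column] == category)
        encoded.append(new_row)
    return encoded

# 'gashw' values for two rows taken from the deduplicated dataset above.
rows = [
    {"sn": 183, "gashw": "no"},
    {"sn": 305, "gashw": "yes"},
]
encoded = one_hot(rows, "gashw")
for row in encoded:
    print(row)
```

For these two rows the sketch yields `gashw_0=1, gashw_1=0` for `sn` 183 and `gashw_0=0, gashw_1=1` for `sn` 305, which agrees with the training data printed later in the log.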
                                                                                             
    
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
                                                                                             
    Data preparation started ...
                                                                                             
    Spliting of dataset into training and testing ...
    Training size : 0.8                                                                      
    Testing size  : 0.2                                                                      
                                                                                             
    Training data
    sn    price    lotsize    bedrooms    bathrms    stories    driveway_0    driveway_1    recroom_0    recroom_1    fullbase_0    fullbase_1    gashw_0    gashw_1    airco_0    airco_1    garagepl    prefarea_0    prefarea_1    homestyle_0    homestyle_1    homestyle_2    id
    183    58000.0    4340.0    3    1    1    0    1    1    0    1    0    1    0    1    0    0    1    0    0    0    1    8
    80    63900.0    6360.0    2    1    1    0    1    1    0    0    1    1    0    0    1    1    1    0    0    0    1    12
    345    88000.0    4500.0    3    1    4    0    1    1    0    1    0    1    0    0    1    0    1    0    0    0    1    20
    40    54500.0    3150.0    2    2    1    1    0    1    0    0    1    1    0    1    0    0    1    0    0    0    1    10
    122    80000.0    10500.0    4    2    2    0    1    1    0    1    0    1    0    1    0    1    1    0    0    0    1    11
    387    83900.0    11460.0    3    1    3    0    1    1    0    1    0    1    0    1    0    2    0    1    0    0    1    19
    326    99000.0    8880.0    3    2    2    0    1    1    0    0    1    1    0    0    1    1    1    0    0    0    1    13
    305    60000.0    5800.0    3    1    1    0    1    1    0    1    0    0    1    1    0    2    1    0    0    0    1    21
    61    48000.0    4120.0    2    1    2    0    1    1    0    1    0    1    0    1    0    0    1    0    0    1    0    14
    244    27000.0    3649.0    2    1    1    0    1    1    0    1    0    1    0    1    0    0    1    0    0    1    0    22
                                                                                             
    Testing data
    sn    price    lotsize    bedrooms    bathrms    stories    driveway_0    driveway_1    recroom_0    recroom_1    fullbase_0    fullbase_1    gashw_0    gashw_1    airco_0    airco_1    garagepl    prefarea_0    prefarea_1    homestyle_0    homestyle_1    homestyle_2    id
    448    120000.0    5500.0    4    2    2    0    1    1    0    0    1    1    0    0    1    1    0    1    1    0    0    27
    488    44100.0    8100.0    2    1    1    0    1    1    0    1    0    1    0    1    0    1    1    0    0    1    0    30
    154    42000.0    3600.0    3    1    2    1    0    1    0    1    0    1    0    1    0    1    1    0    0    1    0    126
    200    52000.0    3570.0    3    1    2    0    1    1    0    0    1    1    0    1    0    0    1    0    0    0    1    26
    202    53900.0    2520.0    5    2    1    1    0    1    0    0    1    1    0    0    1    1    1    0    0    0    1    31
    32    48000.0    3500.0    4    1    2    0    1    1    0    1    0    1    0    0    1    2    1    0    0    1    0    127
    366    99000.0    13200.0    2    1    1    0    1    1    0    0    1    0    1    1    0    1    1    0    0    0    1    24
    93    163000.0    7420.0    4    1    2    0    1    0    1    0    1    1    0    0    1    2    1    0    1    0    0    120
    385    78000.0    6600.0    4    2    2    0    1    0    1    0    1    1    0    1    0    0    0    1    0    0    1    29
    112    46500.0    4500.0    2    1    1    1    0    1    0    1    0    1    0    1    0    0    1    0    0    1    0    125
                                                                                             
    Time taken for spliting of data: 6.53 sec
                                                                                             
    Outlier preprocessing ...
    Columns with outlier percentage :-                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
      ColumnName  OutlierPercentage
    0    lotsize           2.235772
    1   garagepl           2.235772
    2    bathrms           0.203252
    3    stories           7.113821
    4      price           2.439024
    5   bedrooms           2.235772
                                                                                             
    Deleting rows of these columns:
    ['price', 'bathrms', 'garagepl', 'bedrooms', 'lotsize', 'stories']
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710259938573126"'
                                                                                             
    Time Taken by Outlier processing: 32.06 sec
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710260626535532"'
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710260897572252"'
                                                                                              
    Feature selection using lasso ...
                                                                                              
    feature selected by lasso:
    ['stories', 'recroom_0', 'homestyle_0', 'fullbase_0', 'airco_1', 'recroom_1', 'fullbase_1', 'bathrms', 'airco_0', 'prefarea_1', 'homestyle_1', 'sn', 'driveway_0', 'garagepl', 'driveway_1', 'lotsize']
                                                                                              
    Total time taken by feature selection: 1.37 sec
                                                                                              
    scaling Features of lasso data ...
                                                                                              
    columns that will be scaled:
    ['stories', 'bathrms', 'sn', 'garagepl', 'lotsize']
                                                                                              
    Training dataset after scaling:
    prefarea_1    airco_0    homestyle_1    recroom_0    homestyle_0    fullbase_0    driveway_0    recroom_1    price    airco_1    id    fullbase_1    driveway_1    stories    bathrms    sn    garagepl    lotsize
    1    1    0    1    0    0    0    0    82900.0    0    360    1    1    -1.1754677333050345    -0.5026028286234502    0.7528119657603072    1.806757867694828    1.050843746986432
    0    1    0    1    0    0    0    0    75000.0    0    199    1    1    -1.1754677333050345    1.6905731508243318    0.2069179250063654    -0.762853321915594    0.2696561824376743
    1    1    0    1    0    1    0    0    60000.0    0    512    0    1    1.8337296639558538    -0.5026028286234502    1.2152163296930578    0.521952272889617    -1.477600003603047
    1    0    1    1    0    1    1    0    31900.0    1    463    0    0    -1.1754677333050345    -0.5026028286234502    0.8491462082462968    -0.762853321915594    0.16549784049783994
    0    1    0    1    0    1    0    0    50500.0    0    398    0    1    -1.1754677333050345    -0.5026028286234502    -0.9490929848255112    1.806757867694828    -0.5896501385659592
    1    0    0    1    0    1    0    0    95000.0    1    107    0    1    1.8337296639558538    1.6905731508243318    0.48949836963193527    -0.762853321915594    0.7487845553609124
    1    1    1    1    0    1    0    0    49500.0    0    75    0    1    1.8337296639558538    -0.5026028286234502    1.176682632698662    -0.762853321915594    -1.477600003603047
    0    1    0    1    0    1    0    0    69000.0    0    416    0    1    1.8337296639558538    -0.5026028286234502    -1.7583006217078248    -0.762853321915594    -0.4282047085592159
    0    0    0    1    0    1    0    0    84000.0    1    136    0    1    1.8337296639558538    1.6905731508243318    1.6519315622962112    -0.762853321915594    0.7904478921368461
    0    1    1    1    0    1    0    0    49000.0    0    256    0    1    -1.1754677333050345    -0.5026028286234502    0.41243097564314346    -0.762853321915594    -0.5896501385659592
                                                                                              
    Testing dataset after scaling:
    prefarea_1    airco_0    homestyle_1    recroom_0    homestyle_0    fullbase_0    driveway_0    recroom_1    price    airco_1    id    fullbase_1    driveway_1    stories    bathrms    sn    garagepl    lotsize
    1    1    0    1    0    0    0    0    93000.0    0    454    1    1    1.8337296639558538    -0.5026028286234502    1.240905461022655    -0.762853321915594    0.8789824827857053
    0    0    0    0    1    1    0    1    103000.0    1    368    0    1    3.338328362586298    1.6905731508243318    1.5363304713130235    0.521952272889617    1.4049821095818689
    0    1    0    1    0    1    0    0    87250.0    0    492    0    1    1.8337296639558538    -0.5026028286234502    -0.36466524707717357    0.521952272889617    -0.907333081482454
    0    1    0    1    0    1    0    0    52500.0    0    204    0    1    -1.1754677333050345    -0.5026028286234502    -0.08850708528400306    -0.762853321915594    -0.7198480659907521
    1    0    0    1    0    1    0    0    89900.0    1    484    0    1    1.8337296639558538    1.6905731508243318    0.8234570769166996    -0.762853321915594    0.8425270631067633
    0    1    0    0    0    0    1    1    72000.0    0    211    1    0    -1.1754677333050345    -0.5026028286234502    -0.4674217723955626    -0.762853321915594    -0.7510955685727024
    1    0    0    1    1    1    0    0    112000.0    1    200    0    1    3.338328362586298    1.6905731508243318    0.8427239254138975    -0.762853321915594    0.7175370527789621
    0    1    1    1    0    1    0    0    30000.0    0    247    0    1    -1.1754677333050345    -0.5026028286234502    0.04636085419638255    0.521952272889617    -0.8448380763185533
    0    0    0    1    1    1    0    0    120000.0    1    417    0    1    3.338328362586298    -0.5026028286234502    1.606975582469416    1.806757867694828    1.050843746986432
    0    1    1    1    0    1    1    0    41000.0    0    187    0    0    -1.1754677333050345    -0.5026028286234502    -1.5078315912442515    -0.762853321915594    -1.0114914234222883
                                                                                              
    Total time taken by feature scaling: 42.16 sec
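The scaled columns above take negative and positive values that are evenly spaced for each unit of the original feature, which resembles standardization (subtract the mean, divide by the standard deviation). The log does not name the scaling method, so treat that as an assumption. The sketch below shows standardization on a toy list. `standardize` is a hypothetical helper, and using the population standard deviation is also an assumption.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(value - mu) / sigma for value in values]

# Toy 'bathrms'-like values (hypothetical, not taken from the dataset).
toy_bathrms = [1, 1, 2, 1, 2]
scaled = standardize(toy_bathrms)
print(scaled)
```

After standardization, equal original values map to equal scaled values, and the scaled column has mean 0 and standard deviation 1.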
                                                                                              
    Feature selection using rfe ...
                                                                                              
    feature selected by RFE:
    ['homestyle_0', 'homestyle_2', 'airco_1', 'bathrms', 'homestyle_1', 'sn', 'garagepl', 'lotsize']
                                                                                              
    Total time taken by feature selection: 37.07 sec
                                                                                              
    scaling Features of rfe data ...
                                                                                              
    columns that will be scaled:
    ['r_bathrms', 'r_sn', 'r_garagepl', 'r_lotsize']
                                                                                              
    Training dataset after scaling:
    r_airco_1    r_homestyle_0    price    r_homestyle_1    id    r_homestyle_2    r_bathrms    r_sn    r_garagepl    r_lotsize
    0    0    82900.0    0    360    1    -0.5026028286234502    0.7528119657603072    1.806757867694828    1.050843746986432
    0    0    75000.0    0    199    1    1.6905731508243318    0.2069179250063654    -0.762853321915594    0.2696561824376743
    0    0    60000.0    0    512    1    -0.5026028286234502    1.2152163296930578    0.521952272889617    -1.477600003603047
    1    0    31900.0    1    463    0    -0.5026028286234502    0.8491462082462968    -0.762853321915594    0.16549784049783994
    0    0    50500.0    0    398    1    -0.5026028286234502    -0.9490929848255112    1.806757867694828    -0.5896501385659592
    1    0    95000.0    0    107    1    1.6905731508243318    0.48949836963193527    -0.762853321915594    0.7487845553609124
    0    0    49500.0    1    75    0    -0.5026028286234502    1.176682632698662    -0.762853321915594    -1.477600003603047
    0    0    69000.0    0    416    1    -0.5026028286234502    -1.7583006217078248    -0.762853321915594    -0.4282047085592159
    1    0    84000.0    0    136    1    1.6905731508243318    1.6519315622962112    -0.762853321915594    0.7904478921368461
    0    0    49000.0    1    256    0    -0.5026028286234502    0.41243097564314346    -0.762853321915594    -0.5896501385659592
                                                                                              
    Testing dataset after scaling:
    r_airco_1    r_homestyle_0    price    r_homestyle_1    id    r_homestyle_2    r_bathrms    r_sn    r_garagepl    r_lotsize
    0    0    93000.0    0    454    1    -0.5026028286234502    1.240905461022655    -0.762853321915594    0.8789824827857053
    1    1    103000.0    0    368    0    1.6905731508243318    1.5363304713130235    0.521952272889617    1.4049821095818689
    0    0    87250.0    0    492    1    -0.5026028286234502    -0.36466524707717357    0.521952272889617    -0.907333081482454
    0    0    52500.0    0    204    1    -0.5026028286234502    -0.08850708528400306    -0.762853321915594    -0.7198480659907521
    1    0    89900.0    0    484    1    1.6905731508243318    0.8234570769166996    -0.762853321915594    0.8425270631067633
    0    0    72000.0    0    211    1    -0.5026028286234502    -0.4674217723955626    -0.762853321915594    -0.7510955685727024
    1    1    112000.0    0    200    0    1.6905731508243318    0.8427239254138975    -0.762853321915594    0.7175370527789621
    0    0    30000.0    1    247    0    -0.5026028286234502    0.04636085419638255    0.521952272889617    -0.8448380763185533
    1    1    120000.0    0    417    0    -0.5026028286234502    1.606975582469416    1.806757867694828    1.050843746986432
    0    0    41000.0    1    187    0    -0.5026028286234502    -1.5078315912442515    -0.762853321915594    -1.0114914234222883
                                                                                              
    Total time taken by feature scaling: 35.98 sec
                                                                                              
    scaling Features of pca data ...
                                                                                              
    columns that will be scaled:
    ['sn', 'lotsize', 'bathrms', 'stories', 'garagepl']
                                                                                              
    Training dataset after scaling:
    airco_0    prefarea_1    homestyle_1    recroom_0    homestyle_0    prefarea_0    fullbase_0    gashw_0    driveway_0    gashw_1    homestyle_2    price    airco_1    bedrooms    id    fullbase_1    recroom_1    driveway_1    sn    lotsize    bathrms    stories    garagepl
    0    0    0    1    0    1    0    1    1    0    1    57000.0    1    3    17    1    0    0    -1.1610283182946874    -0.25113552726149785    1.690573150824329    0.32913096532540864    -0.7628533219155942
    1    0    0    1    0    1    1    1    0    0    1    70100.0    0    3    23    0    0    1    -0.3775098127419719    -0.40737304017124965    -0.5026028286234494    0.32913096532540864    0.5219522728896171
    0    0    0    0    0    1    1    1    0    0    1    88500.0    1    3    71    0    1    1    0.2582961876655597    0.7123291356819713    1.690573150824329    1.8337296639558482    -0.7628533219155942
    1    0    0    1    0    1    1    1    0    0    1    56000.0    0    3    38    0    0    1    -0.9041370049987152    -1.0323230918102566    -0.5026028286234494    0.32913096532540864    -0.7628533219155942
    1    0    1    1    0    1    1    1    0    0    0    35500.0    0    3    51    0    0    1    -1.4307641972554586    -0.3032146982314151    -0.5026028286234494    0.32913096532540864    -0.7628533219155942
    0    0    0    1    0    1    1    1    0    0    1    98000.0    1    3    59    0    0    1    0.2711407533303583    0.5300520372872609    -0.5026028286234494    -1.175467733305031    0.5219522728896171
    1    1    0    1    0    0    1    1    0    0    1    65500.0    0    3    44    0    0    1    0.9197913194026884    -0.5948580556629517    -0.5026028286234494    0.32913096532540864    0.5219522728896171
    0    0    0    1    0    1    0    1    0    0    1    94500.0    1    3    52    1    0    1    -1.0518495101438992    -0.5115313821110842    1.690573150824329    0.32913096532540864    0.5219522728896171
    0    0    0    1    0    1    0    1    0    0    1    99000.0    1    3    13    1    0    1    0.28398531899515694    2.029932161220878    1.690573150824329    0.32913096532540864    0.5219522728896171
    1    0    0    1    0    1    1    0    0    1    1    60000.0    0    3    21    0    0    1    0.14911737951477144    0.42589369534742644    -0.5026028286234494    -1.175467733305031    1.8067578676948284
                                                                                              
    Testing dataset after scaling:
    airco_0    prefarea_1    homestyle_1    recroom_0    homestyle_0    prefarea_0    fullbase_0    gashw_0    driveway_0    gashw_1    homestyle_2    price    airco_1    bedrooms    id    fullbase_1    recroom_1    driveway_1    sn    lotsize    bathrms    stories    garagepl
    1    0    1    1    0    1    1    1    0    0    0    45000.0    0    2    25    0    0    1    0.014249440034385967    0.9206458195616404    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    1    0    1    1    0    1    1    1    0    0    0    44100.0    0    2    30    0    0    1    1.324395137843845    1.6237146276555232    -0.5026028286234494    -1.175467733305031    0.5219522728896171
    1    0    1    1    0    1    1    1    1    0    0    42000.0    0    3    126    0    0    0    -0.8206473281775242    -0.7198480659907531    -0.5026028286234494    0.32913096532540864    0.5219522728896171
    1    0    0    1    0    1    0    1    0    0    1    52000.0    0    3    26    1    0    1    -0.525222317887156    -0.7354718172817283    -0.5026028286234494    0.32913096532540864    -0.7628533219155942
    0    0    0    1    0    1    1    1    0    0    1    52000.0    1    2    28    0    0    1    -0.255486438926385    -1.0323230918102566    -0.5026028286234494    0.32913096532540864    -0.7628533219155942
    1    1    0    1    0    0    0    1    0    0    1    84000.0    0    3    124    1    0    1    0.6243663091123203    1.1341704205383012    -0.5026028286234494    -1.175467733305031    1.8067578676948284
    0    0    0    1    0    1    0    1    1    0    1    53900.0    1    5    31    1    0    0    -0.5123777522223574    -1.2823031124658595    1.690573150824329    -1.175467733305031    0.5219522728896171
    0    0    1    1    0    1    1    1    0    0    0    48000.0    1    4    127    0    0    1    -1.6041658337302398    -0.7719272369606704    -0.5026028286234494    0.32913096532540864    1.8067578676948284
    1    1    0    0    0    0    0    1    0    0    1    78000.0    0    4    29    1    1    1    0.6629000061067162    0.8425270631067645    1.690573150824329    0.32913096532540864    -0.7628533219155942
    1    0    1    1    0    1    1    1    1    0    0    46500.0    0    2    125    0    0    0    -1.090383207138295    -0.25113552726149785    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
                                                                                              
    Total time taken by feature scaling: 42.09 sec
                                                                                              
    Dimension Reduction using pca ...
                                                                                              
    PCA columns:
    ['col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_5', 'col_6', 'col_7', 'col_8', 'col_9']
                                                                                              
    Total time taken by PCA: 8.92 sec
                                                                                              
    
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
                                                                                              
    Model Training started ...
                                                                                              
    Hyperparameters used for model training:
    response_column : price                                                                                                                               
    name : glm
    family : GAUSSIAN
    lambda1 : (0.001, 0.02, 0.1)
    alpha : (0.15, 0.85)
    learning_rate : ('invtime', 'constant', 'adaptive')
    initial_eta : (0.05, 0.1)
    momentum : (0.65, 0.8, 0.95)
    iter_num_no_change : (5, 10, 50)
    iter_max : (300, 200, 400, 500)
    batch_size : (10, 80, 100, 150)
    Total number of models for glm : 5184
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    
    response_column : price
    name : svm
    model_type : regression
    lambda1 : (0.001, 0.02, 0.1)
    alpha : (0.15, 0.85)
    tolerance : (0.001, 0.01)
    learning_rate : ('Invtime', 'Adaptive', 'constant')
    initial_eta : (0.05, 0.1)
    momentum : (0.65, 0.8, 0.95)
    nesterov : True
    intercept : True
    iter_num_no_change : (5, 10, 50)
    local_sgd_iterations  : (10, 20)
    iter_max : (300, 200, 400, 500)
    batch_size : (10, 80, 100, 150)
    Total number of models for svm : 20736
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    
    response_column : price
    name : decision_forest
    tree_type : Regression
    min_impurity : (0.0, 0.1, 0.2, 0.3)
    max_depth : (5, 3, 4, 7, 8)
    min_node_size : (1, 2, 3, 4)
    num_trees : (-1, 20, 30, 40)
    Total number of models for decision_forest : 320
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    
    response_column : price
    name : xgboost
    model_type : Regression
    column_sampling : (1, 0.6)
    min_impurity : (0.0, 0.1, 0.2, 0.3)
    lambda1 : (0.01, 0.1, 1, 10)
    shrinkage_factor : (0.5, 0.01, 0.05, 0.1)
    max_depth : (5, 3, 4, 7, 8)
    min_node_size : (1, 2, 3, 4)
    iter_num : (10, 20, 30, 40)
    Total number of models for xgboost : 10240
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    
                                                                                              
    Performing hyperParameter tuning ...
                                                                                              
    glm
    GLM_3                                                                                                                                                                                                   
    GLM_1                                                                                     
    GLM_2                                                                                     
                                                                                              
    ----------------------------------------------------------------------------------------------------
                                                                                              
    svm
    SVM_0                                                                                                                                                                                                   
    SVM_1                                                                                     
    SVM_2                                                                                     
                                                                                              
    ----------------------------------------------------------------------------------------------------
                                                                                              
    decision_forest
    DECISIONFOREST_3                                                                                                                                                                                        
    DECISIONFOREST_1                                                                          
    DECISIONFOREST_2                                                                          
                                                                                              
    ----------------------------------------------------------------------------------------------------
                                                                                              
    xgboost
    XGBOOST_0                                                                                                                                                                                               
    XGBOOST_1                                                                                 
    XGBOOST_2                                                                                 
                                                                                              
    ----------------------------------------------------------------------------------------------------
                                                                                              
    Evaluating models performance ...
                                                                                              
    Evaluation completed.
                                                                                              
    Leaderboard
        Rank    Name    Feature selection    MAE    MSE    MSLE    RMSE    RMSLE    R2-score    Adjusted R2-score
    0    1    decision_forest    rfe    9930.233002    1.635865e+08    0.035966    12790.094260    0.189647    0.761679    0.740495
    1    2    xgboost    rfe    9744.413194    1.727394e+08    0.036098    13143.036126    0.189995    0.748345    0.725976
    2    3    glm    lasso    10401.540433    1.900101e+08    0.037991    13784.416746    0.194913    0.723184    0.669171
    3    4    glm    pca    13337.634845    3.197805e+08    0.065841    17882.407498    0.256596    0.534129    0.481189
    4    5    xgboost    pca    14288.786873    3.898397e+08    0.069274    19744.359394    0.263200    0.432063    0.367525
    5    6    glm    rfe    14782.291316    3.936392e+08    0.074210    19840.341792    0.272416    0.426528    0.375553
    6    7    xgboost    lasso    17089.158652    4.147968e+08    0.111265    20366.561238    0.333563    0.395704    0.277793
    7    8    decision_forest    lasso    16584.225589    4.211774e+08    0.088081    20522.608087    0.296785    0.386409    0.266684
    8    9    decision_forest    pca    17247.406355    5.469421e+08    0.097807    23386.793407    0.312741    0.203189    0.112642
    9    10    svm    lasso    66222.818182    5.071993e+09    51.335876    71217.925928    7.164906    -6.389119    -7.830899
    10    11    svm    rfe    66249.909091    5.075714e+09    61.779701    71244.046035    7.860006    -6.394540    -7.051833
    11    12    svm    pca    66271.212121    5.078287e+09    0.000000    71262.102818    0.000000    -6.398289    -7.239004
                                                                                              
    
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 18/18
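The "Total number of models" figures in the log above match the size of each hyperparameter grid, that is, the product of the number of candidate values per parameter. The following is a minimal sketch of that arithmetic in plain Python, not teradataml code; the `grids` dictionary and the `grid_size` helper are illustrative names, and only the glm and decision_forest grids are copied from the log.

```python
from math import prod

# Candidate values copied from the fit() log. Parameters with a single
# value (for example, family or tree_type) are omitted because they do
# not change the count.
grids = {
    "glm": {
        "lambda1": (0.001, 0.02, 0.1),
        "alpha": (0.15, 0.85),
        "learning_rate": ("invtime", "constant", "adaptive"),
        "initial_eta": (0.05, 0.1),
        "momentum": (0.65, 0.8, 0.95),
        "iter_num_no_change": (5, 10, 50),
        "iter_max": (300, 200, 400, 500),
        "batch_size": (10, 80, 100, 150),
    },
    "decision_forest": {
        "min_impurity": (0.0, 0.1, 0.2, 0.3),
        "max_depth": (5, 3, 4, 7, 8),
        "min_node_size": (1, 2, 3, 4),
        "num_trees": (-1, 20, 30, 40),
    },
}


def grid_size(grid):
    """Return the number of distinct hyperparameter combinations in a grid."""
    return prod(len(values) for values in grid.values())


sizes = {name: grid_size(grid) for name, grid in grids.items()}
print(sizes)
```

The same product reproduces the svm and xgboost totals when their grids are added. With the 300-second limit set in step 2, only a few candidates per model are trained in this run (three per model are listed in the tuning log).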
  4. Display the model leaderboard.
    >>> aml.leaderboard()
    
        Rank    Name    Feature selection    MAE    MSE    MSLE    RMSE    RMSLE    R2-score    Adjusted R2-score
    0    1    decision_forest    rfe    9930.233002    1.635865e+08    0.035966    12790.094260    0.189647    0.761679    0.740495
    1    2    xgboost    rfe    9744.413194    1.727394e+08    0.036098    13143.036126    0.189995    0.748345    0.725976
    2    3    glm    lasso    10401.540433    1.900101e+08    0.037991    13784.416746    0.194913    0.723184    0.669171
    3    4    glm    pca    13337.634845    3.197805e+08    0.065841    17882.407498    0.256596    0.534129    0.481189
    4    5    xgboost    pca    14288.786873    3.898397e+08    0.069274    19744.359394    0.263200    0.432063    0.367525
    5    6    glm    rfe    14782.291316    3.936392e+08    0.074210    19840.341792    0.272416    0.426528    0.375553
    6    7    xgboost    lasso    17089.158652    4.147968e+08    0.111265    20366.561238    0.333563    0.395704    0.277793
    7    8    decision_forest    lasso    16584.225589    4.211774e+08    0.088081    20522.608087    0.296785    0.386409    0.266684
    8    9    decision_forest    pca    17247.406355    5.469421e+08    0.097807    23386.793407    0.312741    0.203189    0.112642
    9    10    svm    lasso    66222.818182    5.071993e+09    51.335876    71217.925928    7.164906    -6.389119    -7.830899
    10    11    svm    rfe    66249.909091    5.075714e+09    61.779701    71244.046035    7.860006    -6.394540    -7.051833
    11    12    svm    pca    66271.212121    5.078287e+09    0.000000    71262.102818    0.000000    -6.398289    -7.239004
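Several leaderboard columns are related by simple formulas, which can be used as a sanity check. The sketch below is plain Python, not teradataml code. It checks RMSE = sqrt(MSE) and RMSLE = sqrt(MSLE) for the rank 1 row, and the usual adjusted R2 formula for ranks 1 and 3. The validation row count `n = 99` is an assumption inferred from the printed numbers and is not stated in the output; the feature counts (8 for RFE, 16 for Lasso) are counted from the feature selection tables printed in the data transformation log.

```python
import math

# Rank 1 row (decision_forest, rfe) as printed on the leaderboard.
mse, msle = 1.635865e+08, 0.035966
rmse_reported, rmsle_reported = 12790.094260, 0.189647

rmse = math.sqrt(mse)
rmsle = math.sqrt(msle)


def adjusted_r2(r2, n, p):
    """Standard adjusted R2 for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


N_VALIDATION = 99  # assumption: inferred, not stated in the output

adj_rank1 = adjusted_r2(0.761679, N_VALIDATION, 8)   # decision_forest, rfe
adj_rank3 = adjusted_r2(0.723184, N_VALIDATION, 16)  # glm, lasso

print(rmse, rmsle, adj_rank1, adj_rank3)
```

Under these assumptions the computed values agree with the reported RMSE, RMSLE, and Adjusted R2-score columns to the printed precision.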
  5. Display the best performing model.
    >>> aml.leader()
        Rank    Name    Feature selection    MAE    MSE    MSLE    RMSE    RMSLE    R2-score    Adjusted R2-score
    0    1    decision_forest    rfe    9930.233002    1.635865e+08    0.035966    12790.09426    0.189647    0.761679    0.740495
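The leaderboard in this run is ordered by descending R2-score, the stopping metric chosen in step 2, and the leader is the rank 1 row. The sketch below is plain Python on values copied from the leaderboard. It only illustrates which rows meet the 0.7 threshold and how the leader follows from the ordering; it does not reproduce AutoML's internal early-stopping logic, and the variable names are illustrative.

```python
# (name, feature selection, R2-score) for the top five leaderboard rows.
leaderboard = [
    ("decision_forest", "rfe", 0.761679),
    ("xgboost", "rfe", 0.748345),
    ("glm", "lasso", 0.723184),
    ("glm", "pca", 0.534129),
    ("xgboost", "pca", 0.432063),
]

R2_THRESHOLD = 0.7  # value passed as stopping_tolerance in step 2

# Rows whose R2-score reaches the threshold.
meets_threshold = [row for row in leaderboard if row[2] >= R2_THRESHOLD]

# The best row by R2-score, matching the single row shown by aml.leader().
leader = max(leaderboard, key=lambda row: row[2])

print(meets_threshold)
print(leader)
```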
  6. Generate predictions on the validation dataset using the best performing model.
    In the data preparation phase, AutoML splits the data passed to fit() into training and testing sets. Models are trained on the training set, and the testing set serves as the validation dataset for model evaluation. Calling predict() without a data argument returns predictions on this validation dataset.
    >>> prediction = aml.predict()
    decision_forest rfe
    
     Prediction : 
        id     prediction  confidence_lower  confidence_upper     price
    0  454   74165.441176      70852.176471      77478.705882   93000.0
    1  368  104000.000000     100080.000000     107920.000000  103000.0
    2  492   61219.117647      60789.647059      61648.588235   87250.0
    3  204   64219.117647      58768.588235      69669.647059   52500.0
    4  484   83956.250000      61453.000000     106459.500000   89900.0
    5  211   64219.117647      58768.588235      69669.647059   72000.0
    6  200  104000.000000     100080.000000     107920.000000  112000.0
    7  247   47604.166667      33108.333333      62100.000000   30000.0
    8  417  112500.000000      97800.000000     127200.000000  120000.0
    9  187   47200.000000      42692.000000      51708.000000   41000.0
    
     Performance Metrics : 
               MAE           MSE      MSLE       MAPE       MPE         RMSE     RMSLE       ME        R2        EV          MPD       MGD
    0  9930.233002  1.635865e+08  0.035966  16.235066 -7.393396  12790.09426  0.189647  41250.0  0.761679  0.766803  2248.333581  0.034367
    >>> prediction
    id     prediction     confidence_lower     confidence_upper     price
    26     64219.117647058825     58768.588235294126   69669.64705882352     52000.0
    28     61219.117647058825     60789.6470588241     61648.58823529355     52000.0
    29     83956.25               61453.0                       106459.5     78000.0
    30     62927.94117647059      37589.17647058825    88266.70588235292     44100.0
    120    121750.0               118320.0                     125180.0     163000.0
    121    83439.28571428571      72820.2857142858     94058.28571428562     86000.0
    31     74166.66666666666      66653.33333333302    81680.00000000029     53900.0
    27     104000.0               100080.0                      107920.0     120000.0
    25     52500.0                47600.0                        57400.0     45000.0
    24     83439.28571428571      72820.2857142858     94058.28571428562     99000.0
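The performance metrics are computed from the prediction and price columns. As a rough illustration, the sketch below computes the mean absolute error and RMSE in plain Python for only the ten sample rows printed in the "Prediction" block above. Because it uses ten rows rather than the full validation dataset, the result is not expected to equal the reported MAE of 9930.233002.

```python
import math

# (prediction, price) pairs for the ten sample rows printed by aml.predict().
rows = [
    (74165.441176, 93000.0),
    (104000.000000, 103000.0),
    (61219.117647, 87250.0),
    (64219.117647, 52500.0),
    (83956.250000, 89900.0),
    (64219.117647, 72000.0),
    (104000.000000, 112000.0),
    (47604.166667, 30000.0),
    (112500.000000, 120000.0),
    (47200.000000, 41000.0),
]

errors = [prediction - price for prediction, price in rows]

# Mean absolute error and root mean squared error over the sample.
sample_mae = sum(abs(e) for e in errors) / len(errors)
sample_rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

print(sample_mae, sample_rmse)
```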
  7. Generate predictions on the validation dataset using the third best performing model.
    >>> prediction = aml.predict(rank=3)
    glm lasso
    
     Prediction : 
        id     prediction     price
    0  454   76894.368729   93000.0
    1  368  113540.345935  103000.0
    2  492   64256.667546   87250.0
    3  204   57466.752144   52500.0
    4  484   90752.661865   89900.0
    5  211   52318.433756   72000.0
    6  200  110188.679898  112000.0
    7  247   45344.048905   30000.0
    8  417  103117.250741  120000.0
    9  187   32865.739848   41000.0
    
     Performance Metrics : 
                MAE           MSE      MSLE       MAPE       MPE          RMSE     RMSLE           ME        R2        EV          MPD       MGD
    0  10401.540433  1.900101e+08  0.037991  16.511632 -6.284111  13784.416746  0.194913  59942.83679  0.723184  0.725968  2490.151073  0.037177
    >>> prediction
    id    prediction    price
    26    63427.82171987901     52000.0
    28    64876.404092324046    52000.0
    29    85693.30840720922     78000.0
    30    57405.305632655116    44100.0
    120   103057.16320961973   163000.0
    121   75011.0032186078      86000.0
    31    69803.61297655595     53900.0
    27    109401.81363540013   120000.0
    25    51185.80269704855     45000.0
    24    88189.77808769468     99000.0
  8. Generate predictions on the test dataset using the best performing model.
    >>> prediction = aml.predict(housing_test)
    Data Transformation started ...
    Performing transformation carried out in feature engineering phase ...
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710262235248271"'
    
    Updated dataset after performing categorical encoding :
    sn    price    lotsize    bedrooms    bathrms    stories    driveway_0    driveway_1    recroom_0    recroom_1    fullbase_0    fullbase_1    gashw_0    gashw_1    airco_0    airco_1    garagepl    prefarea_0    prefarea_1    homestyle_0    homestyle_1    homestyle_2    id
    469    55000.0    2176.0    2    1    2    0    1    0    1    1    0    1    0    1    0    0    0    1    0    0    1    8
    38    67000.0    5170.0    3    1    4    0    1    1    0    1    0    1    0    0    1    0    1    0    0    0    1    12
    198    40500.0    4350.0    3    1    2    1    0    1    0    1    0    0    1    1    0    1    1    0    0    1    0    20
    255    61000.0    4360.0    4    1    2    0    1    1    0    1    0    1    0    1    0    0    1    0    0    0    1    15
    260    41000.0    6000.0    2    1    1    0    1    1    0    1    0    1    0    1    0    0    1    0    0    1    0    10
    540    85000.0    7320.0    4    2    2    0    1    1    0    1    0    1    0    1    0    0    1    0    0    0    1    18
    251    48500.0    3450.0    3    1    1    0    1    1    0    0    1    1    0    1    0    2    1    0    0    1    0    14
    408    87500.0    6420.0    3    1    3    0    1    1    0    0    1    1    0    1    0    0    0    1    0    0    1    22
    463    49000.0    2610.0    3    1    2    0    1    1    0    0    1    1    0    1    0    0    0    1    0    1    0    13
    459    44555.0    2398.0    3    1    1    0    1    1    0    1    0    1    0    1    0    0    0    1    0    1    0    21
    Performing transformation carried out in data preparation phase ...
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710259910939684"'
    
    Updated dataset after performing Lasso feature selection:
    id    stories    recroom_0    homestyle_0    fullbase_0    airco_1    recroom_1    fullbase_1    bathrms    airco_0    prefarea_1    homestyle_1    sn    driveway_0    garagepl    driveway_1    lotsize    price
    40    1    0    0    0    1    1    1    1    0    1    0    401    0    2    1    7410.0    92500.0
    59    1    1    0    1    0    0    0    1    1    0    1    195    0    0    1    3180.0    33000.0
    75    1    1    0    1    0    0    0    1    1    0    1    294    0    0    1    4040.0    47000.0
    32    1    0    0    0    1    1    1    1    0    1    0    403    0    0    1    6825.0    77500.0
    48    1    1    0    1    0    0    0    1    1    0    1    25    0    0    1    4960.0    42000.0
    44    1    1    0    0    0    0    1    1    1    0    1    239    0    2    1    3000.0    26000.0
    12    4    1    0    1    1    0    0    1    0    0    0    38    0    0    1    5170.0    67000.0
    67    4    1    0    1    0    0    0    1    1    0    0    317    0    0    1    5000.0    80000.0
    22    3    1    0    0    0    0    1    1    1    1    0    408    0    0    1    6420.0    87500.0
    36    1    1    0    1    0    0    0    1    1    0    1    234    1    0    0    3970.0    32500.0
    
    Updated dataset after performing scaling on Lasso selected features :
    prefarea_1    airco_0    homestyle_1    recroom_0    homestyle_0    fullbase_0    driveway_0    recroom_1    price    airco_1    id    fullbase_1    driveway_1    stories    bathrms    sn    garagepl    lotsize
    1    1    0    1    0    0    0    0    87500.0    0    22    1    1    1.8337296639558538    -0.5026028286234502    0.8106125112519009    -0.762853321915594    0.7487845553609124
    0    1    0    0    0    0    0    1    64900.0    0    24    1    1    -1.1754677333050345    1.6905731508243318    -0.049973388289607165    -0.762853321915594    -0.45945221114116624
    1    0    0    0    0    0    0    1    92500.0    1    40    1    1    -1.1754677333050345    -0.5026028286234502    0.7656565314251058    1.806757867694828    1.2643683479630925
    0    1    1    1    0    0    0    0    48500.0    0    14    1    1    -1.1754677333050345    -0.5026028286234502    -0.1976858934347914    1.806757867694828    -0.7979668224456279
    0    1    1    1    0    1    0    0    47000.0    0    75    0    1    -1.1754677333050345    -0.5026028286234502    0.07847226835837913    -0.762853321915594    -0.49069971372311655
    1    0    0    0    0    0    0    1    77500.0    1    32    1    1    -1.1754677333050345    -0.5026028286234502    0.7785010970899044    -0.762853321915594    0.9597051977890769
    0    1    1    1    0    1    1    0    32500.0    0    36    0    0    -1.1754677333050345    -0.5026028286234502    -0.30686470158557977    -0.762853321915594    -0.5271551334020586
    0    1    1    1    0    1    0    0    42000.0    0    48    0    1    -1.1754677333050345    -0.5026028286234502    -1.6491218135570365    -0.762853321915594    -0.011571340799878474
    0    1    1    1    0    0    0    0    26000.0    0    44    1    1    -1.1754677333050345    -0.5026028286234502    -0.2747532874235832    1.806757867694828    -1.0323230918102553
    0    1    1    1    0    1    0    0    33000.0    0    59    0    1    -1.1754677333050345    -0.5026028286234502    -0.557333732049153    -0.762853321915594    -0.9385805840644043
    
    Updated dataset after performing RFE feature selection:
    id    homestyle_0    homestyle_2    airco_1    bathrms    homestyle_1    sn    garagepl    lotsize    price
    24    0    1    0    2    0    274    0    4100.0    64900.0
    14    0    0    0    1    1    251    2    3450.0    48500.0
    59    0    0    0    1    1    195    0    3180.0    33000.0
    75    0    0    0    1    1    294    0    4040.0    47000.0
    36    0    0    0    1    1    234    0    3970.0    32500.0
    48    0    0    0    1    1    25    0    4960.0    42000.0
    44    0    0    0    1    1    239    2    3000.0    26000.0
    12    0    1    1    1    0    38    0    5170.0    67000.0
    67    0    1    0    1    0    317    0    5000.0    80000.0
    32    0    1    1    1    0    403    0    6825.0    77500.0
    
    Updated dataset after performing scaling on RFE selected features :
    r_airco_1    r_homestyle_0    price    r_homestyle_1    id    r_homestyle_2    r_bathrms    r_sn    r_garagepl    r_lotsize
    0    0    44500.0    1    25    0    -0.5026028286234502    -0.21053045909959003    -0.762853321915594    -0.7719272369606693
    1    0    92500.0    0    40    1    -0.5026028286234502    0.7656565314251058    1.806757867694828    1.2643683479630925
    0    0    48500.0    1    14    0    -0.5026028286234502    -0.1976858934347914    1.806757867694828    -0.7979668224456279
    0    0    33000.0    1    59    0    -0.5026028286234502    -0.557333732049153    -0.762853321915594    -0.9385805840644043
    1    0    77500.0    0    32    1    -0.5026028286234502    0.7785010970899044    -0.762853321915594    0.9597051977890769
    0    0    32500.0    1    36    0    -0.5026028286234502    -0.30686470158557977    -0.762853321915594    -0.5271551334020586
    0    0    42000.0    1    48    0    -0.5026028286234502    -1.6491218135570365    -0.762853321915594    -0.011571340799878474
    0    0    26000.0    1    44    0    -0.5026028286234502    -0.2747532874235832    1.806757867694828    -1.0323230918102553
    0    0    87500.0    0    22    1    -0.5026028286234502    0.8106125112519009    -0.762853321915594    0.7487845553609124
    0    0    47000.0    1    75    0    -0.5026028286234502    0.07847226835837913    -0.762853321915594    -0.49069971372311655
    
    Updated dataset after performing scaling for PCA feature selection :
    airco_0    prefarea_1    homestyle_1    recroom_0    homestyle_0    prefarea_0    fullbase_0    gashw_0    driveway_0    gashw_1    homestyle_2    price    airco_1    bedrooms    id    fullbase_1    recroom_1    driveway_1    sn    lotsize    bathrms    stories    garagepl
    0    1    0    0    0    0    0    1    0    0    1    92500.0    1    3    40    1    1    1    0.765656531425105    1.2643683479630943    -0.5026028286234494    -1.175467733305031    1.8067578676948284
    1    0    1    1    0    1    1    1    0    0    0    33000.0    0    2    59    0    0    1    -0.5573337320491525    -0.9385805840644056    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    1    0    1    1    0    1    1    1    0    0    0    47000.0    0    2    75    0    0    1    0.07847226835837905    -0.4906997137231172    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    0    1    0    0    0    0    0    1    0    0    1    77500.0    1    3    32    1    1    1    0.7785010970899037    0.9597051977890783    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    1    0    1    1    0    1    1    1    0    0    0    42000.0    0    2    48    0    0    1    -1.649121813557035    -0.01157134079987849    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    1    0    1    1    0    1    0    1    0    0    0    26000.0    0    2    44    1    0    1    -0.2747532874235829    -1.0323230918102566    -0.5026028286234494    -1.175467733305031    1.8067578676948284
    1    1    0    1    0    0    0    1    0    0    1    87500.0    0    3    22    1    0    1    0.8106125112519003    0.7487845553609134    -0.5026028286234494    1.8337296639558482    -0.7628533219155942
    0    0    0    1    0    1    1    1    0    0    1    67000.0    1    3    12    0    0    1    -1.565632136735844    0.09779491823694775    -0.5026028286234494    3.338328362586288    -0.7628533219155942
    1    0    0    1    0    1    1    1    0    0    1    80000.0    0    3    67    0    0    1    0.22618477350356314    0.009260327588088412    -0.5026028286234494    3.338328362586288    -0.7628533219155942
    1    0    1    1    0    1    1    1    1    0    0    32500.0    0    1    36    0    0    0    -0.3068647015855795    -0.5271551334020593    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    
    Updated dataset after performing PCA feature selection :
        id    col_0    col_1    col_2    col_3    col_4    col_5    col_6    col_7    col_8    col_9    price
    0    25    -1.212708    -1.063731    -0.313326    -0.550789    0.092915    -1.034314    0.082146    -0.217252    0.379940    -0.544570    44500.0
    1    12    -1.702851    2.244144    -0.163064    1.531449    -1.726684    1.435539    -0.336577    0.312267    -0.721428    -0.019429    67000.0
    2    22    0.436066    1.674569    -1.184835    -0.517098    -0.149681    0.581587    -1.015373    0.346597    -1.125296    -0.054246    87500.0
    3    24    0.121058    -0.088188    1.823276    -1.499059    0.807124    -0.001108    -0.486189    -0.327831    0.999693    0.041307    64900.0
    4    67    -1.071889    2.634593    -1.143026    1.133870    -0.666936    0.448109    -1.183424    0.156071    -0.190941    -0.026246    80000.0
    5    40    2.732152    -1.196533    -0.243570    -0.130603    0.061535    0.992846    0.669891    0.796020    0.182694    0.034230    92500.0
    6    14    0.086457    -1.901373    0.016338    0.671400    1.402064    0.090815    0.377295    0.154887    0.009292    -1.147022    48500.0
    7    59    -1.474117    -1.166772    -0.160654    -0.468734    0.124279    -0.900933    0.088555    -0.189110    0.311384    -0.521604    33000.0
    8    75    -0.913410    -0.987968    -0.428719    -0.617452    -0.026463    -1.166772    0.029816    -0.223178    0.401544    -0.558936    47000.0
    9    32    1.447936    -0.419323    -0.367653    -1.974154    -0.669749    0.666961    0.361696    0.420576    0.244402    0.138265    77500.0
    
    Data Transformation completed.
    decision_forest rfe
    
     Prediction : 
       id    prediction  confidence_lower  confidence_upper    price
    0  40  83439.285714      72820.285714      94058.285714  92500.0
    1  59  45104.166667      35508.333333      54700.000000  33000.0
    2  75  47604.166667      33108.333333      62100.000000  47000.0
    3  32  74165.441176      70852.176471      77478.705882  77500.0
    4  48  47200.000000      41908.000000      52492.000000  42000.0
    5  44  44000.000000      32240.000000      55760.000000  26000.0
    6  12  55791.666667      52280.000000      59303.333333  67000.0
    7  67  65719.117647      57328.588235      74109.647059  80000.0
    8  22  74165.441176      70852.176471      77478.705882  87500.0
    9  36  45104.166667      35508.333333      54700.000000  32500.0
    
     Performance Metrics : 
               MAE           MSE      MSLE       MAPE       MPE         RMSE     RMSLE            ME        R2        EV         MPD       MGD
    0  7709.589423  8.996195e+07  0.030633  14.755182 -8.555023  9484.827403  0.175024  25564.583333  0.720603  0.742535  1540.87909  0.028678
    
    >>> prediction
    id    prediction    confidence_lower    confidence_upper    price
    10    50000.0              50000.0                        50000.0    41000.0
    12    55791.66666666667    52280.00000000015    59303.33333333319    67000.0
    13    62552.94117647059    36479.17647058826    88626.70588235292    49000.0
    14    45104.16666666667    35508.33333333341   54699.999999999935    48500.0
    16    76678.57142857142    52808.57142857135    100548.5714285715    72000.0
    17    37250.0                        23040.0              51460.0    27000.0
    15    64219.117647058825  58768.588235294126    69669.64705882352    61000.0
    11    67802.38095238095    47773.04761904759    87831.7142857143     68000.0
    9     65719.11764705883    57328.58823529422    74109.64705882344    55000.0
    8     68647.05882352941    54517.76470588235    82776.35294117648    55000.0
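The transformation log above shows categorical columns expanded into 0/1 indicator columns (for example, driveway_0 and driveway_1) and numeric columns replaced by scaled values. The sketch below illustrates both ideas in plain Python. It is not the teradataml implementation: the raw labels are hypothetical, and treating the scaling as z-score standardization with the population standard deviation is an assumption, since the log does not name the scaling method.

```python
from statistics import mean, pstdev


def one_hot(values):
    """Return one 0/1 indicator list per value, one column per sorted category."""
    categories = sorted(set(values))
    return [[int(value == category) for category in categories] for value in values]


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(value - mu) / sigma for value in values]


driveway = ["yes", "no", "yes", "yes"]  # hypothetical raw labels
stories = [1, 3, 4, 1]                  # hypothetical small sample

encoded = one_hot(driveway)    # columns in sorted order: "no", "yes"
scaled = standardize(stories)  # mean 0, population std deviation 1

print(encoded)
print(scaled)
```

When predicting on new data, the statistics learned during fit() are reused rather than recomputed, which is why the log reports "Performing transformation carried out in feature engineering phase" before scoring the test dataset.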
  9. Generate predictions on the test dataset using the second best performing model.
    >>> prediction = aml.predict(housing_test, rank=2)
    
    Data Transformation started ...
    Performing transformation carried out in feature engineering phase ...
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710260227157732"'
    
    Updated dataset after performing categorical encoding :
    sn    price    lotsize    bedrooms    bathrms    stories    driveway_0    driveway_1    recroom_0    recroom_1    fullbase_0    fullbase_1    gashw_0    gashw_1    airco_0    airco_1    garagepl    prefarea_0    prefarea_1    homestyle_0    homestyle_1    homestyle_2    id
    53    68000.0    9166.0    2    1    1    0    1    1    0    0    1    1    0    0    1    2    1    0    0    0    1    11
    260    41000.0    6000.0    2    1    1    0    1    1    0    1    0    1    0    1    0    0    1    0    0    1    0    10
    540    85000.0    7320.0    4    2    2    0    1    1    0    1    0    1    0    1    0    0    1    0    0    0    1    18
    251    48500.0    3450.0    3    1    1    0    1    1    0    0    1    1    0    1    0    2    1    0    0    1    0    14
    463    49000.0    2610.0    3    1    2    0    1    1    0    0    1    1    0    1    0    0    0    1    0    1    0    13
    459    44555.0    2398.0    3    1    1    0    1    1    0    1    0    1    0    1    0    0    0    1    0    1    0    21
    301    55000.0    4080.0    2    1    1    0    1    1    0    1    0    1    0    1    0    0    1    0    0    0    1    9
    13    27000.0    1700.0    3    1    2    0    1    1    0    1    0    1    0    1    0    0    1    0    0    1    0    17
    255    61000.0    4360.0    4    1    2    0    1    1    0    1    0    1    0    1    0    0    1    0    0    0    1    15
    16    37900.0    3185.0    2    1    1    0    1    1    0    1    0    1    0    0    1    0    1    0    0    1    0    23
    Performing transformation carried out in data preparation phase ...
    result data stored in table '"AUTOML_USER"."ml__td_sqlmr_persist_out__1710261063385357"'
    
    Updated dataset after performing Lasso feature selection:
    id    stories    recroom_0    homestyle_0    fullbase_0    airco_1    recroom_1    fullbase_1    bathrms    airco_0    prefarea_1    homestyle_1    sn    driveway_0    garagepl    driveway_1    lotsize    price
    22    3    1    0    0    0    0    1    1    1    1    0    408    0    0    1    6420.0    87500.0
    11    1    1    0    0    1    0    1    1    0    0    0    53    0    2    1    9166.0    68000.0
    27    1    1    0    0    0    0    1    1    1    1    0    472    0    0    1    2787.0    60500.0
    10    1    1    0    1    0    0    0    1    1    0    1    260    0    0    1    6000.0    41000.0
    33    1    1    0    0    0    0    1    1    1    1    0    411    0    1    1    9000.0    90000.0
    25    1    1    0    1    0    0    0    1    1    0    1    249    0    0    1    3500.0    44500.0
    36    1    1    0    1    0    0    0    1    1    0    1    234    1    0    0    3970.0    32500.0
    69    1    1    0    1    0    0    0    1    1    0    0    353    0    2    1    7980.0    78500.0
    48    1    1    0    1    0    0    0    1    1    0    1    25    0    0    1    4960.0    42000.0
    14    1    1    0    0    0    0    1    1    1    0    1    251    0    2    1    3450.0    48500.0
    
    Updated dataset after performing scaling on Lasso selected features :
    prefarea_1    airco_0    homestyle_1    recroom_0    homestyle_0    fullbase_0    driveway_0    recroom_1    price    airco_1    id    fullbase_1    driveway_1    stories    bathrms    sn    garagepl    lotsize
    1    1    0    1    0    0    0    0    60500.0    0    27    1    1    -1.1754677333050345    -0.5026028286234502    1.221638612525457    -0.762853321915594    -1.1432517259761787
    0    1    1    1    0    0    0    0    48500.0    0    14    1    1    -1.1754677333050345    -0.5026028286234502    -0.1976858934347914    1.806757867694828    -0.7979668224456279
    1    1    0    1    0    0    0    0    90000.0    0    33    1    1    -1.1754677333050345    -0.5026028286234502    0.8298793597490989    0.521952272889617    2.0924271663847755
    0    1    1    1    0    1    0    0    44500.0    0    25    0    1    -1.1754677333050345    -0.5026028286234502    -0.21053045909959003    -0.762853321915594    -0.7719272369606693
    0    1    0    1    0    1    0    0    78500.0    0    69    0    1    -1.1754677333050345    -0.5026028286234502    0.4573869554699387    1.806757867694828    1.5612196224916204
    0    1    1    1    0    1    0    0    42000.0    0    48    0    1    -1.1754677333050345    -0.5026028286234502    -1.6491218135570365    -0.762853321915594    -0.011571340799878474
    0    0    0    1    0    1    0    0    67000.0    1    12    0    1    3.338328362586298    -0.5026028286234502    -1.5656321367358454    -0.762853321915594    0.09779491823694761
    0    1    0    1    0    1    0    0    80000.0    0    67    0    1    3.338328362586298    -0.5026028286234502    0.22618477350356336    -0.762853321915594    0.009260327588088398
    1    1    0    1    0    0    0    0    87500.0    0    22    1    1    1.8337296639558538    -0.5026028286234502    0.8106125112519009    -0.762853321915594    0.7487845553609124
    0    1    1    1    0    1    1    0    32500.0    0    36    0    0    -1.1754677333050345    -0.5026028286234502    -0.30686470158557977    -0.762853321915594    -0.5271551334020586
    
    Updated dataset after performing RFE feature selection:
    id    homestyle_0    homestyle_2    airco_1    bathrms    homestyle_1    sn    garagepl    lotsize    price
    27    0    1    0    1    0    472    0    2787.0    60500.0
    14    0    0    0    1    1    251    2    3450.0    48500.0
    33    0    1    0    1    0    411    1    9000.0    90000.0
    25    0    0    0    1    1    249    0    3500.0    44500.0
    69    0    1    0    1    0    353    2    7980.0    78500.0
    48    0    0    0    1    1    25    0    4960.0    42000.0
    12    0    1    1    1    0    38    0    5170.0    67000.0
    67    0    1    0    1    0    317    0    5000.0    80000.0
    22    0    1    0    1    0    408    0    6420.0    87500.0
    36    0    0    0    1    1    234    0    3970.0    32500.0
    
    Updated dataset after performing scaling on RFE selected features :
    r_airco_1    r_homestyle_0    price    r_homestyle_1    id    r_homestyle_2    r_bathrms    r_sn    r_garagepl    r_lotsize
    0    0    60500.0    0    27    1    -0.5026028286234502    1.221638612525457    -0.762853321915594    -1.1432517259761787
    0    0    48500.0    1    14    0    -0.5026028286234502    -0.1976858934347914    1.806757867694828    -0.7979668224456279
    0    0    90000.0    0    33    1    -0.5026028286234502    0.8298793597490989    0.521952272889617    2.0924271663847755
    0    0    44500.0    1    25    0    -0.5026028286234502    -0.21053045909959003    -0.762853321915594    -0.7719272369606693
    0    0    78500.0    0    69    1    -0.5026028286234502    0.4573869554699387    1.806757867694828    1.5612196224916204
    0    0    42000.0    1    48    0    -0.5026028286234502    -1.6491218135570365    -0.762853321915594    -0.011571340799878474
    1    0    67000.0    0    12    1    -0.5026028286234502    -1.5656321367358454    -0.762853321915594    0.09779491823694761
    0    0    80000.0    0    67    1    -0.5026028286234502    0.22618477350356336    -0.762853321915594    0.009260327588088398
    0    0    87500.0    0    22    1    -0.5026028286234502    0.8106125112519009    -0.762853321915594    0.7487845553609124
    0    0    32500.0    1    36    0    -0.5026028286234502    -0.30686470158557977    -0.762853321915594    -0.5271551334020586
    
    Updated dataset after performing scaling for PCA feature selection :
    airco_0    prefarea_1    homestyle_1    recroom_0    homestyle_0    prefarea_0    fullbase_0    gashw_0    driveway_0    gashw_1    homestyle_2    price    airco_1    bedrooms    id    fullbase_1    recroom_1    driveway_1    sn    lotsize    bathrms    stories    garagepl
    1    1    0    1    0    0    0    1    0    0    1    60500.0    0    3    27    1    0    1    1.221638612525456    -1.1432517259761805    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    1    0    1    1    0    1    0    1    0    0    0    48500.0    0    3    14    1    0    1    -0.19768589343479123    -0.797966822445629    -0.5026028286234494    -1.175467733305031    1.8067578676948284
    1    1    0    1    0    0    0    1    0    0    1    90000.0    0    3    33    1    0    1    0.8298793597490982    2.0924271663847787    -0.5026028286234494    -1.175467733305031    0.5219522728896171
    1    0    1    1    0    1    1    1    0    0    0    44500.0    0    2    25    0    0    1    -0.21053045909958984    -0.7719272369606704    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    1    0    0    1    0    1    1    1    0    0    1    78500.0    0    3    69    0    0    1    0.45738695546993824    1.5612196224916226    -0.5026028286234494    -1.175467733305031    1.8067578676948284
    1    0    1    1    0    1    1    1    0    0    0    42000.0    0    2    48    0    0    1    -1.649121813557035    -0.01157134079987849    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    0    0    0    1    0    1    1    1    0    0    1    67000.0    1    3    12    0    0    1    -1.565632136735844    0.09779491823694775    -0.5026028286234494    3.338328362586288    -0.7628533219155942
    1    0    0    1    0    1    1    1    0    0    1    80000.0    0    3    67    0    0    1    0.22618477350356314    0.009260327588088412    -0.5026028286234494    3.338328362586288    -0.7628533219155942
    1    1    0    1    0    0    0    1    0    0    1    87500.0    0    3    22    1    0    1    0.8106125112519003    0.7487845553609134    -0.5026028286234494    1.8337296639558482    -0.7628533219155942
    1    0    1    1    0    1    1    1    1    0    0    32500.0    0    1    36    0    0    0    -0.3068647015855795    -0.5271551334020593    -0.5026028286234494    -1.175467733305031    -0.7628533219155942
    
    Updated dataset after performing PCA feature selection :
         id    col_0    col_1    col_2    col_3    col_4    col_5    col_6    col_7    col_8    col_9    price
    0    24    0.121058    -0.088188    1.823276    -1.499059    0.807124    -0.001108    -0.486189    -0.327831    0.999693    0.041307    64900.0
    1    12    -1.702851    2.244144    -0.163064    1.531449    -1.726684    1.435539    -0.336577    0.312267    -0.721428    -0.019429    67000.0
    2    22    0.436066    1.674569    -1.184835    -0.517098    -0.149681    0.581587    -1.015373    0.346597    -1.125296    -0.054246    87500.0
    3    11    1.663402    -2.177491    0.968586    1.008283    -1.421494    1.167298    0.057840    -0.145552    -0.896997    -0.324942    68000.0
    4    67    -1.071889    2.634593    -1.143026    1.133870    -0.666936    0.448109    -1.183424    0.156071    -0.190941    -0.026246    80000.0
    5    27    0.097740    -0.363043    -1.094884    -1.806548    1.420171    0.035027    0.188841    -0.662482    -0.059580    0.003944    60500.0
    6    10    -0.433849    -1.132748    -0.239012    -0.552218    -0.744052    -1.251285    -0.337724    -0.066861    0.077437    -0.506742    41000.0
    7    14    0.086457    -1.901373    0.016338    0.671400    1.402064    0.090815    0.377295    0.154887    0.009292    -1.147022    48500.0
    8    33    2.319747    -1.098141    -0.591489    -0.754201    -0.320262    -0.143066    -0.728844    -0.041061    -0.982347    0.087009    90000.0
    9    25    -1.212708    -1.063731    -0.313326    -0.550789    0.092915    -1.034314    0.082146    -0.217252    0.379940    -0.544570    44500.0
    Data Transformation completed.
    xgboost rfe
    
     Prediction : 
       id    Prediction  Confidence_Lower  Confidence_upper    price
    0  11  63407.696254      -6286.857864     133102.250372  68000.0
    1  10  45725.644955      -8077.754670      99529.044579  41000.0
    2  14  48264.446474      -8178.987284     104707.880232  48500.0
    3  33  82824.661595     -10706.752376     176356.075566  90000.0
    4  36  39391.406793     -12523.489331      91306.302918  32500.0
    5  69  84922.363886     -10489.030654     180333.758425  78500.0
    6  48  53357.925007      -4457.250673     111173.100687  42000.0
    7  12  62819.110370      -9364.622595     135002.843336  67000.0
    8  67  77873.066512     -15576.831695     171322.964720  80000.0
    9  25  44749.546512      -8874.392204      98373.485228  44500.0
    
     Performance Metrics : 
               MAE           MSE      MSLE       MAPE       MPE         RMSE     RMSLE            ME        R2        EV          MPD       MGD
    0  6879.146812  8.221054e+07  0.032565  13.914751 -5.928335  9067.002694  0.180459  24958.923668  0.744677  0.748762  1485.766726  0.029392
    >>> prediction.head()
    
    id    Prediction    Confidence_Lower    Confidence_upper    price
    10    45725.64495450001    -8077.754669657297    99529.04457865731    41000.0
    12    62819.1103705    -9364.622594834902    135002.8433358349    67000.0
    13    39690.2391295    -9284.312483377085    88664.79074237708    49000.0
    14    48264.446474000004    -8178.987283911469    104707.88023191148    48500.0
    16    77665.88047149999    -15211.0934159397    170542.8543589397    72000.0
    17    45658.099697000005    -5607.346779750362    96923.54617375036    27000.0
    15    61603.1342455    -11097.472869949066    134303.74136094906    61000.0
    11    63407.69625400001    -6286.857864333986    133102.250372334    68000.0
    9    62013.533379    -10976.9802656299    135004.0470236299    55000.0
    8    54701.9355845    -8252.91542189393    117656.78659089393    55000.0
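
The Performance Metrics row above can be cross-checked by hand: the reported RMSE (9067.002694) is the square root of the reported MSE (8.221054e+07), and the reported R2 (0.744677) is above the 0.7 threshold set for this run. The following is a minimal sketch, in plain Python rather than teradataml, of how MAE, MSE, RMSE, and R2 are conventionally defined. The function name regression_metrics and the variable r2_threshold are hypothetical and are not part of the teradataml API, and the sample values are only the ten rows printed under "Prediction" above (rounded to two decimals), so the result will not match the metrics reported for the full test set.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R2 for two equal-length sequences.

    Hypothetical helper for illustration only; not a teradataml function.
    """
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# The ten rows shown in the "Prediction" output above (predictions rounded).
price = [68000.0, 41000.0, 48500.0, 90000.0, 32500.0,
         78500.0, 42000.0, 67000.0, 80000.0, 44500.0]
prediction = [63407.70, 45725.64, 48264.45, 82824.66, 39391.41,
              84922.36, 53357.93, 62819.11, 77873.07, 44749.55]

metrics = regression_metrics(price, prediction)

# Assumed to mirror the R2 threshold of 0.7 used in this example.
r2_threshold = 0.7
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print("R2 meets threshold:", metrics["R2"] >= r2_threshold)
```

To check the full test set instead of this ten-row sample, the same function could be applied to the price and Prediction columns after bringing the prediction result to the client, for example with to_pandas().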