Example 4: Run AutoML for a classification problem using early stopping timer, max_models, and customization

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last edition: 2026-01-07
Product Category: Teradata Vantage

This example predicts whether a passenger aboard the RMS Titanic survived, based on factors such as age, sex, fare, and passenger class.

Run AutoML to get the best-performing model with the following specifications:
  • Add customization for specific processes in the AutoML run.
  • Use only two models, 'xgboost' and 'decision_forest', for AutoML training.
  • Set the early stopping timer to 100 seconds and max_models to 5.
  • Set the verbose level to 2 to get a detailed log.
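  1. Load the data and split it into train and test datasets.
    1. Load the example data and create a teradataml DataFrame.
      >>> # Assumes an active Vantage connection (for example, via create_context).
      >>> from teradataml import AutoML, DataFrame, load_example_data
      >>> load_example_data("teradataml", "titanic")
      >>> titanic = DataFrame.from_table("titanic")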
    2. Sample the data into two groups: 80% for training and 20% for testing. Each sampled row carries a sampleid value (1 or 2) identifying its group.
      >>> titanic_sample = titanic.sample(frac=[0.8, 0.2])
    3. Fetch train and test data.
      >>> titanic_train = titanic_sample[titanic_sample['sampleid'] == 1].drop('sampleid', axis=1)
      >>> titanic_test = titanic_sample[titanic_sample['sampleid'] == 2].drop('sampleid', axis=1)
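The split above can be pictured with a plain-Python analogue. This is a minimal sketch, not how teradataml or Vantage implements sampling: the helper `split_by_fraction` is hypothetical, and the row count of 891 is an assumption (it is consistent with the 713 training rows reported later in the fit log).

```python
import random


def split_by_fraction(rows, fractions, seed=42):
    """Shuffle rows and tag each with a 1-based sample id, sized by fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    tagged, start = [], 0
    for sample_id, frac in enumerate(fractions, start=1):
        size = round(len(shuffled) * frac)
        for row in shuffled[start:start + size]:
            tagged.append((row, sample_id))
        start += size
    return tagged


# Assumption: the example table holds 891 rows; integers stand in for rows.
tagged = split_by_fraction(range(891), [0.8, 0.2])

# Same filtering idea as titanic_sample['sampleid'] == 1 and == 2.
train = [row for row, sample_id in tagged if sample_id == 1]
test = [row for row, sample_id in tagged if sample_id == 2]

print(len(train), len(test))
```

Filtering on the sample id and then discarding it mirrors the `drop('sampleid', axis=1)` calls: the id is only bookkeeping for the split and should not reach the model as a feature.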
  2. Add customization and generate a custom config JSON file. The call starts an interactive session; the value after each prompt is the user's input.
    >>> AutoML.generate_custom_config("custom_titanic")
    Generating custom config JSON for AutoML ...
    
    Available main options for customization with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Feature Engineering Phase
    
    Index 2: Customize Data Preparation Phase
    
    Index 3: Customize Model Training Phase
    
    Index 4: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the index you want to customize:  1
    
    Customizing Feature Engineering Phase ...
    
    Available options for customization of feature engineering phase with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Missing Value Handling
    
    Index 2: Customize Bincode Encoding
    
    Index 3: Customize String Manipulation
    
    Index 4: Customize Categorical Encoding
    
    Index 5: Customize Mathematical Transformation
    
    Index 6: Customize Nonlinear Transformation
    
    Index 7: Customize Antiselect Features
    
    Index 8: Back to main menu
    
    Index 9: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the list of indices you want to customize in feature engineering phase:  1,2,4,6,7,8
    
    Customizing Missing Value Handling ...
    
    Provide the following details to customize missing value handling:
    
    Available missing value handling methods with corresponding indices: 
    Index 1: Drop Columns
    Index 2: Drop Rows
    Index 3: Impute Missing values
    
    Enter the list of indices for missing value handling methods :  1,3
    
    Enter the feature or list of features for dropping columns with missing values:  cabin
    
    Available missing value imputation methods with corresponding indices: 
    Index 1: Statistical Imputation
    Index 2: Literal Imputation
    
    Enter the list of corresponding index missing value imputation methods you want to use:  1
    
    Enter the feature or list of features for imputing missing values using statistic values:  age
    
    Available statistical methods with corresponding indices:
    Index 1: min
    Index 2: max
    Index 3: mean
    Index 4: median
    Index 5: mode
    
    Enter the index of corresponding statistic imputation method for feature age:  4
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of missing value handling has been completed successfully.
    
    Customizing Bincode Encoding ...
    
    Provide the following details to customize binning and coding encoding:
    
    Available binning methods with corresponding indices:
    Index 1: Equal-Width
    Index 2: Variable-Width
    
    Enter the feature or list of features for binning:  pclass
    
    Enter the index of corresponding binning method for feature pclass:  1
    
    Enter the number of bins for feature pclass:  3
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of bincode encoding has been completed successfully.
    
    Customizing Categorical Encoding ...
    
    Provide the following details to customize categorical encoding:
    
    Available categorical encoding methods with corresponding indices:
    Index 1: OneHotEncoding
    Index 2: OrdinalEncoding
    Index 3: TargetEncoding
    
    Enter the list of corresponding index categorical encoding methods you want to use:  2,3
    
    Enter the feature or list of features for OrdinalEncoding:  pclass
    
    Enter the feature or list of features for TargetEncoding:  embarked
    
    Available target encoding methods with corresponding indices:
    Index 1: CBM_BETA
    Index 2: CBM_DIRICHLET
    Index 3: CBM_GAUSSIAN_INVERSE_GAMMA
    
    Enter the index of target encoding method for feature embarked:  3
    
    Enter the response column for target encoding method for feature embarked:  survived
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of categorical encoding has been completed successfully.
    
    Customizing Nonlinear Transformation ...
    
    Provide the following details to customize nonlinear transformation:
    
    Enter number of non-linear combination you want to make:  1
    
    Provide the details for non-linear combination 1:
    
    Enter the list of target feature/s for non-linear combination 1:  parch,sibsp
    
    Enter the formula for non-linear combination 1:  Y=(X0+X1+1)
    
    Enter the resultant feature for non-linear combination 1:  family_count
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of nonlinear transformation has been completed successfully.
    
    Customizing Antiselect Features ...
    
    Enter the feature or list of features for antiselect:  passenger
    
    Available options for generic arguments: 
    Index 0: Default
    Index 1: volatile
    Index 2: persist
    
    Enter the indices for generic arguments :  0
    
    Customization of antiselect features has been completed successfully.
    
    Customization of feature engineering phase has been completed successfully.
    
    Available main options for customization with corresponding indices: 
    --------------------------------------------------------------------------------
    
    Index 1: Customize Feature Engineering Phase
    
    Index 2: Customize Data Preparation Phase
    
    Index 3: Customize Model Training Phase
    
    Index 4: Generate custom json and exit
    --------------------------------------------------------------------------------
    
    Enter the index you want to customize:  4
    
    Generating custom json and exiting ...
    
    Process of generating custom config file for AutoML has been completed successfully.
    
    'custom_titanic.json' file is generated successfully under the current working directory.
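Before passing the file to AutoML, it can help to check which customization sections it switches on. The sketch below is a hedged illustration using only the standard library: the dictionary mirrors the configuration that fit() echoes in step 4 of this example, the helper `enabled_sections` is hypothetical, and the file is written to a temporary directory so the real 'custom_titanic.json' is left untouched.

```python
import json
import os
import tempfile

# Mirrors the configuration echoed by fit() in this example.
custom_config = {
    "MissingValueHandlingIndicator": True,
    "MissingValueHandlingParam": {
        "DroppingColumnIndicator": True,
        "DroppingColumnList": ["cabin"],
        "ImputeMissingIndicator": True,
        "StatImputeList": ["age"],
        "StatImputeMethod": ["median"],
    },
    "BincodeIndicator": True,
    "BincodeParam": {"pclass": {"Type": "Equal-Width", "NumOfBins": 3}},
    "CategoricalEncodingIndicator": True,
    "CategoricalEncodingParam": {
        "OrdinalEncodingIndicator": True,
        "OrdinalEncodingList": ["pclass"],
        "TargetEncodingIndicator": True,
        "TargetEncodingList": {
            "embarked": {
                "encoder_method": "CBM_GAUSSIAN_INVERSE_GAMMA",
                "response_column": "survived",
            }
        },
    },
    "NonLinearTransformationIndicator": True,
    "NonLinearTransformationParam": {
        "Combination_1": {
            "target_columns": ["parch", "sibsp"],
            "formula": "Y=(X0+X1+1)",
            "result_column": "family_count",
        }
    },
    "AntiselectIndicator": True,
    "AntiselectParam": {"excluded_columns": ["passenger"]},
}


def enabled_sections(config):
    """Return top-level sections whose '<Name>Indicator' flag is true."""
    suffix = "Indicator"
    return sorted(
        key[: -len(suffix)]
        for key, value in config.items()
        if key.endswith(suffix) and value is True
    )


# Round-trip through a scratch file to imitate reading the generated JSON.
path = os.path.join(tempfile.mkdtemp(), "custom_titanic_copy.json")
with open(path, "w") as handle:
    json.dump(custom_config, handle, indent=4)
with open(path) as handle:
    loaded = json.load(handle)

enabled = enabled_sections(loaded)
print(enabled)
```

Each enabled section corresponds to one index chosen in the interactive session: missing value handling, bincode encoding, categorical encoding, nonlinear transformation, and antiselect.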
  3. Create an AutoML instance.
    >>> aml = AutoML(task_type="Classification",
    ...              include=['decision_forest','xgboost'],
    ...              verbose=2,
    ...              max_runtime_secs=100,
    ...              max_models=5,
    ...              custom_config_file='custom_titanic.json')
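The two limits, max_runtime_secs and max_models, act as independent stopping conditions: whichever is reached first ends training. The following is a conceptual sketch only. It does not reproduce teradataml's internal scheduling (for instance, whether a model already in progress is interrupted is not stated in this guide); `FakeClock`, `train_within_budget`, `make_trainer`, and the candidate list are all hypothetical names introduced for illustration.

```python
import itertools


class FakeClock:
    """Deterministic stand-in for a wall clock so the sketch is repeatable."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def train_within_budget(candidates, train_fn, max_runtime_secs, max_models, clock):
    """Train candidates in order until the time budget or model cap is reached."""
    started = clock()
    trained = []
    for candidate in candidates:
        if len(trained) >= max_models:
            break  # model cap reached
        if clock() - started >= max_runtime_secs:
            break  # time budget spent; do not start another model
        trained.append(train_fn(candidate))
    return trained


def make_trainer(clock, seconds_per_model):
    """Return a fake training function that only consumes simulated time."""
    def train(candidate):
        clock.advance(seconds_per_model)
        return candidate
    return train


# Hypothetical candidates: two model families, four settings each.
candidates = list(itertools.product(["decision_forest", "xgboost"], range(4)))

# Slow models (30 simulated seconds each): the 100-second timer stops training.
slow_clock = FakeClock()
slow = train_within_budget(candidates, make_trainer(slow_clock, 30), 100, 5, slow_clock)

# Fast models (10 simulated seconds each): the cap of 5 models stops training.
fast_clock = FakeClock()
fast = train_within_budget(candidates, make_trainer(fast_clock, 10), 100, 5, fast_clock)

print(len(slow), len(fast))
```

In the slow case the timer is the binding limit and fewer than five models finish; in the fast case the model cap binds first. Setting both lets the run stay bounded regardless of how expensive individual models turn out to be.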
  4. Fit the data.
    >>> aml.fit(titanic_train, titanic_train.survived)
    2025-11-04 01:58:05,914 | INFO     | Received below input for customization :
    {
        "MissingValueHandlingIndicator": true,
        "MissingValueHandlingParam": {
            "DroppingColumnIndicator": true,
            "DroppingColumnList": [
                "cabin"
            ],
            "ImputeMissingIndicator": true,
            "StatImputeList": [
                "age"
            ],
            "StatImputeMethod": [
                "median"
            ]
        },
        "BincodeIndicator": true,
        "BincodeParam": {
            "pclass": {
                "Type": "Equal-Width",
                "NumOfBins": 3
            }
        },
        "CategoricalEncodingIndicator": true,
        "CategoricalEncodingParam": {
            "OrdinalEncodingIndicator": true,
            "OrdinalEncodingList": [
                "pclass"
            ],
            "TargetEncodingIndicator": true,
            "TargetEncodingList": {
                "embarked": {
                    "encoder_method": "CBM_GAUSSIAN_INVERSE_GAMMA",
                    "response_column": "survived"
                }
            }
        },
        "NonLinearTransformationIndicator": true,
        "NonLinearTransformationParam": {
            "Combination_1": {
                "target_columns": [
                    "parch",
                    "sibsp"
                ],
                "formula": "Y=(X0+X1+1)",
                "result_column": "family_count"
            }
        },
        "AntiselectIndicator": true,
        "AntiselectParam": {
            "excluded_columns": [
                "passenger"
            ]
        }
    }
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:58:05,915 | INFO     | Feature Exploration started
    2025-11-04 01:58:05,915 | INFO     | Data Overview:
    2025-11-04 01:58:05,956 | INFO     | Total Rows in the data: 713
    2025-11-04 01:58:05,998 | INFO     | Total Columns in the data: 12
    2025-11-04 01:58:06,608 | INFO     | Column Summary:
       ColumnName                           Datatype  NonNullCount  NullCount  BlankCount  ZeroCount  PositiveCount  NegativeCount  NullPercentage  NonNullPercentage
    0        name  VARCHAR(1000) CHARACTER SET LATIN           713          0         0.0        NaN            NaN            NaN        0.000000         100.000000
    1      ticket    VARCHAR(20) CHARACTER SET LATIN           713          0         0.0        NaN            NaN            NaN        0.000000         100.000000
    2   passenger                            INTEGER           713          0         NaN        0.0          713.0            0.0        0.000000         100.000000
    3       sibsp                            INTEGER           713          0         NaN      486.0          227.0            0.0        0.000000         100.000000
    4       parch                            INTEGER           713          0         NaN      540.0          173.0            0.0        0.000000         100.000000
    5        fare                              FLOAT           713          0         NaN       10.0          703.0            0.0        0.000000         100.000000
    6       cabin    VARCHAR(20) CHARACTER SET LATIN           154        559         0.0        NaN            NaN            NaN       78.401122          21.598878
    7    embarked    VARCHAR(20) CHARACTER SET LATIN           711          2         0.0        NaN            NaN            NaN        0.280505          99.719495
    8         sex    VARCHAR(20) CHARACTER SET LATIN           713          0         0.0        NaN            NaN            NaN        0.000000         100.000000
    9         age                            INTEGER           573        140         NaN        5.0          568.0            0.0       19.635344          80.364656
    10     pclass                            INTEGER           713          0         NaN        0.0          713.0            0.0        0.000000         100.000000
    11   survived                            INTEGER           713          0         NaN      445.0          268.0            0.0        0.000000         100.000000
    2025-11-04 01:58:07,380 | INFO     | Statistics of Data:
      ATTRIBUTE            StatName   StatValue
    0  survived             MAXIMUM    1.000000
    1  survived  STANDARD DEVIATION    0.484688
    2  survived     PERCENTILES(25)    0.000000
    3  survived     PERCENTILES(50)    0.000000
    4      fare               COUNT  713.000000
    5      fare             MINIMUM    0.000000
    6      fare             MAXIMUM  512.329200
    7      fare                MEAN   32.204125
    8      fare  STANDARD DEVIATION   51.384597
    9      fare     PERCENTILES(25)    7.925000
    2025-11-04 01:58:07,548 | INFO     | Categorical Columns with their Distinct values:
    ColumnName                DistinctValueCount
    name                      713
    sex                       2
    ticket                    565
    cabin                     124
    embarked                  3
    2025-11-04 01:58:10,125 | INFO     | Futile columns in dataset:
      ColumnName
    0       name
    1     ticket
    2025-11-04 01:58:13,554 | INFO     | Columns with outlier percentage :-
      ColumnName  OutlierPercentage
    0       fare          12.762973
    1      parch          24.263675
    2      sibsp           5.329593
    3        age          20.476858
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:58:13,902 | INFO     | Feature Engineering started ...
    2025-11-04 01:58:13,902 | INFO     | Handling duplicate records present in dataset ...
    2025-11-04 01:58:14,041 | INFO     | Analysis completed. No action taken.
    2025-11-04 01:58:14,041 | INFO     | Total time to handle duplicate records: 0.14 sec
    2025-11-04 01:58:14,041 | INFO     | Starting customized anti-select columns ...
    2025-11-04 01:58:14,629 | INFO     | Updated dataset sample after performing anti-select columns:
       survived  pclass                                                name     sex   age  sibsp  parch             ticket     fare cabin embarked
    0         0       3                               Dantcheff, Mr. Ristiu    male  25.0      0      0             349203   7.8958  None        S
    1         1       2  Watt, Mrs. James (Elizabeth "Bessie" Inglis Milne)  female  40.0      0      0         C.A. 33595  15.7500  None        S
    2         1       1                Barkworth, Mr. Algernon Henry Wilson    male  80.0      0      0              27042  30.0000   A23        S
    3         0       2                         Hocking, Mr. Richard George    male  23.0      2      1              29104  11.5000  None        S
    4         1       3                                   Jonsson, Mr. Carl    male  32.0      0      0             350417   7.8542  None        S
    5         0       3                          Moore, Mr. Leonard Charles    male   NaN      0      0          A4. 54510   8.0500  None        S
    6         0       3                                Rintamaki, Mr. Matti    male  35.0      0      0  STON/O 2. 3101273   7.1250  None        S
    7         0       3                     Goodwin, Master. Sidney Leonard    male   1.0      5      2            CA 2144  46.9000  None        S
    8         0       3                   Williams, Mr. Howard Hugh "Harry"    male   NaN      0      0           A/5 2466   8.0500  None        S
    9         1       3                         Nicola-Yarred, Miss. Jamila  female  14.0      1      0               2651  11.2417  None        C
    713 rows X 11 columns
    2025-11-04 01:58:14,964 | INFO     | Handling less significant features from data ...
    2025-11-04 01:58:19,867 | INFO     | Removing Futile columns:
    ['ticket', 'name']
    2025-11-04 01:58:19,867 | INFO     | Sample of Data after removing Futile columns:
       survived  pclass     sex   age  sibsp  parch      fare cabin embarked  automl_id
    0         1       1    male  80.0      0      0   30.0000   A23        S         14
    1         1       1  female  36.0      0      0  135.6333   C32        C          8
    2         0       3    male  25.0      0      0    7.8958  None        S         12
    3         0       3    male   NaN      0      0    8.0500  None        S          7
    4         0       3    male   1.0      5      2   46.9000  None        S         15
    5         0       2    male  23.0      2      1   11.5000  None        S          5
    6         0       3    male   NaN      0      0    8.0500  None        S          9
    7         1       3    male  32.0      0      0    7.8542  None        S         13
    8         0       3    male  35.0      0      0    7.1250  None        S         11
    9         0       3    male   NaN      0      0    7.7250  None        Q          4
    713 rows X 10 columns
    2025-11-04 01:58:20,323 | INFO     | Total time to handle less significant features: 5.36 sec
    2025-11-04 01:58:20,323 | INFO     | Handling Date Features ...
    2025-11-04 01:58:20,323 | INFO     | Analysis Completed. Dataset does not contain any feature related to dates. No action needed.
    2025-11-04 01:58:20,323 | INFO     | Total time to handle date features: 0.00 sec
    2025-11-04 01:58:20,324 | INFO     | Dropping these columns for handling customized missing value:
    ['cabin']
    2025-11-04 01:58:23,129 | INFO     | Updated dataset sample after performing customized missing value imputation:
              pclass     sex  age  sibsp  parch      fare embarked  automl_id
    survived
    1              2  female   19      1      0   26.0000        S         88
    1              1  female   33      1      0   53.1000        S        116
    1              1  female   35      0      0  512.3292        C        120
    1              1  female   44      0      1   57.9792        C        124
    1              1  female   28      1      0   89.1042        C        136
    1              1    male   48      1      0   76.7292        C        140
    0              3    male    1      5      2   46.9000        S         15
    0              1    male   52      1      1   79.6500        S         35
    0              3    male   28      0      0    7.8958        S         47
    0              3    male   19      0      0    0.0000        S         59
    713 rows X 9 columns
    2025-11-04 01:58:23,242 | INFO     | Proceeding with default option for handling remaining missing values.
    2025-11-04 01:58:23,242 | INFO     | Checking Missing values in dataset ...
    2025-11-04 01:58:24,028 | INFO     | Columns with their missing values:
    embarked: 2
    2025-11-04 01:58:24,432 | INFO     | Deleting rows of these columns for handling missing values:
    ['embarked']
    2025-11-04 01:58:24,516 | INFO     | Sample of dataset after removing 2 rows:
              pclass     sex  age  sibsp  parch      fare embarked  automl_id
    survived
    1              2  female   19      1      0   26.0000        S         88
    1              1  female   33      1      0   53.1000        S        116
    1              1  female   35      0      0  512.3292        C        120
    1              1  female   44      0      1   57.9792        C        124
    1              1  female   28      1      0   89.1042        C        136
    1              1    male   48      1      0   76.7292        C        140
    0              3    male    1      5      2   46.9000        S         15
    0              1    male   52      1      1   79.6500        S         35
    0              3    male   28      0      0    7.8958        S         47
    0              3    male   19      0      0    0.0000        S         59
    711 rows X 9 columns
    2025-11-04 01:58:24,606 | INFO     | Total time to find missing values in data: 1.36 sec
    2025-11-04 01:58:24,606 | INFO     | Imputing Missing Values ...
    2025-11-04 01:58:24,606 | INFO     | Analysis completed. No imputation required.
    2025-11-04 01:58:24,606 | INFO     | Time taken to perform imputation: 0.00 sec
    2025-11-04 01:58:25,987 | INFO     | Updated dataset sample after performing Equal-Width binning :-
              age  parch      fare embarked  automl_id     sex  sibsp    pclass
    survived
    0          18      1   20.2125        S         75    male      1  pclass_3
    0          71      0   34.6542        C         91    male      0  pclass_1
    0          24      0   24.1500        S         95    male      2  pclass_3
    0          29      0   27.7208        C         99    male      1  pclass_2
    0          26      0    8.0500        S        115    male      0  pclass_3
    0          26      0    7.8958        S        119    male      0  pclass_3
    1          19      0   26.0000        S         88  female      1  pclass_2
    1          33      0   53.1000        S        116  female      1  pclass_1
    1          35      0  512.3292        C        120  female      0  pclass_1
    1          44      1   57.9792        C        124  female      0  pclass_1
    711 rows X 9 columns
    2025-11-04 01:58:26,099 | INFO     | No information provided for Variable-Width Transformation.
    2025-11-04 01:58:26,099 | INFO     | Skipping customized string manipulation.
    2025-11-04 01:58:26,099 | INFO     | Starting Customized Categorical Feature Encoding ...
    2025-11-04 01:58:28,132 | INFO     | Updated dataset sample after performing ordinal encoding:
              age  parch      fare embarked  automl_id     sex  sibsp  pclass
    survived
    1          48      0   76.7292        C        140    male      1       0
    1          19      0   30.0000        S        184  female      0       0
    1          36      2   71.0000        S        192  female      0       0
    1          27      0   13.8583        C        200  female      1       1
    1          14      2  120.0000        S        248  female      1       0
    1          18      0  227.5250        C        252  female      1       0
    0          18      1   20.2125        S         75    male      1       2
    0          71      0   34.6542        C         91    male      0       0
    0          24      0   24.1500        S         95    male      2       2
    0          29      0   27.7208        C         99    male      1       1
    711 rows X 9 columns
    2025-11-04 01:58:31,099 | INFO     | Updated dataset sample after performing target encoding:
              survived  age  parch      fare  pclass  automl_id     sex  sibsp
    embarked
    0.533835         1   17      0  108.9000       0        340  female      1
    0.533835         1   23      1   63.3583       0        452    male      0
    0.533835         1   29      0    7.8958       2        604    male      0
    0.533835         1   49      0   89.1042       0        624    male      1
    0.533835         1   40      1  134.5000       0        105  female      1
    0.533835         1   26      0   30.0000       0        125    male      0
    0.533835         1   28      0   24.0000       1         37  female      1
    0.533835         1   60      0   75.2500       0        368  female      1
    0.533835         1   18      0  227.5250       0        252  female      1
    0.533835         1   48      0   76.7292       0        140    male      1
    711 rows X 9 columns
    2025-11-04 01:58:31,215 | INFO     | Performing encoding for categorical columns ...
    2025-11-04 01:58:33,242 | INFO     | ONE HOT Encoding these Columns:
    ['sex']
    2025-11-04 01:58:33,242 | INFO     | Sample of dataset after performing one hot encoding:
              survived  age  parch     fare  pclass  automl_id  sex_0  sex_1  sibsp
    embarked
    0.4              1   28      0   7.7500       2        653      1      0      0
    0.4              1   28      0  15.5000       2        346      1      0      1
    0.4              1   28      0   7.7500       2        546      1      0      0
    0.4              1   28      0   7.7333       2        650      1      0      0
    0.4              1   19      0   7.8792       2        295      1      0      0
    0.4              1   28      0  23.2500       2        419      0      1      2
    0.4              1   16      0   7.7500       2        279      1      0      0
    0.4              1   28      0   7.8792       2        106      1      0      0
    0.4              1   28      0   7.7500       2        565      1      0      0
    0.4              1   28      0   7.7875       2        329      1      0      0
    711 rows X 10 columns
    2025-11-04 01:58:33,372 | INFO     | Time taken to encode the columns: 2.16 sec
    2025-11-04 01:58:33,372 | INFO     | Starting customized mathematical transformation ...
    2025-11-04 01:58:33,372 | INFO     | Skipping customized mathematical transformation.
    2025-11-04 01:58:33,372 | INFO     | Starting customized non-linear transformation ...
    2025-11-04 01:58:33,372 | INFO     | Possible combination :
    ['Combination_1']
    2025-11-04 01:58:35,367 | INFO     | Updated dataset sample after performing non-liner transformation:
              survived  age  parch     fare  pclass  automl_id  sex_0  sex_1  sibsp  family_count
    embarked
    0.327519         1   28    0.0  16.1000       2        352      1      0    1.0           2.0
    0.327519         1   35    0.0  52.0000       0        440      1      0    1.0           2.0
    0.327519         1   28    0.0  35.5000       0        476      0      1    0.0           1.0
    0.327519         1   39    0.0   7.9250       2        500      0      1    0.0           1.0
    0.327519         1   35    0.0  26.2875       0        568      0      1    0.0           1.0
    0.327519         1   41    1.0  19.5000       1        588      1      0    0.0           2.0
    0.400000         1   28    0.0   7.7500       2        653      1      0    0.0           1.0
    0.400000         1   28    0.0  15.5000       2        346      1      0    1.0           2.0
    0.400000         1   28    0.0   7.7500       2        546      1      0    0.0           1.0
    0.400000         1   28    0.0   7.7333       2        650      1      0    0.0           1.0
    711 rows X 11 columns
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:58:35,481 | INFO     | Data preparation started ...
    2025-11-04 01:58:35,481 | INFO     | No information provided for performing customized feature scaling. Proceeding with default option.
    2025-11-04 01:58:35,481 | INFO     | No information provided for performing customized imbalanced dataset sampling. AutoML will Proceed with default option.
    2025-11-04 01:58:35,481 | INFO     | Starting customized outlier processing ...
    2025-11-04 01:58:35,481 | INFO     | No information provided for customized outlier processing. AutoML will proceed with default settings.
    2025-11-04 01:58:35,481 | INFO     | Outlier preprocessing ...
    2025-11-04 01:58:38,454 | INFO     | Columns with outlier percentage :-
         ColumnName  OutlierPercentage
    0          fare          12.517581
    1      embarked          18.565401
    2  family_count          10.829817
    3         sibsp           5.344585
    4         parch          24.331927
    5           age           7.172996
    2025-11-04 01:58:38,844 | INFO     | Deleting rows of these columns:
    ['age', 'sibsp']
    2025-11-04 01:58:41,003 | INFO     | Sample of dataset after removing outlier rows:
              survived  age  parch     fare  pclass  automl_id  sex_0  sex_1  sibsp  family_count
    embarked
    0.327519         1   28    0.0  35.5000       0        476      0      1    0.0           1.0
    0.327519         1   35    0.0  26.2875       0        568      0      1    0.0           1.0
    0.327519         1   41    1.0  19.5000       1        588      1      0    0.0           2.0
    0.327519         1   24    2.0  16.7000       2        600      1      0    0.0           3.0
    0.327519         1   24    0.0   7.1417       2        620      0      1    0.0           1.0
    0.327519         1   24    2.0  65.0000       1        652      1      0    1.0           4.0
    0.327519         1   39    0.0  55.9000       0        612      1      0    1.0           2.0
    0.327519         1   39    0.0   7.9250       2        500      0      1    0.0           1.0
    0.327519         1   35    0.0  52.0000       0        440      1      0    1.0           2.0
    0.327519         1   28    0.0  16.1000       2        352      1      0    1.0           2.0
    629 rows X 11 columns
    2025-11-04 01:58:41,138 | INFO     | median inplace of outliers:
    ['embarked', 'parch', 'family_count', 'fare']
    2025-11-04 01:58:43,191 | INFO     | Sample of dataset after performing MEDIAN inplace:
              survived  age  parch     fare  pclass  automl_id  sex_0  sex_1  sibsp  family_count
    embarked
    0.327519         1   26    0.0  18.7875       2        625      0      1    0.0           1.0
    0.327519         1   13    0.0   7.2292       2        258      1      0    0.0           1.0
    0.327519         1   30    0.0  56.9292       0        294      1      0    0.0           1.0
    0.327519         1   39    0.0  13.0000       0        382      1      0    1.0           3.0
    0.327519         1   41    0.0  13.0000       0        522      1      0    0.0           1.0
    0.327519         1   22    0.0  49.5000       0        550      1      0    0.0           3.0
    0.400000         1   28    0.0  23.2500       2        419      0      1    2.0           3.0
    0.400000         1   29    0.0   7.7500       2        724      0      1    0.0           1.0
    0.400000         1   28    0.0   7.7500       2        148      0      1    0.0           1.0
    0.400000         1   28    0.0  15.5000       2         48      1      0    1.0           2.0
    629 rows X 11 columns
    2025-11-04 01:58:43,306 | INFO     | Time Taken by Outlier processing: 7.82 sec
    2025-11-04 01:58:43,306 | INFO     | Checking imbalance data ...
    2025-11-04 01:58:43,369 | INFO     | Imbalance Not Found.
    2025-11-04 01:58:44,144 | INFO     | Feature selection using rfe ...
    2025-11-04 01:58:59,868 | INFO     | feature selected by RFE:
    ['age', 'sex_1', 'pclass', 'sex_0', 'fare', 'family_count']
    2025-11-04 01:58:59,870 | INFO     | Total time taken by feature selection: 15.73 sec
    2025-11-04 01:59:00,180 | INFO     | Scaling Features of rfe data ...
    2025-11-04 01:59:01,334 | INFO     | columns that will be scaled:
    ['r_age', 'r_pclass', 'r_fare', 'r_family_count']
    2025-11-04 01:59:03,143 | INFO     | Dataset sample after scaling:
       survived  r_sex_0  r_sex_1  automl_id     r_age  r_pclass    r_fare  r_family_count
    0         1        1        0          6  0.215686       1.0  0.197223             0.5
    1         1        1        0          8  0.647059       0.0  0.228070             0.0
    2         0        0        1          9  0.490196       1.0  0.141228             0.0
    3         1        1        0         10  0.725490       0.5  0.276316             0.0
    4         0        0        1         12  0.431373       1.0  0.138523             0.0
    5         1        0        1         13  0.568627       1.0  0.137793             0.0
    6         0        0        1         11  0.627451       1.0  0.125000             0.0
    7         0        0        1          7  0.490196       1.0  0.141228             0.0
    8         0        0        1          5  0.392157       0.5  0.201754             0.0
    9         0        0        1          4  0.490196       1.0  0.135526             0.0
    629 rows X 8 columns
    2025-11-04 01:59:03,693 | INFO     | Total time taken by feature scaling: 3.51 sec
    2025-11-04 01:59:03,693 | INFO     | Scaling Features of pca data ...
    2025-11-04 01:59:04,612 | INFO     | columns that will be scaled:
    ['embarked', 'age', 'fare', 'pclass', 'sibsp', 'family_count']
    2025-11-04 01:59:06,517 | INFO     | Dataset sample after scaling:
       survived  parch  sex_1  sex_0  automl_id  embarked       age      fare  pclass  sibsp  family_count
    0         1    0.0      1      0        148       1.0  0.490196  0.135965     1.0    0.0           0.0
    1         0    0.0      1      0        255       1.0  0.490196  0.271930     1.0    0.5           0.5
    2         0    0.0      1      0        343       1.0  0.490196  0.135965     1.0    0.0           0.0
    3         0    0.0      1      0        355       1.0  0.313725  0.118421     1.0    0.0           0.0
    4         0    0.0      0      1         98       1.0  0.490196  0.133846     1.0    0.0           0.0
    5         0    0.0      0      1        190       1.0  0.705882  0.510965     1.0    0.0           0.0
    6         1    0.0      0      1        361       0.0  0.960784  0.228070     0.0    0.5           0.5
    7         1    0.0      0      1        545       0.0  0.490196  0.228070     0.0    0.5           0.5
    8         1    0.0      1      0        625       0.0  0.450980  0.329605     1.0    0.0           0.0
    9         1    0.0      0      1          6       0.0  0.215686  0.197223     1.0    0.5           0.5
    629 rows X 11 columns
    2025-11-04 01:59:07,065 | INFO     | Total time taken by feature scaling: 3.37 sec
    2025-11-04 01:59:07,065 | INFO     | Dimension Reduction using pca ...
    2025-11-04 01:59:07,692 | INFO     | PCA columns:
    ['col_0', 'col_1', 'col_2', 'col_3', 'col_4', 'col_5']
    2025-11-04 01:59:07,692 | INFO     | Total time taken by PCA: 0.63 sec
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    2025-11-04 01:59:08,078 | INFO     | Model Training started ...
    2025-11-04 01:59:08,141 | INFO     | Starting customized hyperparameter update ...
    2025-11-04 01:59:08,141 | INFO     | Skipping customized hyperparameter tuning
    2025-11-04 01:59:08,144 | INFO     | Hyperparameters used for model training:
    2025-11-04 01:59:08,144 | INFO     | Model: decision_forest
    2025-11-04 01:59:08,144 | INFO     | Hyperparameters: {'response_column': 'survived', 'name': 'decision_forest', 'tree_type': 'Classification', 'min_impurity': (0.0, 0.1, 0.2), 'max_depth': (5, 6, 8, 10), 'min_node_size': (1, 2, 3), 'num_trees': (-1,), 'seed': 42}
    2025-11-04 01:59:08,145 | INFO     | Total number of models for decision_forest: 36
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 01:59:08,145 | INFO     | Model: xgboost
    2025-11-04 01:59:08,145 | INFO     | Hyperparameters: {'response_column': 'survived', 'name': 'xgboost', 'model_type': 'Classification', 'column_sampling': (1, 0.6), 'min_impurity': (0.0, 0.1, 0.2), 'lambda1': (1.0, 0.01, 0.1), 'shrinkage_factor': (0.5, 0.1, 0.3), 'max_depth': (5, 6, 8, 10), 'min_node_size': (1, 2, 3), 'iter_num': (10, 20, 30), 'num_boosted_trees': (-1, 5, 10), 'seed': 42}
    2025-11-04 01:59:08,146 | INFO     | Total number of models for xgboost: 5832
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    2025-11-04 01:59:08,146 | INFO     | Performing hyperparameter tuning ...
    2025-11-04 01:59:09,386 | INFO     | Model training for decision_forest
    2025-11-04 01:59:20,913 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 01:59:20,913 | INFO     | Model training for xgboost
    2025-11-04 01:59:27,012 | INFO     | ----------------------------------------------------------------------------------------------------
    2025-11-04 01:59:27,015 | INFO     | Leaderboard
       RANK          MODEL_ID FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
    0     1  DECISIONFOREST_0               rfe  0.833333         0.833333  ...      0.822820  0.824011            0.832799         0.833333     0.833012
    1     2  DECISIONFOREST_1               rfe  0.809524         0.809524  ...      0.799629  0.799629            0.809524         0.809524     0.809524
    2     3  DECISIONFOREST_3               pca  0.809524         0.809524  ...      0.769944  0.783070            0.821440         0.809524     0.799904
    3     4  DECISIONFOREST_2               pca  0.785714         0.785714  ...      0.754174  0.763009            0.786184         0.785714     0.779310
    4     5         XGBOOST_1               pca  0.761905         0.761905  ...      0.764378  0.755751            0.772947         0.761905     0.764366
    [5 rows x 13 columns]
    5 rows X 13 columns
    1. Feature Exploration -> 2. Feature Engineering -> 3. Data Preparation -> 4. Model Training & Evaluation
    >>> Completed: |⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿⫿| 100% - 18/18
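    The candidate counts reported in the log (36 for decision_forest, 5832 for xgboost) are consistent with a full grid over the tuple-valued hyperparameters. The sketch below reproduces that arithmetic in plain Python; `grid_size` is a hypothetical helper written for illustration, not a teradataml function, and the search spaces are copied from the log lines above. The leaderboard still lists only five models, in line with max_models being set to 5.

```python
from math import prod

# Search spaces copied from the "Hyperparameters used for model training" log lines.
# Only tuple-valued entries are tunable; fixed entries (name, seed, ...) are omitted.
decision_forest_space = {
    "min_impurity": (0.0, 0.1, 0.2),
    "max_depth": (5, 6, 8, 10),
    "min_node_size": (1, 2, 3),
    "num_trees": (-1,),
}
xgboost_space = {
    "column_sampling": (1, 0.6),
    "min_impurity": (0.0, 0.1, 0.2),
    "lambda1": (1.0, 0.01, 0.1),
    "shrinkage_factor": (0.5, 0.1, 0.3),
    "max_depth": (5, 6, 8, 10),
    "min_node_size": (1, 2, 3),
    "iter_num": (10, 20, 30),
    "num_boosted_trees": (-1, 5, 10),
}


def grid_size(space):
    """Hypothetical helper: number of combinations in a full grid over the space."""
    return prod(len(values) for values in space.values())


print(grid_size(decision_forest_space))  # 36
print(grid_size(xgboost_space))          # 5832
```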
  5. Display model leaderboard.
    >>> aml.leaderboard()
       RANK          MODEL_ID FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
    0     1  DECISIONFOREST_0               rfe  0.833333         0.833333  ...      0.822820  0.824011            0.832799         0.833333     0.833012
    1     2  DECISIONFOREST_1               rfe  0.809524         0.809524  ...      0.799629  0.799629            0.809524         0.809524     0.809524
    2     3  DECISIONFOREST_3               pca  0.809524         0.809524  ...      0.769944  0.783070            0.821440         0.809524     0.799904
    3     4  DECISIONFOREST_2               pca  0.785714         0.785714  ...      0.754174  0.763009            0.786184         0.785714     0.779310
    4     5         XGBOOST_1               pca  0.761905         0.761905  ...      0.764378  0.755751            0.772947         0.761905     0.764366
    [5 rows x 13 columns]
  6. Display the best performing model.
    >>> aml.leader()
       RANK          MODEL_ID FEATURE_SELECTION  ACCURACY  MICRO-PRECISION  ...  MACRO-RECALL  MACRO-F1  WEIGHTED-PRECISION  WEIGHTED-RECALL  WEIGHTED-F1
    0     1  DECISIONFOREST_0               rfe  0.833333         0.833333  ...       0.82282  0.824011            0.832799         0.833333     0.833012
    [1 rows x 13 columns]
  7. Display hyperparameters for trained models.
    1. Display model hyperparameters for rank 1.
      >>> aml.model_hyperparameters(rank=1)
      {'response_column': 'survived', 
        'name': 'decision_forest', 
        'tree_type': 'Classification', 
        'min_impurity': 0.1, 
        'max_depth': 6, 
        'min_node_size': 3, 
        'num_trees': -1, 
        'seed': 42, 
        'persist': False, 
        'output_prob': True, 
        'output_responses': ['1', '0'], 
        'max_models': 2}
      
    2. Display model hyperparameters for rank 4.
      >>> aml.model_hyperparameters(rank=4)
      {'response_column': 'survived', 
        'name': 'decision_forest', 
        'tree_type': 'Classification', 
        'min_impurity': 0.1, 
        'max_depth': 8, 
        'min_node_size': 2, 
        'num_trees': -1, 
        'seed': 42, 
        'persist': False, 
        'output_prob': True, 
        'output_responses': ['1', '0'], 
        'max_models': 2}
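    To see at a glance how the rank 1 and rank 4 models differ, the two dictionaries can be compared key by key. The sketch below is plain Python; `diff_params` is a hypothetical helper, and the dictionaries are copied from the two outputs above rather than fetched from a live session.

```python
# Values copied from the model_hyperparameters() outputs for rank 1 and rank 4.
rank1 = {
    "response_column": "survived", "name": "decision_forest",
    "tree_type": "Classification", "min_impurity": 0.1, "max_depth": 6,
    "min_node_size": 3, "num_trees": -1, "seed": 42, "persist": False,
    "output_prob": True, "output_responses": ["1", "0"], "max_models": 2,
}
rank4 = {
    "response_column": "survived", "name": "decision_forest",
    "tree_type": "Classification", "min_impurity": 0.1, "max_depth": 8,
    "min_node_size": 2, "num_trees": -1, "seed": 42, "persist": False,
    "output_prob": True, "output_responses": ["1", "0"], "max_models": 2,
}


def diff_params(a, b):
    """Hypothetical helper: {key: (a_value, b_value)} for keys whose values differ."""
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


print(diff_params(rank1, rank4))
# {'max_depth': (6, 8), 'min_node_size': (3, 2)}
```

    In this run the two decision forest models differ only in max_depth and min_node_size.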
      
  8. Generate predictions on the test dataset using the best performing model.
    >>> prediction = aml.predict(titanic_test)
    2025-11-04 02:03:17,753 | INFO     | Data Transformation started ...
    2025-11-04 02:03:17,754 | INFO     | Performing transformation carried out in feature engineering phase ...
    2025-11-04 02:03:18,354 | INFO     | Updated dataset after dropping futile columns :
       passenger  survived  pclass     sex   age  sibsp  parch     fare cabin embarked  automl_id
    0        137         1       1  female  19.0      0      2  26.2833   D47        S         15
    1        814         0       3  female   6.0      4      2  31.2750  None        S          8
    2        812         0       3    male  39.0      0      0  24.1500  None        S         12
    3        734         0       2    male  23.0      0      0  13.0000  None        S          6
    4        793         0       3  female   NaN      8      2  69.5500  None        S         14
    5        265         0       3  female   NaN      0      0   7.7500  None        Q          5
    6        244         0       3    male  22.0      0      0   7.1250  None        S          9
    7        101         0       3  female  28.0      0      0   7.8958  None        S         13
    8        345         0       2    male  36.0      0      0  13.0000  None        S         10
    9         61         0       3    male  22.0      0      0   7.2292  None        C          4
    178 rows X 11 columns
    2025-11-04 02:03:18,648 | INFO     | Updated dataset after performing target column transformation :
       passenger  survived  pclass     sex   age  sibsp  parch     fare cabin embarked  automl_id
    0        793         0       3  female   NaN      8      2  69.5500  None        S         14
    1        730         0       3  female  25.0      1      0   7.9250  None        S         11
    2        137         1       1  female  19.0      0      2  26.2833   D47        S         15
    3        265         0       3  female   NaN      0      0   7.7500  None        Q          5
    4        101         0       3  female  28.0      0      0   7.8958  None        S         13
    5         61         0       3    male  22.0      0      0   7.2292  None        C          4
    6        814         0       3  female   6.0      4      2  31.2750  None        S          8
    7        812         0       3    male  39.0      0      0  24.1500  None        S         12
    8        244         0       3    male  22.0      0      0   7.1250  None        S          9
    9         19         0       3  female  31.0      1      0  18.0000  None        S          7
    178 rows X 11 columns
    2025-11-04 02:03:18,930 | INFO     | Updated dataset after dropping customized missing value containing columns :
       passenger  survived  pclass     sex   age  sibsp  parch     fare embarked  automl_id
    0        137         1       1  female  19.0      0      2  26.2833        S         15
    1        345         0       2    male  36.0      0      0  13.0000        S         10
    2        793         0       3  female   NaN      8      2  69.5500        S         14
    3        265         0       3  female   NaN      0      0   7.7500        Q          5
    4        101         0       3  female  28.0      0      0   7.8958        S         13
    5         61         0       3    male  22.0      0      0   7.2292        C          4
    6        814         0       3  female   6.0      4      2  31.2750        S          8
    7        812         0       3    male  39.0      0      0  24.1500        S         12
    8        244         0       3    male  22.0      0      0   7.1250        S          9
    9        734         0       2    male  23.0      0      0  13.0000        S          6
    178 rows X 10 columns
    2025-11-04 02:03:19,892 | INFO     | Updated dataset after imputing customized missing value containing columns :
       passenger  survived  pclass     sex  age  sibsp  parch     fare embarked  automl_id
    0        101         0       3  female   28      0      0   7.8958        S         13
    1        345         0       2    male   36      0      0  13.0000        S         10
    2        793         0       3  female   28      8      2  69.5500        S         14
    3         19         0       3  female   31      1      0  18.0000        S          7
    4        137         1       1  female   19      0      2  26.2833        S         15
    5         61         0       3    male   22      0      0   7.2292        C          4
    6        814         0       3  female    6      4      2  31.2750        S          8
    7        812         0       3    male   39      0      0  24.1500        S         12
    8        730         0       3  female   25      1      0   7.9250        S         11
    9        734         0       2    male   23      0      0  13.0000        S          6
    178 rows X 10 columns
    2025-11-04 02:03:21,745 | INFO     | Updated dataset after performing customized equal width bin-code transformation :
              passenger  age  parch      fare embarked  automl_id     sex  sibsp    pclass
    survived
    1                89   23      2  263.0000        S         61  female      3  pclass_1
    1               371   25      0   55.4417        C         77    male      1  pclass_1
    1               752    6      1   12.4750        S         93    male      0  pclass_3
    1               872   47      1   52.5542        S        101  female      1  pclass_1
    1               805   27      0    6.9750        S        157    male      0  pclass_3
    1               517   34      0   10.5000        S        165  female      0  pclass_2
    0               101   28      0    7.8958        S         13  female      0  pclass_3
    0               404   28      0   15.8500        S         25    male      1  pclass_3
    0               873   33      0    5.0000        S         29    male      0  pclass_1
    0                34   66      0   10.5000        S         33    male      0  pclass_2
    178 rows X 10 columns
    2025-11-04 02:03:23,335 | INFO     | Updated dataset after performing customized categorical encoding :
              survived  passenger  age  parch      fare  pclass  automl_id     sex  sibsp
    embarked
    0.533835         0        378   27      2  211.5000       0        136    male      0
    0.533835         0        558   28      0  227.5250       0         66    male      0
    0.533835         0        525   28      0    7.2292       2         98    male      0
    0.533835         0        790   46      0   79.2000       0        102    male      0
    0.533835         0        621   27      0   14.4542       2         35    male      1
    0.533835         0        178   50      0   28.7125       0        109  female      0
    0.400000         0        422   21      0    7.7333       2        113    male      0
    0.400000         0        791   28      0    7.7500       2         26    male      0
    0.400000         0        891   32      0    7.7500       2        106    male      0
    0.400000         0        768   30      0    7.7500       2         23  female      0
    178 rows X 10 columns
    2025-11-04 02:03:24,331 | INFO     | Updated dataset after performing categorical encoding :
              survived  passenger  age  parch      fare  pclass  automl_id  sex_0  sex_1  sibsp
    embarked
    0.400000         1         29   28      0    7.8792       2        127      1      0      0
    0.400000         1        413   33      0   90.0000       0         48      1      0      1
    0.400000         0        526   40      0    7.7500       2         16      0      1      0
    0.400000         0        779   28      0    7.7375       2         68      0      1      0
    0.400000         0        655   18      0    6.7500       2         81      1      0      0
    0.400000         1        157   16      0    7.7333       2        115      1      0      0
    0.533835         0        378   27      2  211.5000       0        136      0      1      0
    0.533835         0        558   28      0  227.5250       0         66      0      1      0
    0.533835         0        525   28      0    7.2292       2         98      0      1      0
    0.533835         0        790   46      0   79.2000       0        102      0      1      0
    178 rows X 11 columns
    2025-11-04 02:03:25,179 | INFO     | Updated dataset after performing customized non-linear transformation :
              survived  passenger  age  parch     fare  pclass  automl_id  sex_0  sex_1  sibsp  family_count
    embarked
    0.327519         0        656   24    0.0  73.5000       1        177      0      1    2.0           3.0
    0.327519         0        564   28    0.0   8.0500       2         28      0      1    0.0           1.0
    0.327519         0        522   22    0.0   7.8958       2         32      0      1    0.0           1.0
    0.327519         0        739   28    0.0   7.8958       2         56      0      1    0.0           1.0
    0.327519         0        651   28    0.0   7.8958       2         92      0      1    0.0           1.0
    0.327519         0        180   36    0.0   0.0000       2         96      0      1    0.0           1.0
    0.327519         0        611   39    5.0  31.2750       2         88      1      0    1.0           7.0
    0.327519         0        814    6    2.0  31.2750       2          8      1      0    4.0           7.0
    0.327519         0        641   20    0.0   7.8542       2        161      0      1    0.0           1.0
    0.327519         0        214   30    0.0  13.0000       1        145      0      1    0.0           1.0
    178 rows X 12 columns
    2025-11-04 02:03:25,825 | INFO     | Updated dataset after performing customized anti-selection :
       embarked  survived  age  parch     fare  pclass  automl_id  sex_0  sex_1  sibsp  family_count
    0  0.533835         0   50    0.0  28.7125       0        109      1      0    0.0           1.0
    1  0.533835         1   56    1.0  83.1583       0         72      1      0    0.0           2.0
    2  0.533835         1   32    0.0  30.5000       0        176      0      1    0.0           1.0
    3  0.533835         1   15    0.0  14.4542       2         38      1      0    1.0           2.0
    4  0.533835         1   25    0.0  91.0792       0         86      0      1    1.0           2.0
    5  0.533835         1    1    2.0  37.0042       1        114      0      1    0.0           3.0
    6  0.400000         1   28    0.0   7.8792       2        127      1      0    0.0           1.0
    7  0.400000         1   33    0.0  90.0000       0         48      1      0    1.0           2.0
    8  0.400000         0   40    0.0   7.7500       2         16      0      1    0.0           1.0
    9  0.400000         0   28    0.0   7.7375       2         68      0      1    0.0           1.0
    178 rows X 11 columns
    2025-11-04 02:03:26,057 | INFO     | Performing transformation carried out in data preparation phase ...
    2025-11-04 02:03:26,803 | INFO     | Updated dataset after performing RFE feature selection:
              automl_id  age  sex_1  pclass  sex_0     fare  family_count
    survived
    1               169   17      0       0      1  57.0000           2.0
    1                36   63      0       0      1  77.9583           2.0
    1                64   45      1       0      0  26.5500           1.0
    1                76   26      0       2      1   7.9250           1.0
    1               108   32      1       2      0  56.4958           1.0
    1               116   30      0       0      1  86.5000           1.0
    0               177   24      1       1      0  73.5000           3.0
    0                28   28      1       2      0   8.0500           1.0
    0                32   22      1       2      0   7.8958           1.0
    0                56   28      1       2      0   7.8958           1.0
    178 rows X 8 columns
    2025-11-04 02:03:27,562 | INFO     | Updated dataset after performing scaling on RFE selected features :
       survived  r_sex_0  r_sex_1  automl_id     r_age  r_pclass    r_fare  r_family_count
    0         0        0        1         32  0.372549       1.0  0.138523             0.0
    1         0        0        1         92  0.490196       1.0  0.138523             0.0
    2         0        0        1         96  0.647059       1.0  0.000000             0.0
    3         0        0        1        104  0.490196       1.0  0.141228             0.0
    4         0        0        1        144  0.411765       1.0  0.282456             0.5
    5         0        1        0        152  0.490196       1.0  1.220175             5.0
    6         1        1        0        101  0.862745       0.0  0.922004             1.0
    7         1        1        0        165  0.607843       0.5  0.184211             0.0
    8         1        1        0        169  0.274510       0.0  1.000000             0.5
    9         1        0        1        181  0.647059       0.0  0.461184             0.0
    178 rows X 8 columns
    2025-11-04 02:03:28,740 | INFO     | Updated dataset after performing scaling for PCA feature selection :
       survived  parch  sex_1  sex_0  automl_id  embarked       age      fare  pclass  sibsp  family_count
    0         1    0.0      0      1        169 -0.000267  0.274510  1.000000     0.0    0.5           0.5
    1         1    0.0      0      1         36 -0.000267  1.176471  1.367689     0.0    0.5           0.5
    2         1    0.0      1      0         64 -0.000267  0.823529  0.465789     0.0    0.0           0.0
    3         1    0.0      0      1         76 -0.000267  0.450980  0.139035     1.0    0.0           0.0
    4         1    0.0      1      0        108 -0.000267  0.568627  0.991154     1.0    0.0           0.0
    5         1    0.0      0      1        116 -0.000267  0.529412  1.517544     0.0    0.0           0.0
    6         0    0.0      1      0        177 -0.000267  0.411765  1.289474     0.5    1.0           1.0
    7         0    0.0      1      0         28 -0.000267  0.490196  0.141228     1.0    0.0           0.0
    8         0    0.0      1      0         32 -0.000267  0.372549  0.138523     1.0    0.0           0.0
    9         0    0.0      1      0         56 -0.000267  0.490196  0.138523     1.0    0.0           0.0
    178 rows X 11 columns
    2025-11-04 02:03:29,109 | INFO     | Updated dataset after performing PCA feature selection :
       automl_id     col_0     col_1     col_2     col_3     col_4     col_5  survived
    0        101  1.219747 -0.881510  0.328327  0.102559  0.086625  0.192121         1
    1        177 -0.124017 -0.926226  1.024970  0.066431  0.471474 -0.425253         0
    2        165  0.855218  0.148332 -0.454413 -0.120779 -0.011061  0.136053         1
    3         28 -0.584001  0.264865 -0.047374 -0.146007 -0.008639  0.028028         0
    4        169  1.148406 -0.670955  0.014315  0.021998  0.178162 -0.459917         1
    5         32 -0.581800  0.281829 -0.036110 -0.162284 -0.047981 -0.077502         0
    6        181 -0.402240 -0.649165 -0.455126  0.147058 -0.005639 -0.032572         1
    7         56 -0.584185  0.265557 -0.047415 -0.146158 -0.009595  0.028910         0
    8         36  1.155128 -0.889723 -0.066752  0.166061  0.602480  0.236110         1
    9         92 -0.584185  0.265557 -0.047415 -0.146158 -0.009595  0.028910         0
    10 rows X 8 columns
    2025-11-04 02:03:29,439 | INFO     | Data Transformation completed.
    2025-11-04 02:03:29,985 | INFO     | Following model is being picked for evaluation:
    2025-11-04 02:03:29,985 | INFO     | Model ID : DECISIONFOREST_0
    2025-11-04 02:03:29,985 | INFO     | Feature Selection Method : rfe
    2025-11-04 02:03:30,722 | INFO     | Applying SHAP for Model Interpretation...
    2025-11-04 02:03:32,813 | INFO     | SHAP Analysis Completed. Feature Importance Available.
    /root/automl_testing/pyTeradata/teradataml/automl/model_evaluation.py:380: UserWarning: FigureCanvasAgg is non-interactive, and thus cannot be shown
      plt.show()
    2025-11-04 02:03:32,886 | INFO     | Prediction :
       automl_id  prediction  prob_1  prob_0  survived
    0        169           1     1.0     0.0         1
    1         36           1     1.0     0.0         1
    2         64           1     1.0     0.0         1
    3         76           0     0.0     1.0         1
    4        108           1     1.0     0.0         1
    5        116           1     1.0     0.0         1
    6        177           1     1.0     0.0         0
    7         28           0     0.0     1.0         0
    8         32           0     0.0     1.0         0
    9         56           0     0.0     1.0         0
    2025-11-04 02:03:34,773 | INFO     | ROC-AUC :
                  GINI
    AUC
    0.665346  0.330691
       threshold_value       tpr       fpr
    0         0.040816  0.797297  0.259615
    1         0.081633  0.797297  0.259615
    2         0.102041  0.797297  0.259615
    3         0.122449  0.797297  0.259615
    4         0.163265  0.797297  0.259615
    5         0.183673  0.797297  0.259615
    6         0.142857  0.797297  0.259615
    7         0.061224  0.797297  0.259615
    8         0.020408  0.797297  0.259615
    9         0.000000  1.000000  1.000000
    2025-11-04 02:03:35,155 | INFO     | Confusion Matrix :
    [[77 27]
     [15 59]]
    >>> prediction.head()
       automl_id  prediction  prob_1  prob_0  survived
    0        169           1     1.0     0.0         1
    1         36           1     1.0     0.0         1
    2         64           1     1.0     0.0         1
    3         76           0     0.0     1.0         1
    4        108           1     1.0     0.0         1
    5        116           1     1.0     0.0         1
    6        177           1     1.0     0.0         0
    7         28           0     0.0     1.0         0
    8         32           0     0.0     1.0         0
    9         56           0     0.0     1.0         0
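    The GINI value printed next to AUC in the ROC-AUC log is consistent with the standard relation Gini = 2 * AUC - 1. The sketch below checks this with the logged numbers; it assumes that relation is what the log reports, and `gini_from_auc` is a hypothetical helper, not a teradataml function.

```python
def gini_from_auc(auc):
    """Hypothetical helper: Gini coefficient from AUC via Gini = 2 * AUC - 1."""
    return 2 * auc - 1


logged_auc = 0.665346   # copied from the ROC-AUC log output
logged_gini = 0.330691  # copied from the ROC-AUC log output

computed_gini = gini_from_auc(logged_auc)

# The logged AUC is already rounded, so allow a small tolerance in the last digit.
print(abs(computed_gini - logged_gini) < 1e-5)  # True
```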
  9. Generate evaluation metrics on the test dataset using the best performing model.
    >>> performance_metrics = aml.evaluate(titanic_test)
    2025-11-04 02:04:10,024 | INFO     | Skipping data transformation as data is already transformed.
    2025-11-04 02:04:10,567 | INFO     | Following model is being picked for evaluation:
    2025-11-04 02:04:10,567 | INFO     | Model ID : DECISIONFOREST_0
    2025-11-04 02:04:10,567 | INFO     | Feature Selection Method : rfe
    2025-11-04 02:04:13,988 | INFO     | Performance Metrics :
           Prediction  Mapping  CLASS_1  CLASS_2  Precision    Recall        F1  Support
    SeqNum
    1               1  CLASS_2       27       59   0.686047  0.797297  0.737500       74
    0               0  CLASS_1       77       15   0.836957  0.740385  0.785714      104
    --------------------------------------------------------------------------------
       SeqNum              Metric  MetricValue
    0       3        Micro-Recall     0.764045
    1       5     Macro-Precision     0.761502
    2       6        Macro-Recall     0.768841
    3       7            Macro-F1     0.761607
    4       9     Weighted-Recall     0.764045
    5      10         Weighted-F1     0.765670
    6       8  Weighted-Precision     0.774219
    7       4            Micro-F1     0.764045
    8       2     Micro-Precision     0.764045
    9       1            Accuracy     0.764045
    >>> performance_metrics
           Prediction  Mapping  CLASS_1  CLASS_2  Precision    Recall        F1  Support
    SeqNum
    0               0  CLASS_1       77       15   0.836957  0.740385  0.785714      104
    1               1  CLASS_2       27       59   0.686047  0.797297  0.737500       74
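    The per-class and aggregate metrics above can be re-derived from the confusion matrix printed in the predict() log. The sketch below does this in plain Python without calling teradataml. It assumes rows of the matrix are actual classes and columns are predicted classes, which is inferred from the row sums matching the Support values (104 and 74).

```python
# Confusion matrix copied from the predict() log.
# Assumption: rows = actual class (0, 1), columns = predicted class (0, 1).
cm = [[77, 27],
      [15, 59]]

n_classes = len(cm)
total = sum(sum(row) for row in cm)

precision, recall, f1, support = [], [], [], []
for k in range(n_classes):
    tp = cm[k][k]
    predicted_k = sum(cm[i][k] for i in range(n_classes))  # column sum
    actual_k = sum(cm[k])                                   # row sum
    p = tp / predicted_k
    r = tp / actual_k
    precision.append(p)
    recall.append(r)
    f1.append(2 * p * r / (p + r))
    support.append(actual_k)

accuracy = sum(cm[k][k] for k in range(n_classes)) / total
macro_f1 = sum(f1) / n_classes
weighted_precision = sum(p * s for p, s in zip(precision, support)) / total

print(round(accuracy, 6))            # 0.764045
print(round(precision[1], 6))        # 0.686047
print(round(recall[1], 6))           # 0.797297
print(round(macro_f1, 6))            # 0.761607
print(round(weighted_precision, 6))  # 0.774219
```

    These values agree with the Accuracy, Macro-F1, and Weighted-Precision rows and with the Precision and Recall columns reported by aml.evaluate().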