AutoClassifier is a special-purpose AutoML feature for running classification-specific tasks.
When configure.temp_object_type is set to "VT", AutoClassifier follows sequential execution.
Optional Arguments
- include
- Specifies the model algorithms to be used for the model training phase.
By default, all five models are used for training for regression and binary classification problems, while only three models are used for multiclass classification.
Permitted values are "glm", "svm", "knn", "decision_forest", "xgboost".
- exclude
- Specifies the model algorithms to be excluded from the model training phase.
No model is excluded by default.
Permitted values are "glm", "svm", "knn", "decision_forest", "xgboost".
- verbose
- Specifies how much detail about the execution steps is printed, based on the verbose level. Permitted values are:
- 0: prints the progress bar and leaderboard.
- 1: prints the execution steps of AutoML.
- 2: prints the intermediate data between the execution of each step of AutoML.
- max_runtime_secs
- Specifies the time limit in seconds for model training.
- stopping_metric
- Specifies the stopping metric for the stopping tolerance in model training. This argument is required if stopping_tolerance is set; otherwise, optional. Permitted values are:
- For task_type "Regression": "R2", "MAE", "MAPE", "MSE", "MSLE", "RMSE", "RMSLE", "MPE", "ME", "EV", "MPD", "MGD".
- For task_type "Classification": "MICRO-F1", "MACRO-F1", "MICRO-RECALL", "MACRO-RECALL", "MICRO-PRECISION", "MACRO-PRECISION", "WEIGHTED-PRECISION", "WEIGHTED-RECALL", "WEIGHTED-F1", "ACCURACY".
- stopping_tolerance
- Specifies the stopping tolerance for the stopping metric in model training. This argument is required if stopping_metric is set; otherwise, optional.
- max_models
- Specifies the maximum number of models to be trained.
- custom_config_file
- Specifies the path of the JSON file to use for a custom run.
- **kwargs
- Specifies additional arguments for AutoClassifier.
- seed
- Specifies the random seed for reproducibility.
Default value: 42
- persist
- Specifies whether to persist the interim results of the functions in a table or not. When set to True, results are persisted in a table; otherwise, results are garbage collected at the end of the session.
You must handle cleanup of persisted tables. Use get_persisted_tables() to view the list of persisted tables in the current session.
Default value: False
- imbalance_handling_method
- Specifies which data imbalance method to use for classification problems.
Default value: "SMOTE"
Permitted values: "SMOTE", "ADASYN", "SMOTETomek", "NearMiss".
- enable_lasso
- Specifies whether to use lasso regression for feature selection. By default, only RFE and PCA are used for feature selection.
Default value: False
- raise_errors
- Specifies whether non-blocking errors are raised as errors or as warnings. When set to True, errors are raised; otherwise, warnings are raised.
Default value: False
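
The constraints listed above can be checked before a run is started. The following is a minimal sketch in plain Python and is not part of teradataml: the helper name check_autoclassifier_args is hypothetical, and it encodes only the rules stated in this section (permitted model names, verbose levels, classification stopping metrics, imbalance handling methods, and the pairing of stopping_metric with stopping_tolerance). How include and exclude interact when both are given is an assumption made for illustration; the sketch treats exclude as a filter on whatever include selected.

```python
PERMITTED_MODELS = ("glm", "svm", "knn", "decision_forest", "xgboost")
CLASSIFICATION_METRICS = (
    "MICRO-F1", "MACRO-F1", "MICRO-RECALL", "MACRO-RECALL",
    "MICRO-PRECISION", "MACRO-PRECISION", "WEIGHTED-PRECISION",
    "WEIGHTED-RECALL", "WEIGHTED-F1", "ACCURACY",
)
IMBALANCE_METHODS = ("SMOTE", "ADASYN", "SMOTETomek", "NearMiss")


def check_autoclassifier_args(include=None, exclude=None, verbose=0,
                              stopping_metric=None, stopping_tolerance=None,
                              imbalance_handling_method="SMOTE"):
    """Hypothetical pre-flight check; returns the model names left to train."""
    # include and exclude may only name the five permitted algorithms.
    for name, models in (("include", include), ("exclude", exclude)):
        unknown = [m for m in (models or []) if m not in PERMITTED_MODELS]
        if unknown:
            raise ValueError(f"{name}: unknown model(s) {unknown}")

    if verbose not in (0, 1, 2):
        raise ValueError("verbose must be 0, 1, or 2")

    # stopping_metric and stopping_tolerance are each required when the
    # other is set, so exactly one of them being None is an error.
    if (stopping_metric is None) != (stopping_tolerance is None):
        raise ValueError("set stopping_metric and stopping_tolerance together")
    if stopping_metric is not None and stopping_metric not in CLASSIFICATION_METRICS:
        raise ValueError(f"unsupported stopping_metric: {stopping_metric!r}")

    if imbalance_handling_method not in IMBALANCE_METHODS:
        raise ValueError(
            f"unsupported imbalance_handling_method: {imbalance_handling_method!r}"
        )

    # Assumption: exclude filters the include list (default: all five models).
    selected = list(include) if include else list(PERMITTED_MODELS)
    return [m for m in selected if m not in (exclude or [])]


models = check_autoclassifier_args(
    exclude=["knn"], stopping_metric="MICRO-F1", stopping_tolerance=0.01
)
print(models)  # ['glm', 'svm', 'decision_forest', 'xgboost']
```

Arguments that pass this check could then be forwarded to AutoClassifier under the same names; the check itself needs no database connection.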