AutoML in teradataml - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Language: English (United States)
Last Update: 2024-12-18
Product Category: Teradata Vantage

Automated Machine Learning (AutoML) is a method for automating the entire machine learning pipeline. It covers the distinct phases of that pipeline: feature exploration, feature engineering, data preparation, model selection, model training with hyperparameter tuning, and model evaluation. By automating these tasks, AutoML reduces the need for manual intervention by trained data scientists and lowers the prerequisite knowledge required of beginners. As a result, users at any level of expertise can create machine learning models with little manual effort.

The following diagram gives an overview of the AutoML approach.

[Figure: AutoML insights]

teradataml AutoML consists of five phases, each of which automates a different part of the pipeline.

[Figure: The five phases of AutoML]
  • Feature Exploration: Explores the available features and reports insights such as column summaries, distinct-value counts for categorical features, outlier percentages, futile columns, and the target column distribution.
  • Feature Engineering: Handles data anomalies such as duplicate rows, missing values, and futile columns. It also applies feature transformations based on each feature's data type.
  • Data Preparation: Prepares the data for model training through feature selection, feature scaling, and splitting the data into training and validation sets.
  • Model Training: Performs hyperparameter tuning across the available models.
  • Model Evaluation: Assesses the trained models and generates a model leaderboard that lists each model's performance metrics, the feature selection method applied, and its rank. Ranks are listed in ascending order, and the model ranked 1 is the best performer for the given dataset.
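
To make the five phases concrete, the following is a minimal conceptual sketch in plain Python. It is not teradataml's implementation and does not use the teradataml API. The toy dataset, the function names (explore, engineer, prepare, knn_predict, train_and_evaluate), the nearest-neighbor model, and the hyperparameter grid are all hypothetical choices made only to illustrate how the phases hand off to one another.

```python
import statistics

# Hypothetical toy data: "const" is a futile (single-valued) column,
# one row is duplicated, and one "x" value is missing.
RAW = [
    {"x": 1.0, "const": "a", "y": 0},
    {"x": 2.0, "const": "a", "y": 0},
    {"x": 2.0, "const": "a", "y": 0},   # duplicate row
    {"x": None, "const": "a", "y": 0},  # missing value
    {"x": 4.0, "const": "a", "y": 0},
    {"x": 6.0, "const": "a", "y": 1},
    {"x": 7.0, "const": "a", "y": 1},
    {"x": 8.0, "const": "a", "y": 1},
    {"x": 9.0, "const": "a", "y": 1},
    {"x": 3.0, "const": "a", "y": 0},
]


def explore(rows, target):
    """Phase 1: count distinct values, flag futile columns, count target classes."""
    columns = [c for c in rows[0] if c != target]
    distinct = {c: len({r[c] for r in rows if r[c] is not None}) for c in columns}
    futile = [c for c, n in distinct.items() if n <= 1]
    target_distribution = {}
    for r in rows:
        target_distribution[r[target]] = target_distribution.get(r[target], 0) + 1
    return {"distinct": distinct, "futile": futile,
            "target_distribution": target_distribution}


def engineer(rows, futile):
    """Phase 2: drop duplicate rows and futile columns, impute missing values."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(r[c] for c in sorted(r))
        if key not in seen:
            seen.add(key)
            unique.append({c: v for c, v in r.items() if c not in futile})
    for c in unique[0]:
        present = [r[c] for r in unique if r[c] is not None]
        if len(present) < len(unique):  # column has missing values
            fill = statistics.median(present)
            for r in unique:
                if r[c] is None:
                    r[c] = fill
    return unique


def prepare(rows, feature):
    """Phase 3: min-max scale the feature, then split training/validation."""
    lo = min(r[feature] for r in rows)
    hi = max(r[feature] for r in rows)
    scaled = [{**r, feature: (r[feature] - lo) / (hi - lo)} for r in rows]
    validation = scaled[::3]  # every third row, so the split is deterministic
    training = [r for i, r in enumerate(scaled) if i % 3 != 0]
    return training, validation


def knn_predict(training, x, k):
    """Majority vote among the k training rows nearest to x."""
    nearest = sorted(training, key=lambda r: abs(r["x"] - x))[:k]
    votes = sum(r["y"] for r in nearest)
    return 1 if votes * 2 > k else 0


def train_and_evaluate(training, validation, k_grid):
    """Phases 4 and 5: try each hyperparameter value, then rank the models."""
    results = []
    for k in k_grid:
        correct = sum(knn_predict(training, r["x"], k) == r["y"] for r in validation)
        results.append({"model": "knn", "k": k, "accuracy": correct / len(validation)})
    results.sort(key=lambda m: m["accuracy"], reverse=True)  # best model first
    return [{"rank": i, **m} for i, m in enumerate(results, start=1)]


insights = explore(RAW, target="y")
clean = engineer(RAW, insights["futile"])
training, validation = prepare(clean, feature="x")
leaderboard = train_and_evaluate(training, validation, k_grid=[1, 3, 6])
for entry in leaderboard:
    print(entry)
```

In this sketch the leaderboard is sorted so that rank 1 holds the best validation accuracy, mirroring the ranking convention described above. A real AutoML run would compare several model families and metrics rather than a single toy classifier.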