Key Feature Additions and Changes | Teradata Package for Python 20.00

Teradata® Package for Python User Guide

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Product
Teradata Package for Python
Release Number
20.00
Published
March 2025
Product Category
Teradata Vantage

The following table lists the key feature additions and changes in the Teradata Package for Python, teradataml.

Date Release Description
November 2025 20.00.00.08
  • New Features/Functionality
    • Extended compatibility to IBM PowerPC with Python >= 3.9.
    • teradataml: AutoML
      • New methods added for AutoML(), AutoRegressor(), AutoClassifier(), AutoFraud(), AutoChurn(), and AutoCluster().

        get_transformed_data() - Returns the transformed data obtained from all feature selection methods for given input data.

        get_raw_data_with_id() - Returns the raw input data along with the ID column mapping.

        get_error_logs() - Returns the error logs for failed models generated during execution of AutoML.
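
        For example, after a fit the new methods can be called as follows (a sketch; the table and column names are placeholders):

        >>> from teradataml import AutoML, DataFrame
        >>> aml = AutoML(task_type="Classification")
        >>> aml.fit(DataFrame("titanic"), "survived")
        # Transformed data from all feature selection methods.
        >>> transformed = aml.get_transformed_data()
        # Raw input data along with the ID column mapping.
        >>> raw = aml.get_raw_data_with_id()
        # Error logs for models that failed during execution.
        >>> logs = aml.get_error_logs()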

  • Updates
    • teradataml: AutoML
      • New argument added for AutoML(), AutoRegressor(), AutoClassifier(), AutoFraud(), AutoChurn(), and AutoDataPrep().

        Added enable_lasso to enable lasso-based feature selection during data preparation. By default, only 'RFE' and 'PCA' are enabled for feature selection.

      • New arguments added for AutoML(), AutoRegressor(), AutoClassifier(), AutoFraud(), AutoChurn(), and AutoCluster().

        Added raise_errors to control whether non-blocking issues raise errors or only warnings. By default, it does not raise errors and continues processing with a user warning.

      • New argument added for fit().

        Added id_column to specify the ID column present in input data. By default, an AutoML-generated ID column automl_id is used for processing if the user does not provide one.

      • New argument added to predict().

        Added preserve_columns to preserve columns from the transformed data in prediction DataFrame.
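
        Taken together, the new arguments can be combined as in the following sketch (argument placement is assumed from the descriptions above; exact signatures may differ):

        >>> aml = AutoML(task_type="Classification", enable_lasso=True, raise_errors=False)
        # Use an existing ID column instead of the generated automl_id.
        >>> aml.fit(df, "survived", id_column="passenger_id")
        # Keep selected columns from the transformed data in the prediction output.
        >>> aml.predict(preserve_columns=["passenger_id"])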

  • Bug Fixes
    • Teradata BYOM now works with Secure Zone (CHBIT08).
November 2025 20.00.00.07
  • New Features/Functionality
    • Extended compatibility to Linux with ARM processors.
    • teradataml: DataFrame
      • DataFrame.df_type - Added new property df_type to get the type of the DataFrame.
      • DataFrame.as_of() - Function to support temporal time qualifiers on teradataml DataFrame.
      • DataFrame.closed_rows() - Function to retrieve closed rows from a DataFrame created on a transaction-time or bi-temporal table/view.
      • DataFrame.open_rows() - Function to retrieve open rows from a DataFrame created on a transaction-time or bi-temporal table/view.
      • DataFrame.historic_rows() - Function to retrieve historical rows from a DataFrame created on a valid-time or bi-temporal table/view.
      • DataFrame.future_rows() - Function to retrieve future rows from a DataFrame created on a valid-time or bi-temporal table/view.
      • DataFrame.create_view() - Function to create a view from the DataFrame object. This function helps the user to persist the DataFrame as a view, which can be used across sessions.
      • Added argument persist to DataFrame.from_dict(), DataFrame.from_pandas(), and DataFrame.from_records() to persist the created DataFrame.
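
      As an illustrative sketch of the temporal functions above (assuming "policy" is a bi-temporal table; the exact argument forms may differ):

        >>> from teradataml import DataFrame
        >>> df = DataFrame("policy")
        # Rows currently open in transaction time.
        >>> open_df = df.open_rows()
        # Rows already closed in transaction time.
        >>> closed_df = df.closed_rows()
        # Persist the DataFrame as a view that can be used across sessions.
        >>> df.create_view("policy_open_v")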
    • teradataml DataFrameColumn a.k.a. ColumnExpression
      • DataFrameColumn.begin() - Function to get beginning date or timestamp from a PERIOD column.
      • DataFrameColumn.end() - Function to get ending date or timestamp from a PERIOD column.
      • DataFrameColumn.between() - Function to check if the column value is between the lower and upper bounds.
    • teradataml: Functions
      • current_date() - Gets the current date based on the specified time zone.
      • current_timestamp() - Gets the current timestamp based on the specified time zone.
    • teradataml: General Functions
      • Data Transfer Utility
        • copy_to_sql()

          A new argument partition_by partitions the index while writing to the database.

          A new argument partition_by_case handles different cases for partitioning the index while writing to the database.

          A new argument partition_by_range partitions the data based on a range while writing to the database.

          A new argument sub_partition subpartitions the main partition according to the provided value.

          New keyword arguments valid_time_columns and derived_column help copy data into temporal tables.
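
          For example, the new partitioning arguments might be used as in the following sketch (table and column names are placeholders, and the exact accepted value forms may differ):

          >>> from teradataml import copy_to_sql
          >>> copy_to_sql(df=pandas_df,
                          table_name="sales_partitioned",
                          partition_by="region")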

    • Enterprise Feature Store
      • FeatureStore - Main class for managing Feature Store operations with comprehensive methods and properties.
        • Properties:

          data_domain - Gets or sets the data domain of feature store.

          grant - Grants access to the FeatureStore.

          repo - Gets or sets the repository name.

          revoke - Revokes access from the FeatureStore.

          version - Gets the version of the FeatureStore.

        • Methods:

          apply() - Adds Feature, Entity, DataSource, FeatureGroup to FeatureStore.

          archive_data_source() - Archives a specified DataSource.

          archive_entity() - Archives a specified Entity.

          archive_feature() - Archives a specified Feature.

          archive_feature_group() - Archives a specified FeatureGroup.

          archive_feature_process() - Archives a specified FeatureProcess.

          delete() - Deletes the FeatureStore and all its components.

          delete_data_source() - Deletes an archived DataSource.

          delete_entity() - Deletes an archived Entity.

          delete_feature() - Deletes an archived Feature.

          delete_feature_group() - Deletes an archived FeatureGroup.

          delete_feature_process() - Deletes an archived FeatureProcess.

          get_data() - Gets data based on features, entities, and processes.

          get_data_domain() - Retrieves DataDomain object.

          get_data_source() - Gets DataSources associated with FeatureStore.

          get_dataset_catalog() - Retrieves the DatasetCatalog object.

          get_entity() - Gets Entity associated with FeatureStore.

          get_feature() - Gets Feature associated with FeatureStore.

          get_feature_group() - Gets FeatureGroup associated with FeatureStore.

          get_feature_process() - Retrieves FeatureProcess based on arguments.

          get_feature_catalog() - Retrieves FeatureCatalog object.

          get_group_features() - Gets features from a specific feature group.

          list_data_sources() - Lists DataSources in the FeatureStore.

          list_entities() - Lists Entities in the FeatureStore.

          list_feature_groups() - Lists FeatureGroups in the FeatureStore.

          list_features() - Lists Features in the FeatureStore.

          list_feature_processes() - Lists all feature processes in the repo.

          list_feature_runs() - Lists feature process runs and execution status.

          list_feature_catalogs() - Lists all feature catalogs in the repo.

          list_data_domains() - Lists all data domains in the repo.

          list_dataset_catalogs() - Lists all dataset catalogs in the repo.

          list_repos() - Lists available repos configured for FeatureStore.

          mind_map() - Generates a mind map visualization of the feature store structure.

          remove_data_domain() - Removes the data domain from the feature store.

          repair() - Repairs the underlying FeatureStore schema on database.

          set_features_active() - Marks Features as active.

          set_features_inactive() - Marks Features as inactive.

          setup() - Sets up the FeatureStore for a repository.
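
        A typical workflow with these methods might look like the following sketch (the repository name and constructor arguments are assumptions based on the descriptions above):

        >>> from teradataml import FeatureStore
        >>> fs = FeatureStore(repo="sales_repo")
        # Set up the FeatureStore for the repository.
        >>> fs.setup()
        # Register components, then inspect the store.
        >>> fs.apply(feature_group)
        >>> fs.list_features()
        >>> fs.list_feature_groups()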

      • FeatureGroup - Represents a group of features with methods and properties.
        • Methods:

          apply() - Applies the feature group to objects.

          from_DataFrame() - Creates a FeatureGroup from a DataFrame.

          from_query() - Creates a FeatureGroup from a query.

          ingest_features() - Ingests features from the FeatureGroup into the FeatureStore.

          remove_feature() - Removes a feature from the FeatureGroup.

          reset_labels() - Resets the labels of the FeatureGroup.

          set_labels() - Sets the labels of the FeatureGroup.

        • Properties:

          features - Gets the features in the FeatureGroup.

          labels - Gets or sets the labels of the FeatureGroup.

      • DataDomain - Represents a data domain within the FeatureStore with properties.
        • Properties:

          entities - Gets the entities in the data domain.

          features - Gets the features in the data domain.

          processes - Gets the feature processes in the data domain.

          datasets - Gets the datasets in the data domain.

      • FeatureCatalog - Manages features within a specific data domain.
        • Properties:

          data_domain - Gets the data domain of the catalog.

          features - Gets the features in the catalog.

          entities - Gets the entities in the catalog.

        • Methods:

          upload_features() - Uploads features to the catalog.

          list_features() - Lists features in the catalog.

          list_feature_versions() - Lists feature versions in the catalog.

          archive_features() - Archives features in the catalog.

          delete_features() - Deletes features from the catalog.

      • DatasetCatalog - Manages datasets within a specific data domain.
        • Properties:

          data_domain - Gets the data domain of the catalog.

        • Methods:

          build_dataset() - Builds a dataset from features and entities.

          build_time_series() - Builds a time series dataset.

          list_datasets() - Lists datasets in the catalog.

          list_entities() - Lists entities available for dataset building.

          list_features() - Lists features available for dataset building.

          get_dataset() - Gets a specific dataset by ID.

          archive_datasets() - Archives datasets in the catalog.

          delete_datasets() - Deletes datasets from the catalog.

      • Dataset - Represents a specific dataset in the catalog.
        • Properties:

          features - Gets the features in the dataset.

          entity - Gets the entity of the dataset.

          view_name - Gets the view name of the dataset.

          id - Gets the ID of the dataset.

      • FeatureProcess - Represents a feature processing workflow.
        • Properties:

          process_id - Gets the process ID.

          df - Gets the DataFrame associated with the process.

          features - Gets the features in the process.

          entity - Gets the entity in the process.

          data_domain - Gets the data domain of the process.

          filters - Gets the filters applied to the process.

          as_of - Gets the as_of parameter of the process.

          description - Gets the description of the process.

          start_time - Gets the start time of the process.

          end_time - Gets the end time of the process.

          status - Gets the status of the process.

        • Methods:

          run() - Executes the feature process with optional filters and as_of parameters.

    • OpensourceML
      • td_sklearn - Supports input from OTF tables.
    • BYOM Function
      • ONNXSeq2Seq() - Applies a sequence-to-sequence model in Vantage that was created outside Vantage and stored in ONNX format.
    • teradataml: AutoFraud (Automated Machine Learning - Fraud Detection)

      AutoFraud is a special purpose AutoML pipeline designed for fraud detection tasks. It automates the end-to-end process of data preprocessing, feature engineering, model training, evaluation, and deployment to efficiently identify fraudulent activities.

      • Methods:

        __init__() - Instantiates an object of AutoFraud.

        fit() - Performs fit on specified data and target column.

        leaderboard() - Gets the leaderboard for the AutoFraud pipeline, with diverse models, feature selection methods, and performance metrics.

        leader() - Shows best performing model and its details such as feature selection method and performance metrics.

        predict() - Performs prediction on the data using the best model or the model of user's choice from the leaderboard.

        evaluate() - Performs evaluation on the data using the best model or the model of user's choice from the leaderboard.

        load() - Loads the saved model from database.

        deploy() - Saves the trained model inside database.

        remove_saved_model() - Removes the saved model in database.

        model_hyperparameters() - Returns the hyperparameters of fitted or loaded models.

        get_persisted_tables() - Lists the persisted tables created during AutoFraud execution.

        visualize() - Generates visualizations to analyze and understand the underlying patterns in the data.

        generate_custom_config() - Generates custom config JSON file required for customized run of AutoFraud.
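
      An end-to-end AutoFraud run might look like the following sketch (data and column names are placeholders):

        >>> from teradataml import AutoFraud, DataFrame
        >>> af = AutoFraud()
        >>> af.fit(DataFrame("transactions"), "is_fraud")
        # Inspect the trained pipelines and the best model.
        >>> af.leaderboard()
        >>> af.leader()
        # Score new data with the best model.
        >>> af.predict(DataFrame("new_transactions"))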

    • teradataml: AutoChurn (Automated Machine Learning - Churn Prediction)

      AutoChurn is a special purpose AutoML pipeline for customer churn prediction. It automates the end-to-end process of data preprocessing, feature engineering, model training, evaluation, and deployment to efficiently identify customers likely to churn.

      • Methods:

        __init__() - Instantiates an object of AutoChurn.

        fit() - Performs fit on specified data and target column.

        leaderboard() - Gets the leaderboard for the AutoChurn pipeline, with diverse models, feature selection methods, and performance metrics.

        leader() - Shows best performing model and its details such as feature selection method and performance metrics.

        predict() - Performs prediction on the data using the best model or the model of user's choice from the leaderboard.

        evaluate() - Performs evaluation on the data using the best model or the model of user's choice from the leaderboard.

        load() - Loads the saved model from database.

        deploy() - Saves the trained model inside database.

        remove_saved_model() - Removes the saved model in database.

        model_hyperparameters() - Returns the hyperparameters of fitted or loaded models.

        get_persisted_tables() - Lists the persisted tables created during AutoChurn execution.

        visualize() - Generates visualizations to analyze and understand the underlying patterns in the data.

        generate_custom_config() - Generates custom config JSON file required for customized run of AutoChurn.

    • teradataml: AutoCluster (Automated Machine Learning - Clustering)

      AutoCluster is a special purpose AutoML pipeline for clustering analysis. It automates the end-to-end process of data preprocessing, feature engineering, model training, and prediction to efficiently group data into clusters and extract insights from unlabeled datasets.

      • Methods:

        __init__() - Instantiates an object of AutoCluster.

        fit() - Performs fit on specified data.

        leaderboard() - Gets the leaderboard for the AutoCluster pipeline, with diverse models, feature selection methods, and performance metrics.

        leader() - Shows best performing model and its details such as feature selection method and performance metrics.

        predict() - Performs prediction (cluster assignment) on the data using the best model or the model of user's choice from the leaderboard.

        model_hyperparameters() - Returns the hyperparameters of fitted or loaded models.

        get_persisted_tables() - Lists the persisted tables created during AutoCluster execution.

        generate_custom_config() - Generates custom config JSON file required for customized run of AutoCluster.

  • Updates
    • teradataml: Functions
      • udf() - Added support for td_buffer to cache data in the user-defined function.
July 2025 20.00.00.06
  • New Features/Functionality
    • teradataml: SDK
      • Added new client teradataml.sdk.Client to make REST calls through SDK.
      • Added a new exception, TeradatamlRestException, specifically for REST APIs; its json_response attribute provides properly printable JSON.
      • Exposed different ways of authentication through Client.
      • Client credentials authentication through ClientCredentialsAuth class.
      • Device code authentication through DeviceCodeAuth class.
      • Bearer authentication through BearerAuth class.

    • teradataml: ModelOps SDK
      • teradataml exposes Python interfaces for all the REST APIs provided by Teradata Vantage ModelOps.
      • Added support for blueprint() method which prints available classes in modelops module.
      • Added new client ModelOpsClient with some additional functions compared to teradataml.sdk.Client.
      • teradataml classes are added for the schema in ModelOps OpenAPI specification.
        >>> from teradataml.sdk.modelops import ModelOpsClient, Projects
        >>> from teradataml.common.exceptions import TeradatamlRestException
        >>> from teradataml.sdk import DeviceCodeAuth, BearerAuth, ClientCredentialsAuth # Authentication related classes.
        >>> from teradataml.sdk.modelops import models # All classes related to OpenAPI schema are present in this module.
        # Print available classes in modelops module.
        >>> from teradataml.sdk.modelops import blueprint
        >>> blueprint()
        # Create ClientCredentialsAuth object and create ModelOpsClient object.
        >>> cc_obj = ClientCredentialsAuth(auth_client_id="<client_id>",
                                           auth_client_secret="<client_secret>",
                                           auth_token_url="https://<example.com>/token")
        >>> client = ModelOpsClient(base_url="<base_url>", auth=cc_obj, ssl_verify=False)
        # Create Projects object.
        >>> p = Projects(client=client)
        # Create project using `body` argument taking object of ProjectRequestBody.
        >>> project_payload = {
                "name": "dummy_project",
                "description": "dummy_project created for testing",
                "groupId": "<group_ID>",
                "gitRepositoryUrl": "/app/built-in/empty",
                "branch": "<branch>"
            }
        >>> p.create_project(body=models.ProjectRequestBody(**project_payload))
    • teradataml: Functions
      • get_formatters() - Get the formatters for NUMERIC, DATE and CHAR types.
    • teradataml: DataFrame Methods
      • get_snapshot() - Gets the snapshot data of a teradataml DataFrame created on an OTF table for a given snapshot ID or timestamp.
      • from_pandas() - Creates a teradataml DataFrame from a pandas DataFrame.
      • from_records() - Creates a teradataml DataFrame from a list.
      • from_dict() - Creates a teradataml DataFrame from a dictionary.
    • teradataml: DataFrame Property
      • history - Returns snapshot history for a DataFrame created on an OTF table.
      • manifests - Returns manifest information for a DataFrame created on an OTF table.
      • partitions - Returns partition information for a DataFrame created on an OTF table.
      • snapshots - Returns snapshot information for a DataFrame created on an OTF table.
    • teradataml DataFrameColumn a.k.a. ColumnExpression
      • DataFrameColumn.rlike() - Function to match a string against a regular expression pattern.
      • DataFrameColumn.substring_index() - Function to return the substring from a column before a specified delimiter, up to a given occurrence count.
      • DataFrameColumn.count_delimiters() - Function to count the total number of occurrences of a specified delimiter.
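
      An illustrative sketch of the new string functions (the column names are placeholders, and the argument order of substring_index() and count_delimiters() is assumed from the descriptions above):

        >>> df = DataFrame("web_logs")
        # Match the column against a regular expression pattern.
        >>> df.assign(is_secure=df.url.rlike("^https"))
        # Substring before the second occurrence of the "/" delimiter.
        >>> df.assign(prefix=df.url.substring_index("/", 2))
        # Total occurrences of the "/" delimiter.
        >>> df.assign(depth=df.url.count_delimiters("/"))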
  • Updates
    • teradataml DataFrameColumn a.k.a. ColumnExpression
      • DataFrameColumn.like()
        • Added argument escape_char to specify the escape character for the LIKE pattern.
        • Argument pattern now accepts DataFrameColumn as input.
      • DataFrameColumn.ilike()
        • Argument pattern now accepts DataFrameColumn as input.
        • Added argument escape_char to specify the escape character for the ILIKE pattern.
      • DataFrameColumn.parse_url() - Added argument key to extract a specific query parameter when url_part is set to "QUERY".
      • teradataml: DataFrame function
        • groupby(), cube(), and rollup()

          Added argument include_grouping_columns to include aggregations on the grouping columns.

        • DataFrame() - Added new argument data that accepts input data to create a teradataml DataFrame.
      • teradataml Options
        • Configuration Options
          • configure.use_short_object_name specifies whether to use a shorter name for temporary database objects which are created by teradataml internally.
      • BYOM Function
        • Supports special characters.
May 2025 20.00.00.05
  • New Features/Functionality
    • teradataml AutoML
      • New methods added for AutoML(), AutoRegressor(), and AutoClassifier():
        • get_persisted_tables() - List the persisted tables created during AutoML execution.
        • visualize() - Generates visualizations to analyze and understand the underlying patterns in the data.
    • AutoDataPrep - Automated Data Preparation

      AutoDataPrep simplifies the data preparation process by automating the different aspects of data cleaning and transformation, enabling seamless exploration, transformation, and optimization of datasets.

      Methods of AutoDataPrep

      • __init__() - Instantiates an object of AutoDataPrep with given parameters.
      • fit() - Performs fit on specified data and target column.
      • get_data() - Retrieves the data after AutoDataPrep.
      • load() - Loads the saved datasets from the database.
      • deploy() - Persists the datasets generated by AutoDataPrep in the database.
      • delete_data() - Deletes the deployed dataset from the database.
      • visualize() - Generates visualizations to analyze and understand the underlying patterns in the data.
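
      For example, an AutoDataPrep run might look like the following sketch (data and column names are placeholders, and argument-free calls are shown for brevity; some methods may require additional parameters):

        >>> from teradataml import AutoDataPrep, DataFrame
        >>> adp = AutoDataPrep()
        >>> adp.fit(DataFrame("housing"), "price")
        # Retrieve the prepared datasets.
        >>> prepared = adp.get_data()
        # Persist the generated datasets in the database.
        >>> adp.deploy()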
    • teradataml: In-Database Analytic Functions
      • New Analytics Database Analytic Functions:

        Apriori()

        NERExtractor()

        TextMorph()

    • teradataml: Functions

      td_range() - Creates a DataFrame with a specified range of numbers.

    • teradataml DataFrameColumn a.k.a. ColumnExpression

      DataFrameColumn.to_number() - Function converts a string-like representation of a number to NUMBER type.
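
      For example (a sketch; the td_range() signature and the source column are assumptions):

        >>> from teradataml import td_range, DataFrame
        # DataFrame with a range of numbers.
        >>> nums = td_range(10)
        # Convert a string-like column to NUMBER.
        >>> sales = DataFrame("sales")
        >>> sales.assign(amount_num=sales.amount_str.to_number())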

  • Updates
    • teradataml: DataFrame function

      DataFrame.agg(): You can request different percentiles when running the agg function.

      New argument debug is added to DataFrame.map_row(), DataFrame.map_partition(), DataFrame.apply(), and udf(). During execution of these functions, teradataml internally generates scripts, which are garbage collected implicitly. To debug failures, this argument lets you control garbage collection of the script. When set to False (default), the generated script is garbage collected; otherwise, the script is kept, its path is displayed, and you are responsible for removing it if required.

      map_row(), map_partition(), and apply()
      • Raises a TeradataMlException if the Python interpreter major version differs between the Vantage Python environment and the local user environment.
      • Displays a warning if the dill package version differs between the Vantage Python environment and the local user environment.

      DataFrame.describe() - Argument include is no longer supported.

      assign() - Optimized SQL query to enhance the performance for consecutive assign calls.
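
      For example, percentiles can be requested through agg() roughly as follows (a sketch; the exact percentile spelling accepted by agg() may differ):

        >>> df = DataFrame("titanic")
        >>> df.agg({"fare": ["percentile(25)", "percentile(75)", "mean"]})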

    • teradataml: Context Creation
      create_context()
      • Enables user to set the authentication token while creating the connection. This authentication token is required to access services running on Teradata Vantage.
      • New argument sql_timeout is added to specify timeout for SQL statement execution triggered from the current session.
    • teradataml: UAF Functions

      Integer values are now accepted as valid input for function arguments that accept float type.

    • General functions

      copy_to_sql() and read_csv() support the VECTOR data type.

    • teradataml DataFrameColumn a.k.a. ColumnExpression
      String Functions
      • DataFrameColumn.substr() - Arguments start_pos and length now accept DataFrameColumn as input.
      • DataFrameColumn.to_char() - Argument formatter now accepts DataFrameColumn as input.
    • teradataml: In-Database Analytic Functions
      Updated Analytics Database Analytic Functions:
      • SMOTE() is now supported on 17.20.00.00 as well.
      • TextParser()

        New arguments added: enforce_token_limit, delimiter_regex, doc_id_column, list_positions, token_frequency, output_by_word

March 2025 20.00.00.04
  • New Features/Functionality
    • teradataml: DataFrame
      • Introduced a new feature 'Exploratory Data Analysis UI' (EDA UI) to enhance the user experience of teradataml with Jupyter notebook. EDA UI is displayed by default when a teradataml DataFrame is printed in the Jupyter notebook.
      • You can control the EDA UI using the new configuration option display.enable_ui; disable it by setting display.enable_ui to False.
      • EDA UI is based on ipywidgets. In some cases, ipywidgets do not work in a Jupyter notebook. If this occurs, Teradata recommends re-selecting the kernel.
      • New Function

        get_output() is added to get the result of Analytic function when executed from EDA UI.

    • OpenSourceML
      • td_lightgbm - A teradataml OpenSourceML module
        • deploy() - You can now deploy models created by the lightgbm Booster and sklearn modules. Deploying a model stores it in Vantage for future use with td_lightgbm.
          • td_lightgbm.deploy() - Deploy the lightgbm Booster or any scikit-learn model trained outside Vantage.
          • td_lightgbm.train().deploy() - Deploys the lightgbm Booster object trained within Vantage.
          • td_lightgbm.<sklearn_class>().deploy() - Deploys lightgbm's sklearn class object created/trained within Vantage.
        • load() - You can load the deployed models back in the current session so you can use the lightgbm functions with the td_lightgbm module.
          • td_lightgbm.load() - Load the deployed model in the current session.
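
        For example (a sketch; the model-name argument and training inputs are assumptions):

          >>> from teradataml import td_lightgbm
          # Train within Vantage and deploy the resulting Booster object.
          >>> model = td_lightgbm.train(params={}, train_set=lgbm_data)
          >>> model.deploy(model_name="lgb_model_1")
          # Later, load the deployed model into the current session.
          >>> loaded = td_lightgbm.load(model_name="lgb_model_1")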
    • FeatureStore

      FeatureStore.delete() function is added to drop the Feature Store and corresponding repo from Vantage.

    • Database Utility

      db_python_version_diff() - Identifies the Python interpreter major version difference between the interpreter installed on Vantage and the interpreter in the local user environment.

      db_python_package_version_diff() - Identifies Python package version differences between packages installed on Vantage and those in the local user environment.

    • BYOM Function

      ONNXEmbeddings() - Calculates embedding values in Vantage using an embeddings model that was created outside Vantage and stored in ONNX format.

    • teradataml Options
      • Configuration Options

        configure.temp_object_type - You can create volatile tables or views for teradataml internal use. By default, teradataml internally creates views for some operations. This new configuration option lets you create volatile tables instead of views, providing greater flexibility for users who lack the necessary permissions to create views or who need to create views on tables without WITH GRANT permission.

      • Display Options

        display.enable_ui - Specifies whether to display exploratory data analysis UI when DataFrame is printed. By default, this option is enabled (True), allowing exploratory data analysis UI to be displayed. When set to False, exploratory data analysis UI is hidden.
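
        For example, both options are set as attributes (the "VT" value for temp_object_type is an assumption; check the option's documentation for accepted values):

        >>> from teradataml import configure, display
        # Create volatile tables instead of views for internal objects.
        >>> configure.temp_object_type = "VT"
        # Hide the exploratory data analysis UI.
        >>> display.enable_ui = False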

  • Updates
    • teradataml: DataFrame function
      • describe()

        New argument added: pivot.

        When pivot is set to False, non-numeric columns are no longer supported for generating statistics. Use CategoricalSummary and ColumnSummary.

      • fillna(): Accepts new argument partition_column to partition the data and impute null values accordingly.
      • Optimized performance for DataFrame.plot()

        DataFrame.plot() will not regenerate the image when run more than once with same arguments.

    • Hyper Parameter Tuner

      GridSearch() and RandomSearch() now display a message to refer to get_error_log() API when model training fails in HPT.

    • teradataml Options

      Configuration Options: configure.indb_install_location determines the installation location of the In-DB Python package based on the installed RPM version.

    • teradataml Context Creation

      create_context(): Enables you to create a connection using parameters set in the environment or in a config file, in addition to the previous method. The newly added options help users hide sensitive data from the script.

    • OpenSourceML

      Raises a TeradataMlException if the Python interpreter major version is different between the Teradata Package for Python environment and the local user environment.

      Displays a warning if specific Python package versions differ between the Teradata Package for Python environment and the local user environment.

  • Bug Fixes
    • td_lightgbm OpenSourceML module: In the multi-model case, td_lightgbm.Dataset().add_features_from() should add features of one partition in the first Dataset to features of the same partition in the second Dataset. Previously this was not the case and the function failed; this has been fixed.
    • Fixed a minor bug in Shap() and changed argument training_method to a required argument.
    • Fixed PCA-related warnings in AutoML.
    • AutoML no longer fails when data with all categorical columns is provided.
    • Fixed AutoML issue with upsampling method.
    • Excluded the identifier column from outlier processing in AutoML.
    • DataFrame.set_index() no longer modifies the original DataFrame's index when argument append is used.
    • concat() function now supports DataFrames with column names that start with a digit, contain special characters, or contain reserved keywords.
    • create_env() proceeds to install the remaining files even if the current file installation fails.
    • Corrected the error message being raised in create_env() when authentication is not set.
    • Added missing argument charset for Vantage Analytics Library functions.
    • New argument seed is added to AutoML, AutoRegressor, and AutoClassifier to ensure consistent results.
    • Analytic functions now work even if column names of the underlying tables contain non-ASCII characters.
December 2024 20.00.00.03
  • teradataml no longer supports setting the auth_token using set_config_params(). Use set_auth_token() to set the token.
  • New Features/Functionality
    • teradataml: DataFrame
      • New Function

        alias() - Creates a DataFrame with alias name.

      • New Property

        db_object_name - Gets the underlying database object name, on which DataFrame is created.

    • teradataml: GeoDataFrame
      • New Function

        alias() - Creates a GeoDataFrame with alias name.

    • teradataml: DataFrameColumn a.k.a. ColumnExpression
      • Arithmetic Functions
        • DataFrameColumn.isnan() - Evaluates expression to determine if the floating-point argument is a NaN (Not-a-Number) value.
        • DataFrameColumn.isinf() - Evaluates expression to determine if the floating-point argument is an infinite number.
        • DataFrameColumn.isfinite() - Evaluates expression to determine if it is a finite floating value.
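
        For example (a sketch; table and column names are placeholders):

          >>> df = DataFrame("measurements")
          >>> df.assign(is_bad=df.reading.isnan() | df.reading.isinf(),
                        is_ok=df.reading.isfinite())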
    • FeatureStore - handles feature management within the environment
      • FeatureStore Components
        • Feature - Represents a feature which is used in ML Modeling.
        • Entity - Represents the columns that serve as the unique identifier for the data used in ML Modeling.
        • DataSource - Represents the source of Data.
        • FeatureGroup - Collection of Feature, Entity and DataSource.
          • Methods
          • apply() - Adds Feature, Entity, DataSource to a FeatureGroup.
          • from_DataFrame() - Creates a FeatureGroup from teradataml DataFrame.
          • from_query() - Creates a FeatureGroup using a SQL query.
          • remove() - Removes Feature, Entity, or DataSource from a FeatureGroup.
          • reset_labels() - Removes the labels assigned to the FeatureGroup that were set using set_labels().
          • set_labels() - Sets the Features as labels for a FeatureGroup.
          • Properties
          • features - Get the features of a FeatureGroup.
          • labels - Get the labels of FeatureGroup.
      • FeatureStore
        • Methods
        • apply() - Adds Feature, Entity, DataSource, FeatureGroup to FeatureStore.
        • archive_data_source() - Archives a specified DataSource from a FeatureStore.
        • archive_entity() - Archives a specified Entity from a FeatureStore.
        • archive_feature() - Archives a specified Feature from a FeatureStore.
        • archive_feature_group() - Archives a specified FeatureGroup from a FeatureStore. The method also archives the underlying Feature, Entity, and DataSource.
        • delete_data_source() - Deletes an archived DataSource.
        • delete_entity() - Deletes an archived Entity.
        • delete_feature() - Deletes an archived Feature.
        • delete_feature_group() - Deletes an archived FeatureGroup.
        • get_data_source() - Get the DataSources associated with FeatureStore.
        • get_dataset() - Get the teradataml DataFrame based on Features, Entities and DataSource from FeatureGroup.
        • get_entity() - Get the Entity associated with FeatureStore.
        • get_feature() - Get the Feature associated with FeatureStore.
        • get_feature_group() - Get the FeatureGroup associated with FeatureStore.
        • list_data_sources() - List DataSources.
        • list_entities() - List Entities.
        • list_feature_groups() - List FeatureGroups.
        • list_features() - List Features.
        • list_repos() - List available repos which are configured for FeatureStore.
        • repair() - Repairs the underlying FeatureStore schema on database.
        • set_features_active() - Marks the Features as active.
        • set_features_inactive() - Marks the Features as inactive.
        • setup() - Set up the FeatureStore for a repo.
        • Properties
        • repo - Property for FeatureStore repo.
        • grant - Property to Grant access on FeatureStore to user.
        • revoke - Property to Revoke access on FeatureStore from user.
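
        A typical workflow combines these components (a sketch; the repo and table names are placeholders, and argument names are indicative rather than exact):

        from teradataml import DataFrame, FeatureGroup, FeatureStore

        # Create the FeatureStore schema in a repo (database).
        fs = FeatureStore("vfs_demo")
        fs.setup()

        # Derive a FeatureGroup from an existing teradataml DataFrame.
        df = DataFrame("sales")
        fg = FeatureGroup.from_DataFrame(name="sales_fg", df=df, entity_columns="accounts")

        # Register the group, then materialize a dataset for modeling.
        fs.apply(fg)
        dataset = fs.get_dataset("sales_fg")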
    • teradataml: Table Operator Functions
      • Image2Matrix() - Converts an image into a matrix.
    • teradataml: Analytic Functions
      • Database Engine 20 Functions
        • CFilter()
        • NaiveBayes()
        • TDNaiveBayesPredict()
        • Shap()
        • SMOTE()
      • teradataml: Unbounded Array Framework (UAF) Functions:
        • CopyArt()
    • General Functions
      • File Management Functions
        • list_files() - List the installed files in Database.
    • OpenSourceML: LightGBM

      teradataml adds support for the LightGBM package through the OpenSourceML (OpenML) feature.

      The following functionality is added:

      • td_lightgbm - Interface object to run lightgbm functions and classes through Database Engine 20.

        Example usage:

        from teradataml import td_lightgbm, DataFrame
         
        df_train = DataFrame("multi_model_classification")
         
        feature_columns = ["col1", "col2", "col3", "col4"]
        label_columns = ["label"]
        part_columns = ["partition_column_1", "partition_column_2"]
         
        df_x = df_train.select(feature_columns)
        df_y = df_train.select(label_columns)
         
        # Dataset creation.
        # Single model case.
        obj_s = td_lightgbm.Dataset(df_x, df_y, silent=True, free_raw_data=False)
         
        # Multi model case.
        obj_m = td_lightgbm.Dataset(df_x, df_y, free_raw_data=False, partition_columns=part_columns)
        obj_m_v = td_lightgbm.Dataset(df_x, df_y, free_raw_data=False, partition_columns=part_columns)
         
        # Model training.
        # Single model case.
        opt = td_lightgbm.train(params={}, train_set=obj_s, num_boost_round=30)

        opt.predict(data=df_x, num_iteration=20, pred_contrib=True)

        # Multi model case.
        rec = {}  # Dictionary populated by the record_evaluation callback.
        opt = td_lightgbm.train(params={}, train_set=obj_m, num_boost_round=30,
                                callbacks=[td_lightgbm.record_evaluation(rec)],
                                valid_sets=[obj_m_v, obj_m_v])

        # Passing `label` argument to get it returned in output DataFrame.
        opt.predict(data=df_x, label=df_y, num_iteration=20)
      • Added support for accessing scikit-learn APIs using exposed interface object td_lightgbm.
    • teradataml: Functions
      • register() - Registers a user defined function (UDF).
      • call_udf() - Calls a registered user defined function (UDF) and returns ColumnExpression.
      • list_udfs() - List all the UDFs registered using 'register()' function.
      • deregister() - Deregisters a user defined function (UDF).
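
        Example usage (a sketch; the function body and column names are illustrative):

        from teradataml import register, call_udf, list_udfs, deregister

        # An ordinary Python function to be registered as a UDF.
        def to_upper(s):
            return s.upper()

        register("to_upper", to_upper)

        # Call the registered UDF on a DataFrame column.
        df = df.assign(upper_name=call_udf("to_upper", ('name',)))

        list_udfs()
        deregister("to_upper")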
    • teradataml: Options
      • Configuration Options

        table_operator - Specifies the name of table operator.

  • Updates
    • General functions

      set_auth_token() - Added base_url parameter which accepts the CCP url.

      'ues_url' will be deprecated in the future, and you will need to specify 'base_url' instead.

    • teradataml: DataFrame function
      • join()

        Now supports compound ColumnExpression having more than one binary operator in on argument.

        Now supports ColumnExpression containing FunctionExpression(s) in on argument.

        Self-join now expects an aliased DataFrame in the other argument.
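
        For example, a compound condition together with an aliased self-join (a sketch; table and column names are placeholders):

        from teradataml import DataFrame

        df = DataFrame("employee")
        df_mgr = df.alias("mgr")

        # Compound ColumnExpression with more than one binary operator in `on`.
        joined = df.join(other=df_mgr,
                         on=[(df.manager_id == df_mgr.emp_id) & (df.dept_id == df_mgr.dept_id)],
                         how="inner")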

    • teradataml: GeoDataFrame function
      • join()

        Now supports compound ColumnExpression having more than one binary operator in on argument.

        Now supports ColumnExpression containing FunctionExpressions in on argument.

        Self-join now expects an aliased DataFrame in the other argument.

    • teradataml: Unbounded Array Framework (UAF) Functions
      • SAX() - Default value added for window_size and output_frequency.
      • DickeyFuller()

        Supports TDAnalyticResult as input.

        Default value added for max_lags.

        Removed parameter drift_trend_formula.

        Updated permitted values for algorithm.

    • teradataml: AutoML
      • AutoML, AutoRegressor, and AutoClassifier

        Now supports DECIMAL datatype as input.

    • teradataml: Database Engine 20 Analytic Functions
      • TextParser()

        Argument name covert_to_lowercase changed to convert_to_lowercase.

  • Bug Fixes:
    • db_list_tables() now returns correct results when '%' is used.
October 2024 20.00.00.02
  • New Features/Functionality
    • teradataml: Database Engine 20 Analytic Functions
      • New Database Engine 20 Analytic Functions:
        • TFIDF()
        • Unpivoting()
        • Pivoting()
      • New Unbounded Array Framework (UAF) Functions:
        • AutoArima()
        • DWT()
        • DWT2D()
        • FilterFactory1d()
        • IDWT()
        • IDWT2D()
        • IQR()
        • Matrix2Image()
        • SAX()
        • WindowDFFT()
    • teradataml: Functions
      • udf() - Creates a user defined function (UDF) and returns ColumnExpression.
      • materialize() - Persists the DataFrame in the database for the current session.
      • create_temp_view() - Creates a temporary view on the DataFrame for the current session.
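
        Example usage of udf() (a sketch; the return-type import and column names are illustrative):

        from teradataml import udf, DataFrame
        from teradatasqlalchemy.types import VARCHAR

        # Define a UDF; applying it to a column returns a ColumnExpression.
        @udf(returns=VARCHAR(100))
        def to_upper(s):
            return s.upper()

        df = DataFrame("employees")
        df = df.assign(upper_name=to_upper('name'))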
    • teradataml: DataFrame
      • New function set_session_param() is added to set the database session parameters.
      • New function unset_session_param() is added to unset database session parameters.
    • teradataml DataFrameColumn a.k.a. ColumnExpression
      • _Date Time Functions_
        • DataFrameColumn.to_timestamp() - Converts string or integer value to a TIMESTAMP data type or TIMESTAMP WITH TIME ZONE data type.
        • DataFrameColumn.extract() - Extracts date component to a numeric value.
        • DataFrameColumn.to_interval() - Converts a numeric value or string value into INTERVAL_DAY_TO_SECOND or INTERVAL_YEAR_TO_MONTH value.
      • _String Functions_
        • DataFrameColumn.parse_url() - Extracts a part from a URL.
      • _Arithmetic Functions_
        • DataFrameColumn.log - Returns the logarithm value of the column with respect to 'base'.
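
      Illustrative usage of the new column functions (a sketch; column names and argument values are placeholders):

      from teradataml import DataFrame

      df = DataFrame("web_logs")
      df = df.assign(
          ts=df.event_time_str.to_timestamp(format='YYYY-MM-DD HH24:MI:SS'),
          host=df.url.parse_url('HOST'),
          log10_amount=df.amount.log(10)
      )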
    • teradataml: AutoML
      • New methods added for AutoML(), AutoRegressor(), and AutoClassifier():
        • evaluate() - Performs evaluation on the data using the best model, or the model of the user's choice from the leaderboard.
        • load() - Loads the saved model from the database.
        • deploy() - Saves the trained model inside the database.
        • remove_saved_model() - Removes the saved model from the database.
        • model_hyperparameters() - Returns the hyperparameters of fitted or loaded models.
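
        A minimal sketch of the save/load cycle (argument names are indicative; consult the AutoML reference for exact signatures):

        from teradataml import AutoML

        aml = AutoML(task_type="Classification")
        aml.fit(train_df, train_df.label)

        # Persist the best model, then reload and evaluate it later.
        aml.deploy("my_automl_model")
        loaded = AutoML.load("my_automl_model")
        loaded.evaluate(test_df)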
  • Updates
    • teradataml: AutoML
      • AutoML(), AutoRegressor()
        • New performance metrics added for task type regression i.e., "MAPE", "MPE", "ME", "EV", "MPD" and "MGD".
      • AutoML(), AutoRegressor() and AutoClassifier()
        • New arguments added: volatile, persist.
        • predict() - Data input is now mandatory for generating predictions. Default model evaluation is now removed.
    • teradataml: Options
      • set_config_params()
        • The following arguments will be deprecated in the future: ues_url and auth_token
    • DataFrameColumn.cast(): Accepts 2 new arguments format and timezone.
    • DataFrame.assign(): Accepts ColumnExpressions returned by udf().
  • teradataml DataFrame
    • to_pandas() - Returns a pandas DataFrame with Decimal columns typed as float instead of object. To keep them typed as object, set the argument coerce_float to False.
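
      For example (a sketch, assuming df is a teradataml DataFrame with Decimal columns):

      # Default: Decimal columns are returned as float.
      pdf = df.to_pandas()

      # Keep Decimal columns as object (Python Decimal values).
      pdf_obj = df.to_pandas(coerce_float=False)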
  • Database Utility
    • list_td_reserved_keywords() - Accepts a list of strings as argument.
  • Updates to existing UAF Functions:
    • ACF() - round_results parameter removed as it was used for internal testing.
    • BreuschGodfrey() - Added default_value 0.05 for parameter significance_level.
    • GoldfeldQuandt()
      • Removed parameters weights and formula.
      • Replaced parameter orig_regr_paramcnt with const_term.
      • Changed description for parameter algorithm. Refer to the documentation for more details.
      • Note: This will break backward compatibility.
    • HoltWintersForecaster() - Default value of parameter seasonal_periods removed.
    • IDFFT2() - Removed parameter output_fmt_row_major as it was used for internal testing.
    • Resample() - Added parameter output_fmt_index_style.
  • Bug Fixes
    • KNN predict() function can now predict on test data that doesn't contain the target column.
    • Metrics functions are supported on the Lake system.
    • The following OpenSourceML sklearn module functions are fixed:
      • sklearn.ensemble:
        • ExtraTreesClassifier - apply()
        • ExtraTreesRegressor - apply()
        • RandomForestClassifier - apply()
        • RandomForestRegressor - apply()
      • sklearn.impute:
        • SimpleImputer - transform(), fit_transform(), inverse_transform()
        • MissingIndicator - transform(), fit_transform()
      • sklearn.kernel_approximation:
        • Nystroem - transform(), fit_transform()
        • PolynomialCountSketch - transform(), fit_transform()
        • RBFSampler - transform(), fit_transform()
      • sklearn.neighbors:
        • KNeighborsTransformer - transform(), fit_transform()
        • RadiusNeighborsTransformer - transform(), fit_transform()
      • sklearn.preprocessing:
        • KernelCenterer - transform()
        • OneHotEncoder - transform(), inverse_transform()
    • OpenSourceML returns teradataml objects for model attributes and functions instead of sklearn objects, so that the user can perform further operations like score() and predict() on top of the returned objects.
    • AutoML predict() function now generates correct ROC-AUC value for positive class.
    • deploy() method of Script and Apply classes retries model deployment if there are intermittent network issues.
August 2024 20.00.00.01
  • teradataml no longer supports Python versions less than 3.8.
  • Added new feature - Personal Access Token (PAT) support in teradataml

    set_auth_token() - teradataml now supports authentication via PAT in addition to OAuth 2.0 Device Authorization Grant (formerly known as the Device Flow).

    It accepts the UES URL, Personal Access Token (PAT), and private key file generated from the VantageCloud Lake Console, and the optional arguments username and expiration_time (in seconds).
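
    For example (values are placeholders; argument names are indicative rather than exact):

    from teradataml import set_auth_token

    # Authenticate to VantageCloud Lake using a PAT and a private key file.
    set_auth_token(ues_url="https://<ues-host>/open-analytics",
                   pat_token="<personal-access-token>",
                   pem_file="/path/to/private_key.pem",
                   username="alice",
                   expiration_time=3600)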

  • Updated Database Engine 20 analytic functions:
    • ANOVA() - New arguments added: group_name_column, group_value_name, group_names, num_groups for data containing group values and group names.
    • FTest() - New arguments added: sample_name_column, sample_name_value, first_sample_name, second_sample_name.
    • GLM()
      • Supports stepwise regression and accept new arguments stepwise_direction, max_steps_num, and initial_stepwise_columns.
      • New arguments added: attribute_data, parameter_data, iteration_mode, and partition_column.
    • GetFutileColumns() - Arguments category_summary_column and threshold_value are now optional.
    • KMeans() - New argument added: initialcentroids_method.
    • NonLinearCombineFit() - Argument result_column is now optional.
    • ROC() - Argument positive_class is now optional.
    • SVMPredict() - New argument added: model_type.
    • ScaleFit()
      • New arguments added: ignoreinvalid_locationscale, unused_attributes, attribute_name_column, attribute_value_column.
      • Arguments attribute_name_column, attribute_value_column, and target_attributes are supported for sparse input.
      • Arguments attribute_data, parameter_data, and partition_column are supported for partitioning.
    • ScaleTransform() - New arguments added: attribute_name_column and attribute_value_column support for sparse input.
    • TDGLMPredict() - New arguments added: family and partition_column.
    • XGBoost() - New argument base_score is added for initial prediction value for all data points.
    • XGBoostPredict() - New argument detailed is added for detailed information of each prediction.
    • ZTest() - New arguments added: sample_name_column, sample_value_column, first_sample_name, and second_sample_name.
  • teradataml: AutoML
    • AutoML(), AutoRegressor(), and AutoClassifier() - New argument max_models is added as an early stopping criterion to limit the maximum number of models to be trained.
  • teradataml: DataFrame functions
    • DataFrame.agg() - Accepts ColumnExpressions and list of ColumnExpressions as arguments.
  • teradataml: General Functions
    • Data Transfer Utility - fastload() updates
      • Improved error and warning table handling with the following new arguments.
        • err_staging_db
        • err_tbl_name
        • warn_tbl_name
        • err_tbl_1_suffix
        • err_tbl_2_suffix
    • Change in behavior of the save_errors argument: when save_errors is set to True, error information is available in two persistent tables, ERR_1 and ERR_2; when set to False, error information is available in a single pandas DataFrame.
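
      For example (a sketch; table and database names are placeholders):

      from teradataml import fastload

      fastload(df=pandas_df,
               table_name="sales_stage",
               save_errors=True,
               err_staging_db="err_db",
               err_tbl_name="sales_err",
               warn_tbl_name="sales_warn")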
    • Garbage collector location is now configurable. You can set configure.local_storage to a desired location.
  • Updates:
    • UAF functions now work if the database name has special characters.
    • OpenSourceML can now read and process NULL/nan values.
    • Boolean values output will now be returned as VARBYTE column with 0 or 1 values in OpenSourceML.
    • Fixed bug for Apply's deploy().
    • Fixed an issue with volatile table creation: the table is now created in the correct database, that is, the user's spool space, regardless of the temp database specified.
    • ColumnTransformer function now processes its arguments in the order they are passed.
March 2024 20.00.00.00
  • Added new feature - teradataml open-source machine learning functions (teradataml OpenSourceML) that dynamically exposes open-source packages through Teradata Vantage. It provides an interface object through which exposed classes and functions of open-source packages can be accessed with the same syntax and arguments.
  • Added new feature - AutoML that automates the process of building, training, and validating machine learning models. It involves automation of various aspects of the machine learning workflow, such as feature exploration, feature engineering, data preparation, model training and evaluation for given dataset.
  • Added new deploy() method to deploy models generated after running a script: in the database when connected to VantageCloud Enterprise (as part of the Script table operator), or in the user environment when connected to VantageCloud Lake (as part of the Apply table operator).
  • Added new DataFrame manipulation functions cube(), rollup(), and replace().
  • Added eight categories of new DataFrame Column functions:
    • Bit Byte Manipulation Functions
    • Comparison Functions
    • Date Time Functions
    • Hyperbolic Functions
    • Regular Arithmetic Functions
    • Regular Expression Functions
    • String Functions
    • Trigonometric Functions
  • Removed functionalities that have been deprecated:
    • Machine Learning Engine functions
    • Model Cataloging feature
    • Sandbox feature that supports testing script in both Script table operator and Apply table operator.
February 2024 17.20.00.07 Updated Open Analytics Framework APIs to support VantageCloud Lake use of Anaconda for building conda environments to run Python analytic workload on Open Analytics Framework:
  • Updated create_env() with new argument conda_env to specify whether the environment to be created is a conda environment or not.
  • Output of list environment APIs have a new column "conda" to show whether the environment is a conda environment or not.
  • Updated set_auth_token() to address the Open Analytics login issue with teradataml 17.20.00.05 and 17.20.00.06.
  • Updated list_user_envs() with new argument conda_env to specify whether to filter the conda environments when listing user environments.
January 2024 17.20.00.06
  • New teradataml DataFrame Column functions:
    • 19 new Bit Byte Manipulation Functions
    • 4 new Regular Expression Functions
    • 2 new Display Functions
  • New and updated Open Analytics Framework APIs:
    • Updated create_env() so the user can create one or more user environments using the new argument template by providing specifications in a template JSON file.
    • New UserEnv Class property models, and methods install_model() and uninstall_model() to list, install and uninstall models in user environment.
    • New UserEnv Class method snapshot() to take snapshot of user environment.
  • New BYOM function DataRobotPredict() to score data in Vantage using a model trained externally in DataRobot and stored in Vantage.
  • Updated DataFrame functions:
    • DataFrame.describe() method to accept argument statistics to specify the aggregate operation to perform.
    • DataFrame.sort() method to accept ColumnExpression, and enable sorting.
    • DataFrame.sample() method to support column stratification.
  • Updated general function view_log() to download the APPLY query logs.
  • Updated Database Engine 20 analytic functions so that arguments which accept floating-point numbers also accept integers.
  • Updated DataFrame.plot() function to ignore the null values while plotting data.
October 2023 17.20.00.05
  • New hyperparameter tuning feature to determine the optimal set of hyperparameters for the given dataset and learning model.
    • GridSearch algorithm covers all possible parameter values to identify optimal hyperparameters.
    • RandomSearch algorithm performs random sampling on hyperparameter space to identify optimal hyperparameters.
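
    A minimal sketch of the tuning workflow (the model function and parameter grid are illustrative; argument names are indicative):

    from teradataml import GridSearch, SVM

    # Parameter space: tuple values are the candidates to search over.
    params = {"input_columns": ["f1", "f2", "f3"],
              "response_column": "label",
              "batch_size": (50, 100),
              "learning_rate": ("optimal", "adaptive")}

    gs = GridSearch(func=SVM, params=params)
    gs.fit(data=train_df)
    print(gs.best_params_)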
  • New plotting feature to visualize analytic results.
  • New teradataml DataFrame functions:
    • DataFrame.plot() to generate plots on teradataml DataFrame.
    • DataFrame.itertuples() to iterate over teradataml DataFrame rows as namedtuples or list.
  • New teradataml GeoDataFrame function GeoDataFrame.plot() to generate plots on teradataml GeoDataFrame.
  • New BYOM function DataikuPredict() to score the data in Vantage using the model trained externally in Dataiku UI and stored in Vantage.
  • New teradataml DataFrame Column functions:
    • Regular Arithmetic Functions
    • Trigonometric Functions
    • Hyperbolic Functions
    • String Functions
  • New general function async_run_status() to check the status of asynchronous runs using unique run ids.
  • New teradataml configuration option configure.indb_install_location to specify the installation location of in-database Python package.
  • Updated Open Analytics Framework APIs:
    • set_auth_token() no longer accepts username and password. Instead, the function opens a browser session and the user authenticates in the browser.
    • User environments, files and libraries related APIs updated to support R environment.
  • Updated Unbounded Array Framework (UAF) function ArimaEstimate() to support the CSS algorithm via the algorithm argument.