The teradataml Package | Teradata Package for Python

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

The teradataml package runs on the client system and is designed for data management, exploration, and execution of analytic functions.

The current version of the teradataml package includes over 100 functions, organized into these functional areas:
  • Utility and database management functions
  • Data exploration and preparation functions
  • Analytic functions across Vantage

    These functions support the high-speed processing needed to operationalize analytics, and they automate data partitioning and parallel processing in Vantage.

teradataml, SQLAlchemy, and pandas

For Python users familiar with the pandas package, teradataml builds on the concept and syntax of the pandas DataFrame object: it provides a teradataml DataFrame object for data residing on Vantage.

The look and feel of a teradataml DataFrame is similar to that of a pandas DataFrame. A teradataml DataFrame is a client-side reference, held in Python, to a database object; it can represent a table, view, or query in Analytics Database.

The teradataml library provides an API to access or manipulate a teradataml DataFrame. The API functions generate an SQL request that is executed in the Analytics Database or ML Engine through a Python DB-API connection. The teradataml package uses teradatasqlalchemy, an implementation of SQLAlchemy’s Dialect interface, to provide enhanced support for rendering SQL. The teradatasqlalchemy dialect in turn uses the teradatasql DB-API implementation to connect to Vantage.
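The render-then-execute pattern described above can be sketched in plain Python. This is only an illustration, not teradataml code: the standard-library sqlite3 module stands in for a DB-API driver such as teradatasql, and the helpers quote_ident, render_filter, and run_filter, along with the sales table, are hypothetical names introduced for this example.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote an SQL identifier so a table or column name can be embedded safely."""
    return '"' + name.replace('"', '""') + '"'


def render_filter(table: str, column: str) -> str:
    """Render the SQL text for: rows of `table` whose `column` exceeds a bound value."""
    # The value itself is left as a placeholder; sqlite3 uses the `?` style.
    return f"SELECT * FROM {quote_ident(table)} WHERE {quote_ident(column)} > ?"


def run_filter(conn, table, column, threshold):
    """Render the request, then execute it through the DB-API connection."""
    sql = render_filter(table, column)
    cur = conn.cursor()
    cur.execute(sql, (threshold,))   # standard DB-API cursor.execute with parameters
    rows = cur.fetchall()
    cur.close()
    return sql, rows


# Stand-in database: an in-memory SQLite table instead of a Vantage table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (2, 250.0), (3, 75.5)],
)

sql, rows = run_filter(conn, "sales", "amount", 50)
print(sql)    # the generated request
print(rows)   # the rows the database returned for it
```

The caller works with a Python function and never writes SQL by hand. The split between rendering and execution loosely mirrors the split between a SQLAlchemy dialect and a DB-API driver.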

The teradataml DataFrame supports lazy evaluation for a variety of operations on Vantage, such as exploration, transformation, and machine learning. Only a subset of the data is retrieved from Vantage unless the user explicitly requests that data be transferred to the client.
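To make the idea of lazy evaluation concrete, here is a minimal conceptual sketch. It does not show how teradataml is implemented. LazyFrame and its methods are hypothetical, sqlite3 stands in for Vantage, and LIMIT is SQLite syntax used only for the demonstration. Each operation returns a new object that wraps a larger query, and nothing is sent to the database until rows are explicitly requested.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a lazily evaluated, database-backed frame (hypothetical)."""

    def __init__(self, conn, source, log=None):
        self._conn = conn
        self._source = source                    # a table name or a parenthesized subquery
        self.log = [] if log is None else log    # shared record of statements actually sent

    def _derive(self, sql):
        # Wrap the new query as a subquery; no database round trip happens here.
        return LazyFrame(self._conn, f"({sql})", self.log)

    def filter(self, condition):
        # `condition` is raw SQL text, which is acceptable only in a toy example.
        return self._derive(f"SELECT * FROM {self._source} WHERE {condition}")

    def select(self, *columns):
        return self._derive(f"SELECT {', '.join(columns)} FROM {self._source}")

    def _run(self, sql):
        self.log.append(sql)
        return self._conn.execute(sql).fetchall()

    def head(self, n=5):
        # Retrieve only a small subset of rows (LIMIT is SQLite syntax).
        return self._run(f"SELECT * FROM {self._source} LIMIT {int(n)}")

    def collect(self):
        # Explicit request to transfer every row to the client.
        return self._run(f"SELECT * FROM {self._source}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(i, i * 10.0) for i in range(1, 101)],
)

sales = LazyFrame(conn, "sales")
big = sales.filter("amount > 500").select("id")   # builds a query, runs nothing
statements_before = len(sales.log)                # no statement has been sent yet

preview = big.head(3)        # first statement: fetches three rows only
everything = big.collect()   # second statement: fetches all matching rows
```

In this sketch the chained filter and select calls cost nothing on the database side. head plays the role of a cheap preview, and collect plays the role of an explicit transfer to the client, comparable in spirit to converting a teradataml DataFrame to pandas with to_pandas().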

These characteristics result in a pandas-like experience for users who want to perform analytics on Vantage from their preferred Python environment.