The teradataml package runs on the client system and is designed for data management, exploration, and execution of analytic functions. It provides the following groups of functions:
- Utility and database management functions
- Data exploration and preparation functions
- Analytic functions across Vantage
These functions support high-speed analytics processing required to operationalize analytics and automate data partitioning and parallel processing in Vantage.
teradataml, SQLAlchemy, and pandas
For Python users familiar with the pandas package, teradataml builds on the concepts and syntax of the pandas DataFrame object by providing the teradataml DataFrame object for data residing on Vantage.
The look and feel of a teradataml DataFrame is similar to that of a pandas DataFrame. A teradataml DataFrame is a reference, held on the Python client, to a database object; it can represent a table, view, or query in Analytics Database.
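As an illustration, the following sketch creates teradataml DataFrames from a table name and from a query; a view is referenced by name in the same way as a table. The function name preview_sales, the table name sales, and its columns are hypothetical, and the connection arguments are placeholders. The teradataml import sits inside the function so that the snippet loads even where the package is not installed.

```python
def preview_sales(host, username, password):
    """Sketch: reference Vantage objects and pull a small preview to the client.

    The table "sales" and its columns are hypothetical examples.
    """
    # Deferred import: requires the teradataml package and a reachable Vantage system.
    from teradataml import DataFrame, create_context, remove_context

    create_context(host=host, username=username, password=password)
    try:
        # Reference to a table (or view) by name; no rows are copied here.
        by_table = DataFrame("sales")

        # Reference to the result of a query.
        by_query = DataFrame.from_query(
            "SELECT region, amount FROM sales WHERE amount > 100"
        )

        # to_pandas() is the explicit request to transfer rows to the client;
        # head(5) keeps that transfer to a handful of rows.
        return by_table.head(5).to_pandas(), by_query.head(5).to_pandas()
    finally:
        # Close the connection created by create_context().
        remove_context()
```

Calling preview_sales with valid credentials would return two small pandas DataFrames, while the full table stays on Vantage.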
The teradataml library provides an API for accessing and manipulating a teradataml DataFrame. The API functions generate SQL requests that are executed in Analytics Database or ML Engine through a Python DB-API connection. The teradataml package uses teradatasqlalchemy, an implementation of SQLAlchemy’s Dialect interface, to provide enhanced support for rendering SQL. The teradatasqlalchemy dialect in turn relies on the teradatasql DB-API implementation to connect to Vantage.
The teradataml DataFrame supports lazy evaluation for a variety of operations on Vantage such as exploration, transformation, and machine learning. Only a subset of data is ever retrieved from Vantage, unless the user explicitly requests data to be transferred to the client.
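The idea of lazy evaluation over a DB-API connection can be sketched with a toy class. This is not how teradataml is implemented: LazyFrame and its methods are hypothetical, the in-memory SQLite database from the Python standard library stands in for Vantage, and conditions are passed as raw SQL strings purely for brevity.

```python
import sqlite3


class LazyFrame:
    """Toy stand-in for a lazily evaluated, database-backed frame."""

    def __init__(self, conn, table, columns="*", conditions=()):
        self.conn = conn
        self.table = table
        self.columns = columns
        self.conditions = tuple(conditions)

    def select(self, *columns):
        # Returns a new reference; nothing is sent to the database.
        return LazyFrame(self.conn, self.table, ", ".join(columns), self.conditions)

    def where(self, condition):
        # Also lazy: the condition is only recorded.
        return LazyFrame(
            self.conn, self.table, self.columns, self.conditions + (condition,)
        )

    def show_query(self):
        # Render the SQL that the recorded operations describe.
        sql = f"SELECT {self.columns} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql

    def head(self, n=5):
        # The explicit request for data: run the SQL and fetch at most n rows.
        cursor = self.conn.execute(self.show_query() + " LIMIT ?", (n,))
        return cursor.fetchall()


# In-memory SQLite plays the role of the remote database in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 50), ("west", 150), ("north", 300), ("south", 20)],
)
conn.commit()

# Record every statement the database runs from this point on.
executed = []
conn.set_trace_callback(executed.append)

frame = LazyFrame(conn, "sales").select("region", "amount").where("amount > 100")
query = frame.show_query()          # rendered on the client, not executed
statements_before = len(executed)   # still zero: building the frame ran no SQL

rows = frame.head()                 # the only call that reaches the database
statements_after = len(executed)

print(query)
print(rows)
```

Chaining select and where only builds up a description of the request; the database does the filtering when head is called, and only the matching rows cross the connection.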
These characteristics result in a pandas-like experience for users who want to perform analytics on Vantage from their preferred Python environment.