DataFrame.apply() Setup and Usage

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2025-01-23
Product Category: Teradata Vantage
  • Set up an environment with the required packages using the Open Analytics Framework and use it for apply() execution. Once the environment is created, pass the environment name or an object of class UserEnv to apply().
  • The function requires the dill package, with the same version installed in both the remote environment and the local environment.
  • Teradata recommends using the same Python version in both the remote and local environments.
  • Teradata recommends using the same versions of Python libraries on the client machine and the Analytics Database machine.
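
The version requirements above can be checked before running anything remotely. The following is a minimal sketch that collects the local Python and package versions and compares them with a set of remote versions. The helper names `local_versions` and `find_mismatches` are hypothetical, and the remote versions are typed in by hand for illustration; how they are obtained from the remote environment is not shown here.

```python
import sys
from importlib import metadata


def local_versions(packages):
    """Return the local Python version and the installed version of each package."""
    versions = {"python": "{}.{}.{}".format(*sys.version_info[:3])}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed locally
    return versions


def find_mismatches(local, remote):
    """Return {name: (local_version, remote_version)} for every differing entry."""
    return {
        name: (local.get(name), remote.get(name))
        for name in sorted(set(local) | set(remote))
        if local.get(name) != remote.get(name)
    }


# Hypothetical remote versions, entered by hand for illustration only.
remote = {"python": "3.10.5", "dill": "0.3.8", "pandas": "2.1.4"}
local = local_versions(["dill", "pandas"])

for name, (here, there) in find_mismatches(local, remote).items():
    print(f"{name}: local={here} remote={there}")
```

An empty result from `find_mismatches` means the compared versions agree. For dill the versions must match; for Python and other libraries a match is recommended.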

Input of Python function

The Python function can accept as many arguments as required, but it must accept a pandas Series object as its first (positional) argument, corresponding to a row in the DataFrame. The user function therefore has access to the data to process in a familiar format. Design the function to read from the Series object and manipulate the data accordingly.
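
As an illustration of this contract, here is a minimal sketch of a row function that takes the row as its first positional argument plus two extra arguments. The function name `add_bonus`, the column names, and the values are made up for this example, and the call uses plain pandas `DataFrame.apply` with `axis=1` as a local stand-in, since pandas also hands each row to the function as a Series.

```python
import pandas as pd


def add_bonus(row, rate, currency="USD"):
    """Row function: the first positional argument is the row as a pandas Series."""
    bonus = round(row["salary"] * rate, 2)
    return pd.Series([row["name"], bonus, currency])


# Local stand-in data; the column names are illustrative only.
df = pd.DataFrame({"name": ["ann", "bob"], "salary": [1000.0, 2500.0]})

# With axis=1, pandas passes each row to the function as a Series.
# Extra positional and keyword arguments are forwarded to the function.
result = df.apply(add_bonus, axis=1, args=(0.1,), currency="EUR")
print(result)
```

Testing the function locally this way on a small sample makes it easier to confirm that it reads the expected fields from the Series before it is run remotely.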

Output of Python function

The Python function can either print its output to the standard output or return an object of a supported type, which is then printed to the standard output correctly. The supported types are:
  • pandas DataFrame having the same number of columns as expected in the output.
  • pandas Series representing a row in the output of the method and having the same number of columns as expected in the output.
  • numpy ndarray
    • One-dimensional: represents a row in the output, having the same number of columns as expected in the output.
    • Two-dimensional: represents a dataset (like a pandas DataFrame) having the same number of columns as expected in the output.
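
The supported return types listed above can be sketched as four small row functions. This example assumes an output with two columns (name, score); the function names and data are hypothetical, and the functions are called directly on a local pandas Series rather than through apply().

```python
import numpy as np
import pandas as pd

# Assumed output layout for this sketch: two columns (name, score).


def as_series(row):
    """One output row as a pandas Series with two values."""
    return pd.Series([row["name"], row["score"] * 2])


def as_dataframe(row):
    """Several output rows as a pandas DataFrame with two columns."""
    return pd.DataFrame({"name": [row["name"]] * 2,
                         "score": [row["score"], row["score"] * 2]})


def as_1d_array(row):
    """One output row as a one-dimensional ndarray."""
    return np.array([row["name"], row["score"] * 2], dtype=object)


def as_2d_array(row):
    """Several output rows as a two-dimensional ndarray (rows x columns)."""
    return np.array([[row["name"], row["score"]],
                     [row["name"], row["score"] * 2]], dtype=object)


row = pd.Series({"name": "ann", "score": 4})
for func in (as_series, as_dataframe, as_1d_array, as_2d_array):
    out = func(row)
    print(func.__name__, type(out).__name__, out.shape)
```

In every case the number of values per row (two here) matches the number of columns expected in the output.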

The object returned by the user function is printed to the standard output as delimited lines (one per row), using the specified delimiter and quotechar.

If the user function prints its output directly to the standard output (instead of returning an object of a supported type), it must itself apply the delimiter and quotechar, when specified, to format the printed output.
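
A function that prints its own output might handle the formatting along the following lines. This is a sketch under stated assumptions: the delimiter and quotechar values are placeholders that must match whatever was specified for the call, the function name `print_row` and its optional `out` parameter are hypothetical, and the use of Python's `csv` module with minimal quoting is one possible way to quote fields, not a documented requirement.

```python
import csv
import sys

import pandas as pd

DELIMITER = ","   # assumed value; use the delimiter specified for the call
QUOTECHAR = '"'   # assumed value; use the quotechar specified for the call


def print_row(row, out=None):
    """Row function that prints one delimited output line itself."""
    stream = sys.stdout if out is None else out
    writer = csv.writer(stream, delimiter=DELIMITER, quotechar=QUOTECHAR,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    # Fields containing the delimiter are wrapped in the quotechar.
    writer.writerow([row["name"], row["city"]])


print_row(pd.Series({"name": "ann", "city": "Austin, TX"}))
# prints: ann,"Austin, TX"
```

The second field is quoted because it contains the delimiter; without the quoting, the line would be read as three columns instead of two.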

Data written to the standard output is stored in a table, and the table is garbage collected at the end of the session.