Use of teradataml UDF versus DataFrame Methods - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last Edition: 2026-01-07
Product Category: Teradata Vantage

Difference between apply, map_row, map_partition, and udf

DataFrame.apply()
  • Executes on every teradataml DataFrame row on VantageCloud Lake.
  • Returns a teradataml DataFrame.
  • Teradata recommends using the same Python interpreter version, and the same versions of the Python libraries used inside the function, in the local client environment and the server-side user environment.
  • Lambda functions are supported.
DataFrame.map_row()
  • Executes on every teradataml DataFrame row on VantageCloud Enterprise.
  • Returns a teradataml DataFrame.
  • Teradata recommends using the same Python interpreter version, and the same versions of the Python libraries used inside the function, in the local client environment and VantageCloud Enterprise.
  • Lambda functions are supported.
DataFrame.map_partition()
  • Executes on a group of teradataml DataFrame rows on VantageCloud Enterprise.
  • Returns a teradataml DataFrame.
  • Teradata recommends using the same Python interpreter version, and the same versions of the Python libraries used inside the function, in the local client environment and VantageCloud Enterprise.
  • Lambda functions are supported.
udf()
  • Executes on every teradataml DataFrame row on VantageCloud Enterprise.
  • Returns a teradataml DataFrame column.
  • Teradata recommends using the same Python interpreter version, and the same versions of the Python libraries used inside the function, in the local client environment and the server-side user environment.
  • Lambda functions are not supported.

udf vs apply vs map_row vs map_partition: When to use what in teradataml

UDF/Method When to use
udf()
  • Use udf() for simplicity and ease of use, and when you want to use the function across multiple sessions.
  • You want to run a Python function over every teradataml DataFrame row and return a single-value result for each row.
  • You want to return a teradataml DataFrame column instead of a teradataml DataFrame.

You can directly access each row’s column data by specifying the column name as an input to the Python function, unlike the other methods, where you must design the Python function to read the data from a Series object or an iterator (TextFileReader object) and manipulate it accordingly.

(VantageCloud Enterprise and VantageCore) With udf(), a Python function can return only a single value, whereas DataFrame.map_row() and DataFrame.map_partition() allow Python functions to either print output to standard output or return objects such as numpy 1-D or 2-D arrays, pandas Series, or pandas DataFrames.
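
As a minimal sketch of the udf() style described above, the function below takes a column value directly as its argument and returns one scalar per row. The function name to_title, the column name last_name, and the sample values are hypothetical. The teradataml calls appear only in comments because they need a live Vantage connection; the decorator form and the VARCHAR return type are assumptions to verify against your installed release.

```python
def to_title(last_name):
    """Return one scalar for one row's column value (udf-style contract)."""
    # The argument is a plain column value, not a Series or an iterator.
    if last_name is None:
        return None
    return last_name.strip().title()


# Local check with stand-in values; no Vantage connection is needed.
sample_values = ["  smith", "JONES ", None]
results = [to_title(value) for value in sample_values]
print(results)

# Assumed teradataml usage, not executed here (verify against your release):
#   from teradataml import udf
#   from teradatasqlalchemy.types import VARCHAR
#
#   @udf(returns=VARCHAR(100))
#   def to_title(last_name): ...
#
#   df = df.assign(last_name_title=to_title("last_name"))
```

Because the function only ever sees scalar column values, it can be tested locally like any ordinary Python function before it is registered.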

DataFrame.apply()
  • Supported on VantageCloud Lake.
  • You want to execute a lambda function that returns numpy 1-D or 2-D arrays, pandas Series, or pandas DataFrames.
  • You want to apply a Python function to each teradataml DataFrame row and return the result as a teradataml DataFrame.
  • You want to run the function on a group or partition of data.
DataFrame.map_row()
  • Supported on VantageCloud Enterprise and VantageCore.
  • You want to execute a lambda function that returns numpy 1-D or 2-D arrays, pandas Series, or pandas DataFrames.
  • You want to apply a Python function to each teradataml DataFrame row and return the result as a teradataml DataFrame.
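
The row-wise contract described for DataFrame.map_row() can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the function name add_bonus and the columns name, salary, and bonus are hypothetical, and a plain dict stands in for the pandas Series that represents a row, so the example runs without Vantage or pandas.

```python
def add_bonus(row):
    """Row-wise contract: receive one row, return one row-shaped result."""
    # Copy so the input row is left untouched; works for a dict or a pandas Series.
    out = row.copy()
    out["bonus"] = out["salary"] // 10
    return out


# A dict stands in for the pandas Series that the method would pass.
sample_row = {"name": "smith", "salary": 1200}
result = add_bonus(sample_row)
print(result)

# Assumed teradataml call, not executed here (verify against your release;
# the output column types may need to be declared through the method's arguments):
#   new_df = df.map_row(add_bonus)
```

Unlike the udf() style, the function here has to pick the columns it needs out of the row object itself.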
DataFrame.map_partition()
  • Supported on VantageCloud Enterprise and VantageCore.
  • You want to execute a lambda function that returns numpy 1-D or 2-D arrays, pandas Series, or pandas DataFrames.
  • You want to apply a Python function to a group or partition of rows in the teradataml DataFrame and return the result as a teradataml DataFrame.
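
The difference between row-wise and partition-wise execution can be illustrated locally with a pure-Python analogy. This is not teradataml code: the data, the column names dept and salary, and the functions per_row and per_partition are all hypothetical, and itertools.groupby stands in for the partitioning that the database would perform.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical stand-in data; in teradataml these would be DataFrame rows.
rows = [
    {"dept": "eng", "salary": 100},
    {"dept": "ops", "salary": 90},
    {"dept": "eng", "salary": 140},
]


def per_row(row):
    """Row-wise contract: sees one row, returns one result for it."""
    return {"dept": row["dept"], "salary_after_raise": row["salary"] + 10}


def per_partition(partition_rows):
    """Partition-wise contract: sees all rows of one group at once."""
    partition_rows = list(partition_rows)
    total = sum(r["salary"] for r in partition_rows)
    return {"dept": partition_rows[0]["dept"], "total_salary": total}


# Row-wise: the function is called once per row (analogous to map_row).
row_results = [per_row(r) for r in rows]

# Partition-wise: the function is called once per group (analogous to
# map_partition with a partitioning column). groupby needs sorted input.
key = itemgetter("dept")
partition_results = [
    per_partition(group) for _, group in groupby(sorted(rows, key=key), key=key)
]

print(row_results)
print(partition_results)
```

The row-wise function cannot compute an aggregate such as a department total because it never sees more than one row, which is the main reason to choose the partition-wise method.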