udf() Examples - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2025-01-23
Product Category: Teradata Vantage

Example 1: Create a UDF that adds the values in column 'Jan' to the values in column 'Feb' and stores the result in an INTEGER type column

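>>> from teradatasqlalchemy.types import INTEGER
>>> from teradataml import DataFrame, load_example_data
>>> from teradataml.dataframe.functions import udf
>>> # Load the example 'sales' table that the examples on this page operate on.
>>> load_example_data("dataframe", "sales")
>>> df = DataFrame("sales")
>>>
>>> # Note: the name 'sum' shadows Python's built-in sum() in this session.
>>> @udf(returns=INTEGER())
... def sum(x, y):
...     # Treat a missing 'Jan' value as 0 so the addition still succeeds.
...     if not x:
...         x = 0
...     return x + y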
>>>
>>> # Assign the Column Expression returned by user defined function
>>> # to the DataFrame.
>>> res = df.assign(len_sum = sum('Jan', 'Feb'))
>>> res
              Feb    Jan    Mar    Apr  datetime  len_sum
accounts                                                 
Alpha Co    210.0  200.0  215.0  250.0  17/01/04      410
Blue Inc     90.0   50.0   95.0  101.0  17/01/04      140
Yellow Inc   90.0    NaN    NaN    NaN  17/01/04       90
Jones LLC   200.0  150.0  140.0  180.0  17/01/04      350
Orange Inc  210.0    NaN    NaN  250.0  17/01/04      210
Red Inc     200.0  150.0  140.0    NaN  17/01/04      350
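
The row-level logic of this UDF can be checked locally before it is registered. The following is a minimal sketch that does not connect to Vantage, assuming missing values reach the function as None; `add_columns` is a hypothetical name and is not part of teradataml. Note that `if not x` in the example also treats a genuine 0 as missing (harmless for addition) and leaves a missing 'Feb' value unhandled, so the sketch guards both arguments explicitly.

```python
def add_columns(x, y):
    """Add two values, treating a missing (None) value as 0.

    Hypothetical helper for local testing; assumes NULLs arrive as None.
    """
    x = 0 if x is None else x
    y = 0 if y is None else y
    # int() mirrors the INTEGER return type declared in the example.
    return int(x + y)


print(add_columns(200.0, 210.0))  # 410
print(add_columns(None, 90.0))    # 90
print(add_columns(150.0, None))   # 150
```

The same body could then be decorated with `@udf(returns=INTEGER())` as shown above.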

Example 2: Create a function that converts the values in 'accounts' to upper case and pass it to udf() as a parameter

>>> from teradataml.dataframe.functions import udf
>>> def to_upper(s):
...     if s is not None:
...         return s.upper()
>>> upper_case = udf(to_upper)
>>>
>>> # Assign the Column Expression returned by user defined function
>>> # to the DataFrame.
>>> res = df.assign(upper_stats = upper_case('accounts'))
>>> res
              Feb    Jan    Mar    Apr  datetime upper_stats
accounts
Alpha Co    210.0  200.0  215.0  250.0  17/01/04    ALPHA CO
Blue Inc     90.0   50.0   95.0  101.0  17/01/04    BLUE INC
Yellow Inc   90.0    NaN    NaN    NaN  17/01/04  YELLOW INC
Jones LLC   200.0  150.0  140.0  180.0  17/01/04   JONES LLC
Orange Inc  210.0    NaN    NaN  250.0  17/01/04  ORANGE INC
Red Inc     200.0  150.0  140.0    NaN  17/01/04     RED INC
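
Because the function is written as an ordinary Python callable before it is passed to udf(), it can be exercised on plain values first. The sketch below is a local check only and does not connect to Vantage; it assumes that a missing 'accounts' value would reach the function as None, which is what the `is not None` guard suggests.

```python
def to_upper(s):
    # Falls through and returns None for missing input,
    # instead of raising AttributeError on None.upper().
    if s is not None:
        return s.upper()


# Local checks on plain Python values.
assert to_upper("Alpha Co") == "ALPHA CO"
assert to_upper("Jones LLC") == "JONES LLC"
assert to_upper(None) is None
```

Once the checks pass, `udf(to_upper)` wraps the same function for use in `df.assign()`.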

Example 3: Create a UDF that adds 4 days to the 'datetime' column and stores the result in a DATE type column

When working with date and time data types, format the values in one of the supported formats. See Requisite Input and Output Structures in Open Analytics Framework for details.

>>> from teradataml.dataframe.functions import udf
>>> from teradatasqlalchemy.types import DATE
>>> @udf(returns=DATE())
... def add_date(x, y):
...     import datetime
...     return (datetime.datetime.strptime(x, "%y/%m/%d")+datetime.timedelta(y)).strftime("%y/%m/%d")
>>>
>>> # Assign the Column Expression returned by user defined function
>>> # to the DataFrame.
>>> res = df.assign(new_date = add_date('datetime', 4))
>>> res
              Feb    Jan    Mar    Apr  datetime  new_date
accounts                                                  
Alpha Co    210.0  200.0  215.0  250.0  17/01/04  17/01/08
Blue Inc     90.0   50.0   95.0  101.0  17/01/04  17/01/08
Jones LLC   200.0  150.0  140.0  180.0  17/01/04  17/01/08
Orange Inc  210.0    NaN    NaN  250.0  17/01/04  17/01/08
Yellow Inc   90.0    NaN    NaN    NaN  17/01/04  17/01/08
Red Inc     200.0  150.0  140.0    NaN  17/01/04  17/01/08
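
The parse, shift, and format steps inside this UDF can be tried locally with the standard library alone. This is a minimal sketch, assuming the column value arrives as a string in the "%y/%m/%d" form shown in the output; `add_days` and `DATE_FORMAT` are hypothetical names used only for illustration.

```python
import datetime

# Two-digit year, month, day, matching the 'datetime' column as displayed.
DATE_FORMAT = "%y/%m/%d"


def add_days(value, days):
    """Parse a yy/mm/dd string, add a number of days, and format it back."""
    parsed = datetime.datetime.strptime(value, DATE_FORMAT)
    shifted = parsed + datetime.timedelta(days=days)
    return shifted.strftime(DATE_FORMAT)


print(add_days("17/01/04", 4))  # 17/01/08
print(add_days("17/12/30", 4))  # 18/01/03 (rolls over into the next year)
```

Passing `days` by keyword makes the unit explicit; `datetime.timedelta(y)` in the example relies on days being the first positional parameter.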

Example 4: Create a user defined function 'to_upper' that converts the values in the 'accounts' column to upper case and runs in a non-default environment

Create a Python 3.10.5 environment with the given name and description in Analytics Database.

>>> from teradataml import create_env
>>> env = create_env('test_udf', 'python_3.10.5', 'Test environment for UDF')
User environment 'test_udf' created.
>>>
>>> # Create a user defined function 'to_upper' that returns the values in upper case
>>> # and pass the user environment to run it in.
>>> from teradataml.dataframe.functions import udf
>>> @udf(env_name = env)
... def to_upper(s):
...     if s is not None:
...         return s.upper()
>>>
>>> # Assign the Column Expression returned by user defined function
>>> # to the DataFrame.
>>> df.assign(upper_stats = to_upper('accounts'))
              Feb    Jan    Mar    Apr  datetime upper_stats
accounts                                                    
Alpha Co    210.0  200.0  215.0  250.0  17/01/04    ALPHA CO
Blue Inc     90.0   50.0   95.0  101.0  17/01/04    BLUE INC
Yellow Inc   90.0    NaN    NaN    NaN  17/01/04  YELLOW INC
Jones LLC   200.0  150.0  140.0  180.0  17/01/04   JONES LLC
Orange Inc  210.0    NaN    NaN  250.0  17/01/04  ORANGE INC
Red Inc     200.0  150.0  140.0    NaN  17/01/04     RED INC

Example 5: Create a UDF with required functions inside the UDF itself

Define a function 'inner_add_date' inside the UDF that creates a date object from the year, month, and day passed to it and adds one day to that date. Call this function inside the UDF.

>>> from teradataml.dataframe.functions import udf
>>> @udf
... def add_date(y,m,d):
...     import datetime
...     def inner_add_date(y,m,d):
...         return datetime.date(y,m,d) + datetime.timedelta(1)
...     return inner_add_date(y,m,d)
>>> # Assign the Column Expression returned by user defined function
>>> # to the DataFrame.
>>> res = df.assign(new_date = add_date(2021, 10, 5))
>>> res
              Feb    Jan    Mar    Apr  datetime    new_date
accounts                                                    
Jones LLC   200.0  150.0  140.0  180.0  17/01/04  2021-10-06
Blue Inc     90.0   50.0   95.0  101.0  17/01/04  2021-10-06
Yellow Inc   90.0    NaN    NaN    NaN  17/01/04  2021-10-06
Orange Inc  210.0    NaN    NaN  250.0  17/01/04  2021-10-06
Alpha Co    210.0  200.0  215.0  250.0  17/01/04  2021-10-06
Red Inc     200.0  150.0  140.0    NaN  17/01/04  2021-10-06
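
Since the UDF is called with literal arguments rather than column names, every row receives the same computed value. The calculation itself can be reproduced locally; the following is a minimal sketch using only the standard library, with the helper lifted out of the UDF for testing.

```python
import datetime


def inner_add_date(y, m, d):
    """Build a date from year, month, day and add one day."""
    return datetime.date(y, m, d) + datetime.timedelta(days=1)


print(inner_add_date(2021, 10, 5))    # 2021-10-06
print(inner_add_date(2021, 12, 31))   # 2022-01-01 (month and year roll over)
```

Keeping the helper and its `import datetime` inside the UDF body, as in the example above, means the function carries everything it needs with it when it runs remotely.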