Array Method | Teradata Package for Python - Array Method - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Product
Teradata Package for Python
Release Number
20.00
Published
March 2025
ft:locale
en-US
ft:lastEdition
2026-02-20
dita:mapPath
nvi1706202040305.ditamap
dita:ditavalPath
plt1683835213376.ditaval
dita:id
rkb1531260709148
Product Category
Teradata Vantage

Use the Array method to create array objects that can be used with the DataFrame Assign method.

To use array utilities, you must have either the UDTTYPE or UDTMETHOD privilege on the SYSUDTLIB database. Use the following commands to grant the required permissions:

GRANT UDTMETHOD ON SYSUDTLIB TO <USER>;
GRANT SELECT ON UDTInfo TO <USER>;

Required Parameter

elements
Specifies the elements of similar Teradata types or literal values or numpy arrays to store in an array.
If an empty tuple is provided, the atype defaults to ARRAY_VARCHAR with scope 100.

Optional Parameters

atype
Specifies the Teradata type of the array elements.

If the argument is not specified, then teradataml infers the type based on the first element of the array.

default_null
Specifies whether the array type prepopulates the missing values with NULL or not. When set to True, missing values are populated with NULL values, otherwise they are undefined.

Default value: False

Example setup

>>> from teradataml import load_example_data, Array, DataFrame
>>> from teradatasqlalchemy.types import ARRAY_NUMBER, ARRAY_TIMESTAMP
>>> import datetime

Load the 'sales' data and create a DataFrame.

>>> load_example_data("dataframe", "sales")
>>> df = DataFrame("sales")

Output

             Feb   Jan   Mar   Apr   datetime
accounts
Orange Inc 210.0   NaN   NaN 250.0 04/01/2017
Blue Inc    90.0  50.0  95.0 101.0 04/01/2017
Jones LLC  200.0 150.0 140.0 180.0 04/01/2017
Alpha Co   210.0 200.0 215.0 250.0 04/01/2017
Yellow Inc  90.0   NaN   NaN   NaN 04/01/2017
Red Inc    200.0 150.0 140.0   NaN 04/01/2017

Example 1: Create an Array with literal values and add it as a new column to the DataFrame

>>> arr = Array((1, 2, 3))
>>> arr
Array [elements=(1, 2, 3), atype=ARRAY_INTEGER, default_null=False]

Create 'arr_col' column using Array object.

>>> res = df.assign(arr_col = arr)
>>> res

Output

             Feb   Jan   Mar   Apr   datetime arr_col
accounts
Red Inc    200.0 150.0 140.0   NaN 04/01/2017 (1,2,3)
Blue Inc    90.0  50.0  95.0 101.0 04/01/2017 (1,2,3)
Jones LLC  200.0 150.0 140.0 180.0 04/01/2017 (1,2,3)
Alpha Co   210.0 200.0 215.0 250.0 04/01/2017 (1,2,3)
Yellow Inc  90.0   NaN   NaN   NaN 04/01/2017 (1,2,3)
Orange Inc 210.0   NaN   NaN 250.0 04/01/2017 (1,2,3)
>>> res.tdtypes
accounts VARCHAR(length=20, charset='LATIN')
Feb                                  FLOAT()
Jan                                 BIGINT()
Mar                                 BIGINT()
Apr                                 BIGINT()
datetime                              DATE()
arr_col               ARRAY_INTEGER('[100]')

Example 2: Create an Array with DataFrame columns with atype and default_null as True and add it as a new column to the DataFrame

>>> arr = Array((df.Jan, df.Feb, df.Mar), atype=ARRAY_NUMBER('[5]'), default_null=True)
>>> arr

Output

Array [elements=(<teradataml.dataframe.sql._SQLColumnExpression object at 0x00000194F3E9BE80>, <teradataml.dataframe.sql._SQLColumnExpression object at 0x00000194F3E9B610>, <teradataml.dataframe.sql._SQLColumnExpression object at 0x00000194F3E9B1F0>), atype=ARRAY_NUMBER, default_null=True]

Create 'arr_col' column using Array object.

>>> res = df.assign(arr_col = arr)
>>> res

Output

             Feb   Jan   Mar   Apr   datetime                  arr_col
accounts
Yellow Inc  90.0   NaN   NaN   NaN 04/01/2017  (NULL,90,NULL,NULL,NULL)
Jones LLC  200.0 150.0 140.0 180.0 04/01/2017   (150,200,140,NULL,NULL)
Alpha Co   210.0 200.0 215.0 250.0 04/01/2017   (200,210,215,NULL,NULL)
Red Inc    200.0 150.0 140.0   NaN 04/01/2017   (150,200,140,NULL,NULL)
Blue Inc    90.0  50.0  95.0 101.0 04/01/2017      (50,90,95,NULL,NULL)
Orange Inc 210.0   NaN   NaN 250.0 04/01/2017 (NULL,210,NULL,NULL,NULL)
>>> res.tdtypes

Output

accounts   VARCHAR(length=20, charset='LATIN')
Feb                                    FLOAT()
Jan                                   BIGINT()
Mar                                   BIGINT()
Apr                                   BIGINT()
datetime                                DATE()
arr_col ARRAY_NUMBER('[5]', default_null=True)

Example 3: Creating an Array to store timestamp-type values and adding it as a new column to the DataFrame

>>> arr = Array((datetime.datetime(2024, 6, 20, 12, 0, 52), datetime.datetime(2025, 6, 2, 15, 32)), atype=ARRAY_TIMESTAMP('[1:2]', timezone=True))
>>> arr

Output

Array [elements=(datetime.datetime(2024, 6, 20, 12, 0, 52), datetime.datetime(2025, 6, 2, 15, 32)), atype=ARRAY_TIMESTAMP, default_null=False]

Create 'arr_col' column using Array object.

>>> res = df.assign(arr_col = arr)
>>> res

Output

              Feb   Jan   Mar   Apr   datetime                                                            arr_col
accounts
Alpha Co    210.0 200.0 215.0 250.0 04/01/2017 (2024-06-20 12:00:52.000000-00:00,2025-06-02 15:32:00.000000-00:00)
Jones LLC   200.0 150.0 140.0 180.0 04/01/2017 (2024-06-20 12:00:52.000000-00:00,2025-06-02 15:32:00.000000-00:00)
Blue Inc     90.0  50.0  95.0 101.0 04/01/2017 (2024-06-20 12:00:52.000000-00:00,2025-06-02 15:32:00.000000-00:00)
Orange Inc  210.0   NaN  NaN  250.0 04/01/2017 (2024-06-20 12:00:52.000000-00:00,2025-06-02 15:32:00.000000-00:00)
Yellow Inc   90.0   NaN   NaN   NaN 04/01/2017 (2024-06-20 12:00:52.000000-00:00,2025-06-02 15:32:00.000000-00:00)
Red Inc     200.0 150.0 140.0   NaN 04/01/2017 (2024-06-20 12:00:52.000000-00:00,2025-06-02 15:32:00.000000-00:00)
>>> res.tdtypes

Output

accounts    VARCHAR(length=20, charset='LATIN')
Feb                                     FLOAT()
Jan                                    BIGINT()
Mar                                    BIGINT()
Apr                                    BIGINT()
datetime                                 DATE()
arr_col ARRAY_TIMESTAMP('[1:2]', timezone=True)