UDCF Table Operator Example - Teradata VantageCloud Lake

Lake - Using Queries, UDFs, and External Stored Procedures

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-08-12
dita:mapPath
vgj1683671089901.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
vgj1683671089901

AWS Lambda Function

# encrypt_caesar_cipher.py
def enhanced_caesar_cipher(text, shift):
    """
    Encrypts a string using an improved Caesar cipher.

    Args:
        text: The string to be encrypted.
        shift: The number of positions to shift the characters.

    Returns:
        The encrypted string.
    """
    result = ""
    for char in text:
        # Encrypt all printable ASCII characters (32 to 126)
        if 32 <= ord(char) <= 126:
            result += chr((ord(char) - 32 + shift) % 95 + 32)
        else:
            # Keep non-printable characters as is
            result += char

    return result
def lambda_handler(event, context):
    """
    AWS Lambda handler function for encrypting multiple strings in batch.
    """
    encrypted=[]

    # iterate over each row in the batch
    for row in event["data"]:
        index = row[0] # the virtual index
        value = str(row[1]) # The first column in the ON clause

        # the result row contains the virtual index and the encrypted result
        encrypted.append([index, enhanced_caesar_cipher(value, 3)])

    return {'data': encrypted}

UDCF Execution using seller_name as first input argument

SELECT count(*) FROM CaesarEncrypt(
    ON (SELECT seller_name FROM INDB_freddiemac_rows.loan_info_1M)
) AS EN_DATA;
 Count(*)        1000000

The CaesarEncrypt table operator UDCF takes seller_name as the first input argument from the input table loan_info_1M. The lambda_handler function for the table operator also follows the Data Format but there can be multiple rows and multiple fields. The first element of the row is a virtual index value. In this example, the seller_name column is the second element of the row. There is no OUTPUT clause in the query, so the result set schema is the same as the input (a single character column of size 20). An equivalent query with an explicit OUTPUT clause looks like:

SELECT count(encrypted_seller_name) FROM CaesarEncrypt(
    ON (SELECT seller_name FROM INDB_freddiemac_rows.loan_info_1M)
    USING
    OUTPUT('[{"encrypted_seller_name":"char(20)"}]')
) AS EN_DATA;
 Count(*)        1000000

The columns in the ON clause get sent to the lambda_handler’s event["data"] in position order.

UDCF Execution using Encryption

In this example, multiple columns are encrypted and the result is joined back to the table using the unique id column.

First, modify the table operator so that it passes the ID field first and then the columns to be encrypted. This lets you pass the ID column back in the result and use it to join to the original table:

SELECT top 10 * FROM CaesarEncrypt(
    ON (SELECT cast(id as CHAR(6)), seller_name, loan_sequence_number
        FROM INDB_freddiemac_rows.loan_info_1M)
    USING
    OUTPUT('[{"id":"char(6)"}, {"encrypted_seller_name":"char(20)"}, {"encrypted_loan_sequence_number":"varchar(18)"}]')
) AS EN_DATA;

Add ID (a BIGINT) and loan_sequence_number to the ON clause input to the table operator. The ID is cast as a char(6) since the table operator currently does not support BIGINT types. The OUTPUT clause also has the return schema of the table operator.

Pass back the ID unmodified, the encrypted seller_name and then the encrypted loan_sequence_number.

Next, modify the AWS Lambda function to handle the new input and output constraints:

# encrypt_caesar_cipher.py
def enhanced_caesar_cipher(text, shift):
    """

    Encrypts a string using an improved Caesar cipher.

    Args:
        text: The string to be encrypted.
        shift: The number of positions to shift the characters.

    Returns:
        The encrypted string.
    """
    result = ""
    for char in text:
        # Encrypt all printable ASCII characters (32 to 126)
        if 32 <= ord(char) <= 126:
            result += chr((ord(char) - 32 + shift) % 95 + 32)
        else:
            # Keep non-printable characters as is
            result += char
    return result

def lambda_handler(event, context):
    """
    AWS Lambda handler function for encrypting multiple strings in batch.
    """
    encrypted=[]

    # iterate over each row in the batch
    for row in event["data"]:
        index = row[0]  # the virtual index
        id_col = row[1] # the id

        # build the current output row according to Data Format and OUTPUT clause
        output_row = [index, id_col] 

        # encrypt the rest of the columns in the ON clause
        for col in row[2:]:
            output_row.append(enhanced_caesar_cipher(str(col), 3))

        # the result row contains the virtual index and the encrypted result
        encrypted.append(output_row)

    return {'data': encrypted}

Finally, JOIN the result back to the original table using the unique ID column:

SELECT top 10 t.id, t.seller_name, et.encrypted_seller_name,  t.servicer_name, et.encrypted_servicer_name
FROM SELECT * FROM CaesarEncrypt(
    ON (SELECT cast(id as CHAR(6)), seller_name, servicer_name
        FROM INDB_freddiemac_rows.loan_info_1M)
    USING
    OUTPUT('[{"id":"char(6)"}, {"encrypted_seller_name":"char(20)"}, {"encrypted_servicer_name":"char(20)"}]')
) AS et
JOIN INDB_freddiemac_rows.loan_info_1M as t
ON et.id = t.id
ORDER BY 1;