Running APPLY Table Operator

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

The APPLY table operator invokes all AMPs on the compute cluster to start processes that run the program specified by the APPLY_COMMAND clause. A PARTITION BY clause invokes a new process per partition, so your program only needs to handle a single partition. If one AMP handles multiple partitions, it starts a separate process for each partition and processes the partitions one at a time.
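To illustrate what "scoped to a partition" can mean in practice, the following is a minimal sketch of a script that summarizes the one partition it receives. It assumes rows arrive on stdin as CSV (as described later in this section), that the first column is the partition key, and that the query's RETURNS clause is changed to match the two output columns. The function name summarize_partition and the (key, row count) output layout are hypothetical and chosen only for illustration.

```python
import csv
import sys


def summarize_partition(instream, outstream):
    """Write one CSV row (key, row_count) for the single partition on instream."""
    key = None
    count = 0
    for row in csv.reader(instream):
        if not row:
            continue  # skip blank lines
        if key is None:
            key = row[0]  # assumption: the first column is the partition key
        count += 1
    # An empty partition produces no output row.
    if key is not None:
        csv.writer(outstream, lineterminator="\n").writerow([key, count])


if __name__ == "__main__":
    # Each process sees exactly one partition, so no grouping logic is needed.
    summarize_partition(sys.stdin, sys.stdout)
```

Because each process is started for exactly one partition, the script can treat all of its input as a single group and does not need to detect partition boundaries itself.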

You can run APPLY table operator queries only when a user environment exists. A user environment is a scoped space that you control and in which the APPLY table operator runs; you cannot perform operations on environments that belong to other users.

Before running the APPLY table operator, you need to establish a connection to a target VantageCloud Lake system with your credentials and an authentication token. You typically receive this information from your system administrator.

The following is an example of APPLY_COMMAND invoking the Python interpreter to run a user script in the 'myscript.py' file within the user environment called 'myenvironment':

SELECT * FROM TD_SYSFNLIB.APPLY
(
   ON intab
   RETURNS (col1 int, col2 varchar(20))
   USING
   STYLE('csv')
   APPLY_COMMAND('python3 myscript.py')
   ENVIRONMENT ('myenvironment')
) as dt;
The APPLY table operator runs as follows:
  1. By convention, the Analytics Database streams the ON clause input source data into the script through the script’s stdin. Correspondingly, script output is streamed back to the Analytics Database through the script’s stdout.
  2. The program reads CSV-formatted rows in UTF-8 from stdin. It writes results to stdout ("print" in Python), and they are converted behind the scenes into the RETURNS clause types.
  3. The text in the APPLY_COMMAND clause runs the user script inside the created user environment in the Open Analytics Framework.
  4. The user environment contains the libraries needed to access the Python interpreter with the APPLY table operator.
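
The stdin/stdout convention in the steps above can be sketched with a small script. This is an illustrative guess at what a file such as 'myscript.py' might contain, not the actual script from the example: it assumes each input row has at least two comma-separated fields, and it emits two fields per row to line up with RETURNS (col1 int, col2 varchar(20)). The helper names transform and run are hypothetical.

```python
import csv
import sys


def transform(row):
    """Hypothetical per-row logic: first field as int, second cut to 20 characters."""
    return int(row[0]), row[1][:20]


def run(instream, outstream):
    """Read CSV rows from instream and write transformed CSV rows to outstream."""
    reader = csv.reader(instream)
    writer = csv.writer(outstream, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        writer.writerow(transform(row))


if __name__ == "__main__":
    # Input rows arrive on stdin; output rows are written to stdout.
    run(sys.stdin, sys.stdout)
```

The script can be checked locally before it is installed in a user environment by piping a CSV file into it, for example `python3 myscript.py < sample.csv`, where sample.csv is any test file you supply.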