The teradataml Script function is an interface to the SCRIPT table operator (STO) object in the Analytics Database. See the SCRIPT section of Table Operators in Teradata Vantage™ - SQL Operators and User-Defined Functions, B035-1210 for details about SCRIPT table operator.
You can use Script function to execute natively in Vantage a user-installed script against data from an Analytics Database table via the SCRIPT table operator. To do this, the Teradata In-nodes Interpreter and Add-ons language packages must be installed in advance in the target Analytics Database nodes.
- script_command: specifies the command or script to run,.
- script_name: specifies the name of the user script.
- files_local_path specifies the absolute local path where user script and all supporting files like model files, input data file reside.
- returns: specifies output column definition, which is a dictionary specifying column-name to teradatasqlalchemy-type mapping.Users can pass a dictionary (dict or OrderedDict) to the returns argument, with the keys ordered to represent the order of the output columns. The preferred type is OrderedDict.
- data: specifies a teradataml DataFrame containing the input data for the script.
- data_hash_column: specifies the column to be used for hashing.
The rows in the data will be redistributed to AMPs based on the hash value of the column specified. The user-installed script file then runs once on each AMP. If there is no data_partition_column, then the entire result set, delivered by the function, constitutes a single group or partition.
data_hash_column can not be specified along with data_partition_column, is_local_order and data_order_column. - data_partition_column: specifies Partition By columns for data.
- data_partition_column can not be specified along with data_hash_column.
- data_partition_column can not be specified along with "is_local_order = True".
- is_local_order: specifies a boolean value to determine whether the input data is to be ordered locally or not.
Default value is False. When set to True, data is ordered locally.
- This argument is ignored, if data_order_column is None.
- is_local_order can not be specified along with data_hash_column.
- When is_local_order is set to 'True', specify data_order_column, and the columns specified in data_order_column are used for local ordering.
- data_order_column: specifies the Order By column for data, and can be used no matter is_local_order is set to 'True' or 'False'.data_order_column can not be specified along with data_hash_column.
- sort_ascending: specifies a boolean value to determine if the result set is to be sorted on the data_order_column column in ascending or descending order.
When set to the default value 'True', the sorting is ascending. When set to 'False', the sortin is descending.
This argument is ignored, if data_order_column is None. - nulls_first: specifies a boolean value to determine whether NULLS are listed first or last during ordering.
NULLS are listed first when this argument is set to 'True', and last when set to 'False'.
This argument is ignored, if data_order_column is None. - delimiter: specifies a delimiter to use when reading columns from a row and writing result columns.
The default value is '\t' (tab).
- This argument cannot be the same as quotechar argument.
- This argument cannot be a newline character, which is '\\n'.
- auth: specifies an authorization to use when running the script.
- charset: specifies the character encoding for data, could be either 'utf16' or 'latin'.
- quotechar: specifies a character that forces all input and output of the script to be quoted using this specified character.
Using this argument enables the Analytics Database to distinguish between NULL fields and empty strings. A string with length zero is quoted, while NULL fields are not.
If this character is found in the data, it will be escaped by a second quote character.
- This argument cannot be the same as delimiter argument.
- This argument cannot be a newline character, which is '\\n'.
For details of all arguments, see Teradata Package for Python Function Reference.
Required permissions
- Teradata user needs permission to execute Script Table Operator and the default authentication.
GRANT EXECUTE FUNCTION ON td_sysfnlib.script TO TDAPUSER;
GRANT EXECUTE ON SYSUIF.DEFAULT_AUTH to TDAPUSER;
- Additional permissions required to run Script on the database.
GRANT SELECT ON TDAPUSERDB TO TDAPUSER;
- Additional permissions required to install, remove and replace files.
GRANT EXECUTE PROCEDURE ON SYSUIF.INSTALL_FILE TO TDAPUSER;
GRANT EXECUTE PROCEDURE ON SYSUIF.REPLACE_FILE TO TFDAPUSER;
GRANT EXECUTE PROCEDURE ON SYSUIF.REMOVE_FILE TO TDAPUSER;
- You must set SEARCHUIFDBPATH prior to accessing any installed scripts in the target database, so that the Linux system user 'tdatuser' can find the scripts and run them through the STO. If the scripts reside in 'TDAPUSERDB', then set search path in corresponding Python script or Python console as follows:
get_connection().execute("SET SESSION SEARCHUIFDBPATH = TDAPUSERDB;")
- If the user ('me' in the example here) is using another database ('TDAPUSERDB' in the example here) as default database, the user must have the execute permissions on the Script function to the default database.
Here is the quick example.
- Create the required users and databases as database administrator.
-- Create a user 'sysdba' with default database as 'sysdba'. CREATE USER sysdba FROM DBC AS PERM = 14e9 PASSWORD = sysdba DEFAULT DATABASE = sysdba NO BEFORE JOURNAL NO AFTER JOURNAL; -- Create a new database from the newly created user 'sysdba'. CREATE DATABASE TDAPUSERDB FROM sysdba AS PERM=1e9 NO BEFORE JOURNAL NO AFTER JOURNAL; -- Create a user with default database as newly created database 'TDAPUSERDB'. CREATE USER me FROM sysdba AS PERM = 100e6 PASSWORD = me DEFAULT DATABASE = TDAPUSERDB NO BEFORE JOURNAL NO AFTER JOURNAL;
- Make sure the file 'mapper.py' is installed in the database 'TDAPUSERDB'. Refer to the Database Utility section for how to use install_file() to install file in Vantage. Ignore if the file is already present in Vantage.
- The database administrator must provide the following permissions to the user 'me', if not already provided.
-- Permissions to the user 'me'. GRANT EXECUTE FUNCTION ON td_sysfnlib.script TO me; GRANT EXECUTE ON SYSUIF.DEFAULT_AUTH to me; -- Additional permissions on the database 'TDAPUSERDB'. GRANT CREATE TABLE ON TDAPUSERDB TO me; -- Needed if the table and required permissions are not present. GRANT CREATE VIEW ON TDAPUSERDB TO me; -- Needed if the permissions for view creation is not present. GRANT SELECT ON TDAPUSERDB TO me;
- Set the search path and run the Script function as the user 'me'.
-- Set the search path. SET SESSION SEARCHUIFDBPATH = TDAPUSERDB; -- Create view on top of Script function. -- Note that view creation is successful. create view vw3 as SELECT * FROM Script( ON "TDAPUSERDB"."usecase5" AS "input" PARTITION BY ANY SCRIPT_COMMAND('Rscript ./TDAPUSERDB/mapper.R') delimiter(',') returns('oc1 VARCHAR(10), oc2 VARCHAR(10)') ) as sqlmr; *** View has been created. *** Total elapsed time was 1 second. -- Seeing the contents of the view is not possible as the user 'me' don't have execute permissions on TD_SYSFNLIB.SCRIPT and default auth i.e., SYSUIF.DEFAULT_AUTH. select * from vw3; *** Failure 3523 An owner referenced by user does not have EXECUTE FUNCTION WITH GRANT OPTION access to TD_SYSFNLIB.SCRIPT.
- The user 'me' has to be given the following permissions, for the SELECT on view 'vw3' to work in step 4.
GRANT EXECUTE FUNCTION ON td_sysfnlib.script TO TDAPUSERDB WITH GRANT OPTION;
GRANT EXECUTE ON SYSUIF.DEFAULT_AUTH TO TDAPUSERDB WITH GRANT OPTION;