Use the test_script() method to run a script locally, outside Vantage. Input data for the user script is read from a CSV file or from a table in the database.
- If called with an input data file, test_script() simply reads the data from the file and provides it as input to the user script.
- There is no data partitioning when input is read from a file.
- test_script() may produce different output when input is read from a file than when it is read from the database.
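Conceptually, file-based testing amounts to piping the contents of the file into the script's standard input in a single pass, with no partitioning. The following is a minimal sketch of that idea using only the Python standard library. It is not how teradataml implements test_script(), and the helper name run_script_on_file is hypothetical.

```python
import subprocess
import sys


def run_script_on_file(script_path, data_path, script_args=()):
    """Feed the contents of data_path to script_path on stdin and return its stdout.

    Hypothetical helper for illustration only; it does not reproduce
    the internals of test_script().
    """
    with open(data_path, "rb") as data:
        result = subprocess.run(
            [sys.executable, script_path, *script_args],
            stdin=data,            # the whole file goes to one process, unpartitioned
            capture_output=True,
            check=True,            # raise CalledProcessError if the script fails
        )
    return result.stdout.decode()
```

Because the whole file goes to one process, a script whose result depends on how rows are grouped can behave differently here than it does on database input.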
- input_data_file: Specifies the name of the input data file. The path must be relative to the location specified in files_local_path.
If set to None, data is read from the AMPs; otherwise, data is read from the file passed in input_data_file.
The file must have permissions of at least mode 644.
- supporting_files: Specifies a file or list of supporting files, such as model files, to be copied to the local system.
- script_args: Specifies command line arguments required by the user script.
- exec_mode: Specifies the mode in which the user wants to test the script. When set to 'local', the script runs locally on the user's system. When 'local' execution mode is used, Teradata recommends having the same Python packages and versions installed on the local machine as those installed on the Vantage cluster. Python packages installed on Vantage and the corresponding version information can be found using the db_python_package_details() function.
Permitted value is 'local'.
Default value is 'local'.
- **kwargs: Specifies the keyword arguments required for testing. Possible keys:
- data_row_limit: Specifies the number of rows to be taken from all AMPs when reading from Vantage.
It is ignored when data is read from file.
- password: Specifies the password to connect to Vantage where the data resides.
It is required when reading from database.
- data_file_delimiter: Specifies the delimiter used in the input data file.
It can be specified when data is read from file.
- data_file_header: Specifies whether the input data file contains a header.
It can be specified when data is read from file.
- data_file_quote_char: Specifies the quote character used in the input data file.
It can be specified when data is read from file.
- logmech: Specifies the type of logon mechanism to establish a connection to Vantage. Permitted values include:
- TD2: The Teradata 2 (TD2) mechanism provides authentication using a Vantage username and password.
This is the default logon mechanism used to establish the connection to Vantage.
- TDNEGO: This is a security mechanism that automatically determines the actual mechanism required, based on policy, without the user's involvement. The actual mechanism is determined by the TDGSS server configuration and by the security policy's mechanism restrictions.
- LDAP: This is a directory-based user logon to Vantage with a directory username and password and is authenticated by the directory.
- KRB5 (Kerberos): This is a directory-based user logon to Vantage with a domain username and password and is authenticated by Kerberos (KRB5 mechanism). The user must have a valid ticket-granting ticket in order to use this logon mechanism.
- JWT: The JSON Web Token (JWT) authentication mechanism enables single sign-on (SSO) to Vantage after the user successfully authenticates to Teradata UDA User Service. The user must use the logdata parameter when using 'JWT' as the logon mechanism.
teradataml expects that client environments are already set up with the appropriate security mechanisms and are in working condition. See Teradata Vantage™ - Analytics Database Security Administration, B035-1100 for more information.
- logdata: Specifies parameters to the LOGMECH command beyond those needed by the logon mechanism, such as user ID, password and tokens (in case of JWT) to successfully authenticate the user. When data is read from file, these arguments are ignored even if they are passed.
For details of all arguments, see Teradata Package for Python Function Reference.
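Before testing with an input data file, it can help to confirm that the file meets the mode 644 requirement and parses as expected with the chosen delimiter, quote character, and header setting. The sketch below uses only the Python standard library. The helper names has_at_least_mode and preview_rows are hypothetical, and it is an assumption that the data_file_* options behave like the delimiter and quotechar settings of the standard csv module; how teradataml parses the file internally is not described here.

```python
import csv
import itertools
import os
import stat


def has_at_least_mode(path, required=0o644):
    """Return True if every permission bit in `required` is set on `path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & required == required


def preview_rows(path, delimiter=",", quotechar='"', header=True, limit=5):
    """Return up to `limit` data rows, parsed with the given CSV settings.

    Assumption: the settings mirror data_file_delimiter,
    data_file_quote_char and data_file_header.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter, quotechar=quotechar)
        if header:
            next(reader, None)  # skip the header row if the file has one
        return list(itertools.islice(reader, limit))
```

A file with mode 755 passes the check because it includes every bit of 644, while a file with mode 600 does not.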
Example 1: Test script in local mode with input from table
In this example, the script "mapper.py" reads in a line of text input ("Old Macdonald Had A Farm") from csv and splits the line into individual words, emitting a new row for each word.
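The mapper.py shipped with teradataml is not reproduced here. As a rough illustration of the mapping step described above, the following sketch splits each input line on whitespace and emits one row per word with a count of 1. The function name map_words and the comma output delimiter are assumptions made for this example.

```python
def map_words(lines, delimiter=","):
    """Return one output row per whitespace-separated word, each with a count of 1.

    Illustrative only; this is not the mapper.py used in the example.
    """
    rows = []
    for line in lines:
        for word in line.split():
            rows.append(f"{word}{delimiter}1")
    return rows
```

In a script, the rows would typically be built from sys.stdin and printed one per line, so that each printed line becomes one output row.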
- Load example data.
>>> load_example_data("Script", ["barrier"])
- Import required packages.
>>> from collections import OrderedDict
>>> from teradatasqlalchemy import (VARCHAR)
- Create teradataml DataFrame objects.
>>> barrierdf = DataFrame.from_table("barrier")
- Create a Script object that allows the user to execute the script on Vantage.
>>> sto = Script(data=barrierdf,
...              script_name='mapper.py',
...              files_local_path='data/scripts',
...              script_command='python3 ./<database name>/mapper.py',
...              data_order_column="Id",
...              is_local_order=False,
...              delimiter=',',
...              nulls_first=False,
...              sort_ascending=False,
...              charset='latin',
...              returns=OrderedDict([("word", VARCHAR(15)), ("count_input", VARCHAR(2))]))
- Run user script in local mode with input from table.
>>> sto.test_script(data_row_limit=300, password='<password>', exec_mode='local')
############ STDOUT Output ############
        word count_input
0          1           1
1        Old           1
2  Macdonald           1
3        Had           1
4          A           1
5       Farm           1