test_script | Script Method | Teradata Python Package - test_script - Teradata Package for Python

Teradata® Package for Python User Guide

Product
Teradata Package for Python
Release Number
17.00
Published
November 2021
Language
English (United States)
Last Update
2022-01-14
dita:id
B700-4006
Product Category
Teradata Vantage

Use the test_script method to run a script in a Docker container environment outside Vantage. Input data for the user script is read from a csv file or from a table.

Required arguments:
  • input_data_file: Specifies the absolute local path of the input data file.

    If set to None, data is read from AMP; otherwise, data is read from the file passed in the argument input_data_file.

Optional arguments:
  • supporting_files: Specifies a file or a list of supporting files, such as model files, to be copied to the container.
  • script_args: Specifies the command line arguments required by the user script.
  • exec_mode: Specifies the mode in which the user wants to test the script.

    If set to 'sandbox', which is the default value, the user script runs within the sandbox environment.

    If set to 'local', the user script runs locally on the user's system.
    When 'local' execution mode is used, Teradata recommends having the same Python packages and versions installed on the local machine as those installed on the Vantage cluster. The Python packages installed on Vantage, and their versions, can be found using the db_python_package_details() function.
  • **kwargs: Specifies the keyword arguments required for testing.
    Possible keys:
    • data_row_limit: Specifies the number of rows to be taken from all AMPs when reading from Vantage.

      It is ignored when data is read from file.

    • password: Specifies the password to connect to Vantage where the data resides.

      It is required when reading from database.

    • data_file_delimiter: Specifies the delimiter used in the input data file.

      It can be specified when data is read from file.

    • data_file_header: Specifies whether the input data file contains a header row.

      It can be specified when data is read from file.

    • timeout: Specifies the timeout for Docker API calls when running in sandbox mode.
    • data_file_quote_char: Specifies the quote character used in the input data file.

      It can be specified when data is read from file.

    • logmech: Specifies the type of logon mechanism used to establish a connection to Vantage. It can be specified only when data is read from AMP and the execution mode is 'sandbox'.
      Permitted values include:
      • TD2: The Teradata 2 (TD2) mechanism provides authentication using a Vantage username and password. This is the default logon mechanism for establishing the connection to Vantage.
      • TDNEGO: A security mechanism that automatically determines the actual mechanism required, based on policy, without the user's involvement. The actual mechanism is determined by the TDGSS server configuration and by the security policy's mechanism restrictions.
      • LDAP: A directory-based user logon to Vantage with a directory username and password, authenticated by the directory.
      • KRB5 (Kerberos): A directory-based user logon to Vantage with a domain username and password, authenticated by Kerberos (KRB5 mechanism).
        The user must have a valid ticket-granting ticket to use this logon mechanism.
      • JWT: The JSON Web Token (JWT) authentication mechanism enables single sign-on (SSO) to Vantage after the user successfully authenticates to Teradata UDA User Service.
        The user must specify the logdata argument when using 'JWT' as the logon mechanism.
      teradataml expects the client environment to be already set up with the appropriate security mechanisms and to be in working condition. See Teradata Vantage™ - Advanced SQL Engine Security Administration, B035-1100 for more information.
    • logdata: Specifies parameters to the LOGMECH command beyond those needed by the logon mechanism, such as user ID, password, and tokens (in the case of JWT), to successfully authenticate the user.
      When data is read from file, logmech and logdata are ignored even if they are passed.
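
The keyword arguments above fall into two groups depending on where the input data comes from. The following is a minimal sketch of that grouping, not part of teradataml: the helper name applicable_kwargs and the two key sets are hypothetical, and dropping the data_file_* keys when reading from the database is an assumption (the list above only says they can be specified when data is read from file).

```python
# Hypothetical grouping of the documented keys by data source.
FILE_ONLY_KEYS = {"data_file_delimiter", "data_file_header", "data_file_quote_char"}
DATABASE_ONLY_KEYS = {"data_row_limit", "logmech", "logdata"}


def applicable_kwargs(input_data_file, **kwargs):
    """Return only the keyword arguments that apply to the chosen data source.

    input_data_file=None means the data is read from the database.
    """
    reading_from_file = input_data_file is not None
    # Keys for the other data source are dropped from the result.
    not_applicable = DATABASE_ONLY_KEYS if reading_from_file else FILE_ONLY_KEYS
    kept = {key: value for key, value in kwargs.items() if key not in not_applicable}
    # The password is documented as required when reading from the database.
    if not reading_from_file and "password" not in kept:
        raise ValueError("password is required when reading from the database")
    return kept


# Reading from a file: data_row_limit and logmech are dropped.
print(applicable_kwargs("/tmp/barrier.csv", data_file_delimiter=",",
                        data_row_limit=300, logmech="TD2"))
```

A helper like this only makes the documented rules explicit before the call; test_script itself remains the authority on which keys it accepts.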

For details of all arguments, see Teradata Package for Python Function Reference.
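
The recommendation above to keep local Python packages aligned with those on Vantage when using 'local' execution mode can be checked with the standard library. The sketch below is one possible approach, not a teradataml feature: the function find_version_mismatches is hypothetical, and the vantage_versions dictionary holds placeholder values that would have to be copied by hand from the output of db_python_package_details().

```python
from importlib import metadata


def find_version_mismatches(vantage_versions):
    """Compare locally installed package versions with the expected ones.

    Returns {package: (expected_version, local_version_or_None)} for every
    package whose local version differs or that is not installed locally.
    """
    mismatches = {}
    for package, expected in vantage_versions.items():
        try:
            local = metadata.version(package)
        except metadata.PackageNotFoundError:
            local = None  # package is not installed on the local machine
        if local != expected:
            mismatches[package] = (expected, local)
    return mismatches


# Placeholder versions, assumed to be copied from db_python_package_details().
vantage_versions = {"numpy": "1.19.5", "pandas": "1.1.5"}
for package, (expected, local) in find_version_mismatches(vantage_versions).items():
    print(f"{package}: Vantage has {expected}, local has {local or 'not installed'}")
```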

Example 1: Test script in sandbox environment and run script on Vantage

This example shows a workflow in which the user creates a Script object, tests the script in the sandbox environment, installs the Python script on Vantage, runs the script on Vantage, and then removes the script from Vantage.

The script "mapper.py" reads in a line of text input ("Old Macdonald Had A Farm") from csv and splits the line into individual words, emitting a new row for each word.
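
The contents of "mapper.py" are not shown in this topic. The following is a minimal sketch of a script with the behavior described above, assuming that each input row arrives as one comma-delimited line and that each output row is a word followed by the count 1. The function names split_into_words and emit_rows are illustrative only.

```python
def split_into_words(line, delimiter=","):
    """Split one input row into words, treating the delimiter as whitespace."""
    return line.strip().replace(delimiter, " ").split()


def emit_rows(lines, delimiter=","):
    """Yield one 'word<delimiter>1' output row for each word in the input lines."""
    for line in lines:
        for word in split_into_words(line, delimiter):
            yield f"{word}{delimiter}1"


# In an actual script, pass sys.stdin instead of this in-memory list and
# print each row to standard output.
rows = list(emit_rows(["1,Old Macdonald Had A Farm\n"]))
print(rows)
```

Because the Id value is part of the input row in this sketch, it is emitted as a word as well, which is consistent with the "1" row in the sample output later in this example.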

Running the user script locally within a Docker container, with data from a csv file, helps the user fix script-level issues outside Vantage.

  1. Load example data.
    >>> load_example_data("Script", ["barrier"])
  2. Import required packages.
    >>> from collections import OrderedDict
    >>> from teradatasqlalchemy import (VARCHAR)
  3. Create teradataml DataFrame.
    >>> barrierdf = DataFrame.from_table("barrier")
  4. Create a Script object.
    >>> sto = Script(data=barrierdf,
    ...              script_name='mapper.py',
    ...              files_local_path='data/scripts',
    ...              script_command='python3 ./<database name>/mapper.py',
    ...              data_order_column="Id",
    ...              is_local_order=False,
    ...              delimiter=',',
    ...              nulls_first=False,
    ...              sort_ascending=False,
    ...              charset='latin',
    ...              returns=OrderedDict([("word", VARCHAR(15)), ("count_input", VARCHAR(2))]))
  5. Set up the sandbox environment by providing the local path to the Docker image file.
    >>> sto.setup_sto_env(docker_image_location='/tmp/sto_sandbox_docker_image.tar')
    Loading image from /tmp/sto_sandbox_docker_image.tar. It may take few minutes.
    Image loaded successfully.
    Starting a container for stosandbox:1.0 image.
    Container c1dd4d4b722cc54b643ab2bdc57540a3a3e6db98c299defc672227de97d2c345 started successfully.
  6. Run user script in sandbox mode with input from data file.
    >>> sto.test_script(input_data_file='../barrier.csv',
    ...                 data_file_delimiter=',',
    ...                 data_file_quote_char='"',
    ...                 data_file_header=True,
    ...                 exec_mode='sandbox')
    ############ STDOUT Output ############
            word  count_input
    0          1            1
    1        Old            1
    2  Macdonald            1
    3        Had            1
    4          A            1
    5       Farm            1
    Script results look good.
  7. Install the user script file on Vantage.
    >>> sto.install_file(file_identifier='mapper', file_name='mapper.py', is_binary=False)
  8. Set the search path to the database where the file is installed.
    >>> get_context().execute("SET SESSION SEARCHUIFDBPATH = <database name>;")
  9. Run the user script on Vantage.
    >>> sto.execute_script()
    ############ STDOUT Output ############
     
            word count_input
    0  Macdonald           1
    1          A           1
    2       Farm           1
    3        Had           1
    4        Old           1
    5          1           1
  10. Remove the installed file from Vantage.
    >>> sto.remove_file(file_identifier='mapper', force_remove=True)
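
Step 6 above passes data_file_delimiter, data_file_quote_char, and data_file_header to describe the layout of the input file. Before running test_script, it can help to confirm that the file parses as expected with those same settings. The sketch below uses only the Python csv module; the helper read_data_file is hypothetical, and the file contents written here are an assumed stand-in for the real barrier.csv.

```python
import csv
import os
import tempfile


def read_data_file(path, delimiter=",", quote_char='"', header=True):
    """Parse a delimited file and return (column_names_or_None, data_rows)."""
    with open(path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter=delimiter, quotechar=quote_char))
    if header and rows:
        return rows[0], rows[1:]
    return None, rows


# Assumed sample contents: a header row followed by one quoted data row.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "barrier.csv")
    with open(path, "w", newline="") as handle:
        handle.write('"Id","Name"\n"1","Old Macdonald Had A Farm"\n')
    columns, rows = read_data_file(path, delimiter=",", quote_char='"', header=True)

print(columns)
print(rows)
```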

Example 2: Test script in local mode with input from table

In this example, the script "mapper.py" reads in a line of text input ("Old Macdonald Had A Farm") and splits the line into individual words, emitting a new row for each word. The input is read from the table rather than from a csv file.

  1. Load example data.
    >>> load_example_data("Script", ["barrier"])
    
  2. Import required packages.
    >>> from collections import OrderedDict
    >>> from teradatasqlalchemy import (VARCHAR)
  3. Create teradataml DataFrame objects.
    >>> barrierdf = DataFrame.from_table("barrier")
  4. Create a Script object that allows the user to execute the script on Vantage.
    >>> sto = Script(data=barrierdf,
    ...              script_name='mapper.py',
    ...              files_local_path='data/scripts',
    ...              script_command='python3 ./<database name>/mapper.py',
    ...              data_order_column="Id",
    ...              is_local_order=False,
    ...              delimiter=',',
    ...              nulls_first=False,
    ...              sort_ascending=False,
    ...              charset='latin',
    ...              returns=OrderedDict([("word", VARCHAR(15)), ("count_input", VARCHAR(2))]))
    
  5. Run user script in local mode with input from table.
    >>> sto.test_script(data_row_limit=300, password='<password>', exec_mode='local')
    ############ STDOUT Output ############
            word  count_input
    0          1            1
    1        Old            1
    2  Macdonald            1
    3        Had            1
    4          A            1
    5       Farm            1

Example 3: Test script in sandbox mode with different logon mechanisms

In this example, the script "mapper.py" reads in a line of text input ("Old Macdonald Had A Farm") and splits the line into individual words, emitting a new row for each word. The input is read from Vantage, so a logon mechanism can be specified.

  1. Load example data.
    >>> load_example_data("Script", ["barrier"])
    
  2. Import required packages.
    >>> from collections import OrderedDict
    >>> from teradatasqlalchemy import (VARCHAR)
  3. Create teradataml DataFrame objects.
    >>> barrierdf = DataFrame.from_table("barrier")
  4. Create a Script object that allows the user to execute the script on Vantage.
    >>> sto = Script(data=barrierdf,
    ...              script_name='mapper.py',
    ...              files_local_path='data/scripts',
    ...              script_command='python3 ./<database name>/mapper.py',
    ...              data_order_column="Id",
    ...              is_local_order=False,
    ...              delimiter=',',
    ...              nulls_first=False,
    ...              sort_ascending=False,
    ...              charset='latin',
    ...              returns=OrderedDict([("word", VARCHAR(15)), ("count_input", VARCHAR(2))]))
    
  5. Run the user script in sandbox mode with different logon mechanisms.
    • Run user script in sandbox mode with logmech as 'TD2'.
      >>> sto.test_script(script_args="4 5 10 6 480", password="<password>", logmech="TD2")
      
    • Run user script in sandbox mode with logmech as 'TDNEGO'.
      >>> sto.test_script(script_args="4 5 10 6 480", password="<password>", logmech="TDNEGO")
      
    • Run user script in sandbox mode with logmech as 'LDAP'.
      >>> sto.test_script(script_args="4 5 10 6 480", password="<password>", logmech="LDAP")
      
    • Run user script in sandbox mode with logmech as 'KRB5'.
      >>> sto.test_script(script_args="4 5 10 6 480", password="<password>", logmech="KRB5")
      
    • Run user script in sandbox mode with logmech as 'JWT'.
      >>> sto.test_script(script_args="4 5 10 6 480", password="<password>", logmech='JWT', logdata='token=eyJpc...h8dA')
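
The calls in Example 3 differ only in the logon arguments. The sketch below shows one way to assemble and check those arguments before passing them to test_script. It is illustrative only: the helper logon_kwargs is hypothetical, and treating the five mechanisms listed in this topic as the complete set of permitted values is an assumption.

```python
# Assumed to be the complete set; the topic lists these five mechanisms.
PERMITTED_LOGMECH = ("TD2", "TDNEGO", "LDAP", "KRB5", "JWT")


def logon_kwargs(password, logmech="TD2", logdata=None):
    """Build the logon-related keyword arguments for a sandbox test run."""
    if logmech not in PERMITTED_LOGMECH:
        raise ValueError(f"unsupported logmech: {logmech!r}")
    # The topic states that logdata must be used with the JWT mechanism.
    if logmech == "JWT" and not logdata:
        raise ValueError("logdata is required when logmech is 'JWT'")
    kwargs = {"password": password, "logmech": logmech}
    if logdata is not None:
        kwargs["logdata"] = logdata
    return kwargs


# The result could then be unpacked into the call, for example:
#   sto.test_script(script_args="4 5 10 6 480", **logon_kwargs("<password>", "LDAP"))
print(logon_kwargs("<password>", logmech="LDAP"))
```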