The sandbox_container_id configuration property identifies the Docker container used by the test_script() method of the Script class. It is set automatically to the id of the container started by setup_sandbox_env().
In sandbox mode, the user script runs in the container identified by this property. At the end of a session, the Garbage Collector cleans up the container and resets this property to None.
- By default, it is set to None, which means a sandbox container is not started from within teradataml.
- When a sandbox container is started from within teradataml, this property is set to the id of that sandbox container.
- To use any other container, set this property manually to that container id using configure.sandbox_container_id = <container id>. In this case, teradataml is not responsible for container cleanup at the end of a session.
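The three cases above can be summarized with a toy model. This is only an illustrative sketch: SandboxConfig and ToySession are hypothetical stand-ins written for this example, not teradataml classes, and teradataml's internal bookkeeping may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SandboxConfig:
    """Hypothetical stand-in for the configure object (not teradataml code)."""
    sandbox_container_id: Optional[str] = None  # default: no container started


class ToySession:
    """Hypothetical model of which containers a session cleans up."""

    def __init__(self) -> None:
        self.config = SandboxConfig()
        self._started_here: Set[str] = set()  # containers this session owns

    def start_container(self, container_id: str) -> None:
        # Models setup_sandbox_env(): the property is set automatically.
        self._started_here.add(container_id)
        self.config.sandbox_container_id = container_id

    def use_external(self, container_id: str) -> None:
        # Models a manual assignment: the session takes no ownership.
        self.config.sandbox_container_id = container_id

    def end(self) -> List[str]:
        # Only containers started by the session are removed,
        # and the property goes back to None.
        removed = sorted(self._started_here)
        self._started_here.clear()
        self.config.sandbox_container_id = None
        return removed
```

In this model, a container assigned through use_external() is never returned by end(), which mirrors the rule that cleanup of a manually assigned container is left to the user.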
Example
- Import the required modules.
>>> from teradataml import configure, DataFrame, load_example_data, remove_context
>>> from teradataml.table_operators.sandbox_container_util import setup_sandbox_env
>>> from teradataml.table_operators.Script import Script
>>> from collections import OrderedDict
>>> from teradatasqlalchemy import (VARCHAR)
- Check the configuration property value for teradataml.
>>> print("configure.sandbox_container_id is set to: {}".format(configure.sandbox_container_id))
configure.sandbox_container_id is set to: None
- Load sandbox image and start a container.
>>> setup_sandbox_env(sandbox_image_location="/tmp/sto_sandbox_image.tar.gz",
...                   sandbox_image_name="stosandbox:1.0")
Loading image from /tmp/sto_sandbox_image.tar.gz. It may take few minutes.
Image loaded successfully.
Container abcd8211241e801ecb584e9382f2e581312f805fddcf8507f9c8731dfbffae38 started successfully.
- Check the configuration property value for teradataml, after a container is started.
>>> print("configure.sandbox_container_id is set to: {}".format(configure.sandbox_container_id))
configure.sandbox_container_id is set to: abcd8211241e801ecb584e9382f2e581312f805fddcf8507f9c8731dfbffae38
- Load example data.
>>> load_example_data("Script", ["barrier"])
- Create teradataml DataFrame objects.
>>> barrierdf = DataFrame.from_table("barrier")
- Create a Script object to run a script on Vantage.
The script "mapper.py" reads in a line of text input ("Old Macdonald Had A Farm") from csv and splits the line into individual words, emitting a new row for each word.
>>> sto = Script(data=barrierdf,
...              script_name='mapper.py',
...              files_local_path='data/scripts',
...              script_command='python3 ./alice/mapper.py',
...              data_order_column="Id",
...              is_local_order=False,
...              delimiter=',',
...              nulls_first=False,
...              sort_ascending=False,
...              charset='latin',
...              returns=OrderedDict([("word", VARCHAR(15)), ("count_input", VARCHAR(2))]))
- Run test_script() on the container started by setup_sandbox_env().
>>> sto.test_script(input_data_file='../barrier.csv')
############ STDOUT Output ############
        word  count_input
0  Macdonald            1
1          A            1
2       Farm            1
3        Had            1
4        Old            1
5          1            1
If the user has started a container outside teradataml with container id '3a938ac820be' and wants to run the script on that container:
- Set configure.sandbox_container_id to the new container id.
>>> configure.sandbox_container_id = '3a938ac820be'
- Check the configuration property value for teradataml.
>>> print("configure.sandbox_container_id is set to: {}".format(configure.sandbox_container_id))
configure.sandbox_container_id is set to: 3a938ac820be
- Run test_script() on the container with id '3a938ac820be'.
>>> sto.test_script(input_data_file='../barrier.csv')
############ STDOUT Output ############
        word  count_input
0  Macdonald            1
1          A            1
2       Farm            1
3        Had            1
4        Old            1
5          1            1
- Exit the session.
>>> remove_context()
This cleans up the container started by setup_sandbox_env().
- Check the configuration property value for teradataml again, after the container is cleaned up.
>>> print("configure.sandbox_container_id is set to: {}".format(configure.sandbox_container_id))
configure.sandbox_container_id is set to: None
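Before pointing configure.sandbox_container_id at a container started outside teradataml, it can help to confirm that the container is actually running. The sketch below is one possible check that calls the docker CLI directly; is_container_running and _run_docker are hypothetical helper names for this example, not part of teradataml, and the sketch assumes docker is installed and on PATH.

```python
import subprocess
from typing import Callable, Sequence


def _run_docker(args: Sequence[str]) -> str:
    """Run the docker CLI and return its stdout; raises if the command fails."""
    result = subprocess.run(["docker", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout


def is_container_running(container_id: str,
                         run: Callable[[Sequence[str]], str] = _run_docker) -> bool:
    """Return True if `docker inspect` reports the container as running.

    Hypothetical helper for illustration; `run` can be swapped out in tests.
    """
    try:
        out = run(["inspect", "--format", "{{.State.Running}}", container_id])
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Unknown container id, or docker is not installed / not on PATH.
        return False
    return out.strip() == "true"
```

A possible usage is to assign configure.sandbox_container_id = '3a938ac820be' only when is_container_running('3a938ac820be') returns True. Cleanup of such a container remains the user's responsibility.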