Introduction to VALIB functions
Vantage Analytics Library provides data scientists and other users with over 50 advanced
analytic functions built directly into the Advanced SQL Engine, a core capability of
Teradata Vantage. These functions support the entire data science process: exploratory
data analysis, data preparation and feature engineering, hypothesis testing, and
statistical and machine learning model building and scoring.
The following are the prerequisites for running VALIB functions through teradataml:
1. Install the Vantage Analytics Library in Teradata Vantage's Advanced SQL Engine. The library
and readme file are available here for download.
2. To execute the VALIB functions related to Statistical Tests, the Statistical Test
Metadata tables must be loaded into a database on the system to be analyzed. This can be done
with the Vantage Analytics Library installer. The Statistical Test functions provide a
parameter called "stats_database" that specifies the database in which these
tables are installed.
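As a rough sketch of how the "stats_database" parameter from prerequisite 2 might be passed, the
helper below wraps a single Statistical Test call. The helper name run_binomial_test and its
arguments are hypothetical. BinomialTest and its "first_column" and "second_column" arguments are
assumed here to match the Teradata Python Function Reference Guide, so check them there before
relying on this.

```python
def run_binomial_test(data, first_column, second_column, stats_database, valib=None):
    """Run a VALIB Statistical Test against metadata tables in `stats_database`.

    `data` is expected to be a teradataml DataFrame. The optional `valib`
    argument only exists so the helper can be exercised without a Vantage
    system; normally it is left as None.
    """
    if valib is None:
        # Imported lazily so this sketch can be loaded without teradataml.
        from teradataml import valib

    # Assumption: BinomialTest accepts these keyword arguments.
    # stats_database names the database holding the Statistical Test
    # Metadata tables loaded by the Vantage Analytics Library installer.
    return valib.BinomialTest(
        data=data,
        first_column=first_column,
        second_column=second_column,
        stats_database=stats_database,
    )
```

A call would look like run_binomial_test(df, "col_a", "col_b", "my_stats_db"), where the column
and database names are placeholders for values on the target system.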
Once setup is complete, Vantage Analytics Library functions can be used from teradataml.
To execute them:
1. Import the "valib" object from teradataml:
from teradataml import valib
2. Set 'configure.val_install_location' to the database name where Vantage Analytics Library
functions are installed. For example,
from teradataml import configure
configure.val_install_location = "SYSLIB"
# SYSLIB is the database name where Vantage Analytics Library functions are installed.
3. The datasets used in the examples of the teradataml VALIB functions are loaded by the
Vantage Analytics Library installer.
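The steps above can be combined into one short sketch. The function name run_values_example, its
parameters, and the choice of the Values function with its "result" output DataFrame are
illustrative assumptions. The connection details, table name, and column names must come from the
target system.

```python
def run_values_example(host, username, password, table_name, columns, val_db="SYSLIB"):
    """Connect to Vantage, configure the VAL install location, run one VALIB function."""
    # Imported lazily so this sketch can be loaded without teradataml installed.
    from teradataml import DataFrame, configure, create_context, remove_context, valib

    create_context(host=host, username=username, password=password)
    try:
        # Step 2: database where the Vantage Analytics Library functions are installed.
        configure.val_install_location = val_db

        df = DataFrame(table_name)

        # Assumption: Values is used as a representative VALIB function and
        # exposes its output as the "result" DataFrame.
        obj = valib.Values(data=df, columns=columns)

        # Pull the result to the client before the connection is closed.
        return obj.result.to_pandas()
    finally:
        remove_context()
```

Setting configure.val_install_location once per session is enough; later VALIB calls in the same
session use the same value.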
Properties of VALIB function output object:
1. All VALIB functions return an object of class <VALIB_function> (say valib_obj).
2. The following are the attributes of the VALIB function object:
a. The output teradataml DataFrames, which can be accessed as valib_obj.<output_df_name>.
The name(s) of the output DataFrame(s) for each function are listed in the Teradata Python
Function Reference Guide. The tables corresponding to the output DataFrames are garbage
collected when the connection is closed. To persist the output tables, use copy_to_sql()
or DataFrame.to_sql().
b. Input arguments that are passed to the function. Users can access all input arguments as
valib_obj.<input_argument_x>.
c. The show_query() function, which prints the underlying VALIB stored procedure call. It is
called as valib_obj.show_query().
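The attributes listed above can be exercised with a small helper such as the following. This is
a sketch only: the name inspect_valib_output and its arguments are hypothetical, and the output
DataFrame name must be taken from the function reference for the VALIB function that was run.

```python
def inspect_valib_output(valib_obj, output_df_name, persist_as=None):
    """Show the generated call, fetch one output DataFrame, optionally persist it."""
    # (c) Print the underlying VALIB stored procedure call.
    valib_obj.show_query()

    # (a) Output DataFrames are plain attributes of the returned object.
    out_df = getattr(valib_obj, output_df_name)

    if persist_as is not None:
        # Output tables are garbage collected when the connection closes,
        # so copy the DataFrame to a permanent table if it should survive.
        from teradataml import copy_to_sql
        copy_to_sql(df=out_df, table_name=persist_as, if_exists="replace")

    return out_df
```

For example, inspect_valib_output(obj, "result", persist_as="my_saved_output") would print the
call, return obj.result, and save it under a table name chosen by the user.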