This section explains how Teradata SQL Engine and Teradata Machine Learning Engine work together to process a SQL request.
SELECT * FROM kMeans ( ON (SELECT * FROM kmeanssample) as InputTable OUTPUT TABLE outputTable ('kmeanssample_output') ...
- Contract phase: Teradata SQL Engine receives the SQL request
In the SQL Engine, the Parsing Engine (PE) parses the SQL request and sends a metadata message through the QueryGrid™ fabric to the Planner Parser in the ML Engine to discover the metadata needed for function execution.
In addition to basic information such as the function name, version, type, and description, the metadata message contains parameters that specify the input tables, the output tables, and the input columns.
For the example KMeans function, which performs k-means clustering on a dataset, the metadata message defines the following tables:
- Required input table "InputTable" containing the list of features for clustering data
- Optional input table "CentroidsTable" containing the initial seed means for the clusters
- Required output table "OutputTable" in which to output the centroids of the clusters
- Optional output table "ClusteredOutput" in which to store the clustered output
Depending on the function, the metadata message may contain other parameters.
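The table roles listed above can be pictured as a small data structure. The following Python sketch is illustrative only: the class names, fields, and the `missing_required` helper are assumptions made for this example and do not represent the actual QueryGrid™ message format or any Teradata API. It models the four KMeans table slots and checks that a request binds every required one.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class TableSpec:
    """One table slot declared by a function (hypothetical structure)."""
    name: str
    direction: str  # "input" or "output"
    required: bool
    description: str


@dataclass
class MetadataMessage:
    """Illustrative stand-in for the metadata message, not the real wire format."""
    function_name: str
    tables: List[TableSpec] = field(default_factory=list)

    def missing_required(self, bound: Set[str]) -> List[str]:
        """Return the required table slots that the request did not bind."""
        return [t.name for t in self.tables if t.required and t.name not in bound]


# Table slots as described for the example KMeans function.
KMEANS_METADATA = MetadataMessage(
    function_name="KMeans",
    tables=[
        TableSpec("InputTable", "input", True, "features for clustering"),
        TableSpec("CentroidsTable", "input", False, "initial seed means"),
        TableSpec("OutputTable", "output", True, "cluster centroids"),
        TableSpec("ClusteredOutput", "output", False, "clustered output"),
    ],
)

# The example request binds only InputTable and OutputTable.
missing = KMEANS_METADATA.missing_required({"InputTable", "OutputTable"})
```

In this sketch the example request leaves no required slot unbound, because the two optional tables may be omitted.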
In the ML Engine, the Planner Parser creates input tables based on the metadata parameters, specifically the column data types, and executes the function against them with the options specified by the user. The generated output table, with its resulting column data types, is then used to define the output table in the SQL Engine.
The Access Module Processors (AMPs) in the SQL Engine then set up connections to the vworkers in the ML Engine through the QueryGrid™ fabric, preparing to export data for analysis.
- Execute phase: Teradata SQL Engine exports the table to Teradata ML Engine for processing
When the SQL Engine receives the function execution metadata from the ML Engine, the QueryGrid™ connector performs data type mapping and conversion, and initiates the export of the kmeanssample table from the AMPs through the QueryGrid™ fabric to the vworkers in the ML Engine.
The ML Engine temporarily stores the data for function execution.
- Execute phase: Teradata ML Engine executes the KMeans SQL-MapReduce function
- Execute phase: Teradata SQL Engine imports the result table from Teradata ML Engine
Once the function execution is finished, the SQL Engine pulls the result table from the ML Engine through the QueryGrid™ fabric.
If additional tables are created as part of the execution output, the ML Engine notifies the SQL Engine, and exports the additional tables through separate SQL Engine sessions.
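The overall order of the phases can be summarized in a short Python sketch. It is a simplified model only: the `Phase` names and the `run_request` function are hypothetical and exist purely to show the sequence described in this section, including the separate import of any additional output tables.

```python
from enum import Enum


class Phase(Enum):
    """Phases of one request, in the order described in this section."""
    CONTRACT = "exchange metadata and define the output table"
    EXPORT = "SQL Engine exports the input table to the ML Engine"
    RUN = "ML Engine executes the function"
    IMPORT = "SQL Engine imports the result table"


def run_request(extra_output_tables=()):
    """Return the ordered list of steps for one request (illustrative only)."""
    # Iterating over an Enum yields its members in definition order.
    steps = [phase.name for phase in Phase]
    # Additional output tables are transferred afterwards, each through
    # its own session in this simplified model.
    steps += [f"IMPORT_EXTRA:{name}" for name in extra_output_tables]
    return steps
```

For example, `run_request(["ClusteredOutput"])` lists the four phases followed by one extra import step for the optional clustered output table.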