A JSON (JavaScript Object Notation) function descriptor is a JSON file that Teradata ML Engine uses for function metadata processing.
Major Sections of JSON Descriptor
Section | Description
Header | Specifies function name, version, and type information.
Input_tables | Specifies function ON clauses.
Argument_clauses | Specifies function arguments.
Header Section Fields
Field | Type | Description
json_schema_major_version | string | Major version of JSON schema. Set to "1".
json_schema_minor_version | string | Minor version of JSON schema. Set to "2".
json_content_version | string | JSON content version. Set to "1".
function_name | string | Name of function class file.
function_version | string | Version of function.
function_type | string | Specifies function type ("driver" or "non-driver").
short_description | string | Short description of function.
long_description | string | Long description of function.
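
The header rules above can be checked mechanically. The following is a minimal Python sketch, not part of Teradata ML Engine: the helper name check_header is hypothetical, and treating every header field as mandatory is an assumption, since the table does not say which fields may be omitted.

```python
# Hypothetical helper (not a Teradata API): checks header fields of a
# descriptor that has already been parsed into a Python dict.
HEADER_FIELDS = (
    "json_schema_major_version",
    "json_schema_minor_version",
    "json_content_version",
    "function_name",
    "function_version",
    "function_type",
    "short_description",
    "long_description",
)

# Fixed values stated in the header table.
FIXED_VALUES = {
    "json_schema_major_version": "1",
    "json_schema_minor_version": "2",
    "json_content_version": "1",
}


def check_header(descriptor):
    """Return a list of problems found in the header fields (empty if none)."""
    problems = []
    for field in HEADER_FIELDS:
        value = descriptor.get(field)
        if value is None:
            # Assumption: every header field is treated as required.
            problems.append(f"missing header field: {field}")
        elif not isinstance(value, str):
            problems.append(f"{field} must be a string")
    for field, expected in FIXED_VALUES.items():
        if field in descriptor and descriptor[field] != expected:
            problems.append(f'{field} should be "{expected}"')
    function_type = descriptor.get("function_type")
    if isinstance(function_type, str) and function_type not in ("driver", "non-driver"):
        problems.append('function_type must be "driver" or "non-driver"')
    return problems
```

An empty list means the header matches the table; each string in a non-empty list names one field to correct.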
Input_tables Section Fields
Field | Type | Description
name | string | Specifies ON clause alias.
datatype | string | Set to "TABLE_ALIAS" for each ON clause.
requiredInputKind | list of string | Specifies partition information of ON clause. Can be a combination of PartitionByKey, PartitionByAny, or Dimension.
partitionByOne | boolean | Specifies whether ON clause accepts PartitionByOne. For this to be true, requiredInputKind must be PartitionByKey.
isOrdered | boolean | Specifies whether ON clause accepts ORDER BY clause.
isRequired | boolean | Specifies whether ON clause is required.
description | string | Description of ON clause.
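
The dependency between partitionByOne and requiredInputKind is easy to overlook when writing an entry by hand. The sketch below is illustrative only: check_input_table is a hypothetical helper, and it reads "requiredInputKind must be PartitionByKey" as "the list contains PartitionByKey", which is an interpretation rather than a documented rule.

```python
# Hypothetical helper (not a Teradata API): checks one input_tables entry
# that has already been parsed into a Python dict.
ALLOWED_INPUT_KINDS = {"PartitionByKey", "PartitionByAny", "Dimension"}


def check_input_table(entry):
    """Return a list of problems found in one input_tables entry."""
    problems = []
    if entry.get("datatype") != "TABLE_ALIAS":
        problems.append('datatype must be "TABLE_ALIAS"')
    kinds = entry.get("requiredInputKind", [])
    unknown = [kind for kind in kinds if kind not in ALLOWED_INPUT_KINDS]
    if unknown:
        problems.append(f"unknown requiredInputKind values: {unknown}")
    # Assumption: partitionByOne needs PartitionByKey to appear in the list.
    if entry.get("partitionByOne") is True and "PartitionByKey" not in kinds:
        problems.append("partitionByOne requires requiredInputKind PartitionByKey")
    for flag in ("partitionByOne", "isOrdered", "isRequired"):
        if flag in entry and not isinstance(entry[flag], bool):
            problems.append(f"{flag} must be a boolean")
    return problems
```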
Argument_clauses Section Fields
Field | Type | Description
name | string | Specifies argument name.
datatype | string | Specifies data type of argument value; one of "BOOLEAN", "INTEGER", "DOUBLE", "LONG", "STRING", "TABLE_NAME", "COLUMN_NAMES", or "COLUMNS".
isRequired | boolean | Specifies whether argument clause is required or optional.
defaultValue | boolean, numeric, or string, depending on the data type of the argument | Specifies default value of argument (value for function to use if the user omits argument). Specify only if isRequired is set to false.
permittedValues | list of string | Specifies permitted values of argument clause.
description | string | Description of argument clause.
isOutputTable | boolean | Specifies whether argument clause accepts database table as output. For this value to be true, data type must be set to "TABLE_NAME".
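
The argument clause fields carry two stated constraints: defaultValue belongs only on optional arguments, and isOutputTable can be true only for "TABLE_NAME" arguments. A minimal Python sketch of those checks follows. The helper name check_argument_clause is hypothetical, and the final check (that defaultValue appears in permittedValues) is an added assumption that the table does not state.

```python
# Hypothetical helper (not a Teradata API): checks one argument_clauses entry
# that has already been parsed into a Python dict.
ARGUMENT_DATATYPES = {
    "BOOLEAN", "INTEGER", "DOUBLE", "LONG",
    "STRING", "TABLE_NAME", "COLUMN_NAMES", "COLUMNS",
}


def check_argument_clause(clause):
    """Return a list of problems found in one argument_clauses entry."""
    problems = []
    datatype = clause.get("datatype")
    if datatype not in ARGUMENT_DATATYPES:
        problems.append(f"unsupported datatype: {datatype!r}")
    # defaultValue is specified only if isRequired is set to false.
    if "defaultValue" in clause and clause.get("isRequired") is not False:
        problems.append("defaultValue is allowed only when isRequired is false")
    # isOutputTable can be true only when datatype is "TABLE_NAME".
    if clause.get("isOutputTable") is True and datatype != "TABLE_NAME":
        problems.append('isOutputTable can be true only for datatype "TABLE_NAME"')
    # Assumption (not stated in the table): a default should be a permitted value.
    permitted = clause.get("permittedValues")
    if permitted is not None and "defaultValue" in clause:
        if clause["defaultValue"] not in permitted:
            problems.append("defaultValue is not one of permittedValues")
    return problems
```

Running a check like this over every entry in argument_clauses can surface a default that is misspelled relative to its permitted values before the descriptor is deployed.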
JSON Descriptor Example: GMMFit Function
{
  "json_schema_major_version": "1",
  "json_schema_minor_version": "2",
  "json_content_version": "1",
  "function_name": "GMMFit",
  "function_version": "1.2",
  "function_type": "driver",
  "short_description": "Fits a Gaussian Mixture Model to data.",
  "long_description": "Clusters data using a Gaussian Mixture Model or a Dirichlet Process Gaussian Mixture Model.",
  "input_tables": [
    {
      "requiredInputKind": [
        "PartitionByKey"
      ],
      "isOrdered": false,
      "partitionByOne": true,
      "name": "init_params",
      "isRequired": true,
      "description": "Contains initial values for the cluster weights, means, and covariances.",
      "datatype": "TABLE_ALIAS"
    }
  ],
  "argument_clauses": [
    {
      "isOutputTable": false,
      "name": "InputTable",
      "isRequired": true,
      "description": "Specifies the name of the table that contains the input data to be clustered.",
      "datatype": "TABLE_NAME"
    },
    {
      "isOutputTable": true,
      "name": "OutputTable",
      "isRequired": true,
      "description": "Specifies the name of the output table to which the function outputs cluster information.",
      "datatype": "TABLE_NAME"
    },
    {
      "defaultValue": 20,
      "name": "MaxClusternum",
      "isRequired": false,
      "description": "Specifies the maximum number of clusters in a Dirichlet process model.",
      "datatype": "INTEGER"
    },
    {
      "permittedValues": [
        "SPHERICAL",
        "DIAGONAL",
        "FULL",
        "TIED"
      ],
      "defaultValue": "DIAGONAL",
      "name": "CovarianceType",
      "isRequired": false,
      "description": "Specifies the type of the covariance matrices.",
      "datatype": "STRING"
    },
    {
      "defaultValue": 0.001,
      "name": "Tolerance",
      "isRequired": false,
      "description": "Specifies the minimum change in log-likelihood between iterations that causes the function to terminate.",
      "datatype": "DOUBLE"
    },
    {
      "defaultValue": false,
      "name": "PackOutput",
      "isRequired": false,
      "description": "Specifies whether the function packs the output. The default value is 'false'.",
      "datatype": "BOOLEAN"
    }
  ]
}
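
Because the descriptor is plain JSON, any JSON parser can read it, and a parse failure is a quick signal of a punctuation mistake in the file. The sketch below uses Python's standard json module on a shortened descriptor. The function summarize_descriptor and the trimmed EXAMPLE text are illustrative only and are not part of Teradata ML Engine.

```python
import json


def summarize_descriptor(text):
    """Parse descriptor text and list its function name, ON clause aliases, and arguments."""
    # json.loads raises json.JSONDecodeError if the text is not well-formed JSON.
    descriptor = json.loads(text)
    return {
        "function": descriptor["function_name"],
        "on_clauses": [table["name"] for table in descriptor.get("input_tables", [])],
        "arguments": [arg["name"] for arg in descriptor.get("argument_clauses", [])],
    }


# Shortened, illustrative descriptor; a real one also carries the other header fields.
EXAMPLE = """
{
  "function_name": "GMMFit",
  "input_tables": [{"name": "init_params", "datatype": "TABLE_ALIAS"}],
  "argument_clauses": [
    {"name": "InputTable", "datatype": "TABLE_NAME", "isRequired": true},
    {"name": "MaxClusternum", "datatype": "INTEGER", "isRequired": false, "defaultValue": 20}
  ]
}
"""

if __name__ == "__main__":
    print(summarize_descriptor(EXAMPLE))
```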