1.0 - 8.00 - GLML1L2 Arguments - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
1.0
8.00
Release Date
May 2019
Content Type
Programming Reference
Publication ID
B700-4003-098K
Language
English (United States)
FactorTable
[Optional] Specify the name of the output table that stores the result of the categorical-to-numerical conversion or the randomization. Because the table is produced by one of those operations, you must also specify either CategoricalColumns or Randomization ('true').
You can use factor_table as InputTable for future GLML1L2 function calls, thereby saving the function from repeating the categorical-to-numerical conversion or randomization.
FeatureColumns
Specify the names of the InputTable columns that contain the variables to use as predictors (independent variables) in the model.
CategoricalColumns
[Optional] Specify the names of the InputTable columns that contain categorical variables, and which of their categories to use in the model.
Each categorical_column_and_categories value takes one of these forms:

'categorical_column:max_cardinality'
Uses the max_cardinality most common categories in categorical_column and groups all other categories into the category 'others'. For example, 'column_a:3' specifies that for column_a, the function uses the 3 most common categories and sets the category of rows that do not belong to those 3 categories to 'others'.

'categorical_column:(category [,...])'
Uses the specified categories of categorical_column and groups all other categories into the category 'others'. For example, 'column_a : (red, yellow, blue)' specifies that for column_a, the function uses the categories red, yellow, and blue, and sets the category of rows that do not belong to those categories to 'others'.

'categorical_column'
Uses all categories in categorical_column.
If you use this argument, you must also specify the FactorTable argument, and in the FeatureColumns argument, you must specify each categorical_column.
Default behavior: The function treats all variables as numerical.
For information about columns that you must identify as categorical, see Identification of Categorical Columns.
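The two grouping forms of CategoricalColumns can be illustrated with a minimal Python sketch. This is not the function's implementation: the helper names group_by_max_cardinality and group_by_explicit_list are hypothetical, and the handling of ties between equally common categories is an assumption, because this reference does not describe it.

```python
from collections import Counter


def group_by_max_cardinality(values, max_cardinality, other_label="others"):
    """Mimic 'categorical_column:max_cardinality'.

    Keeps the max_cardinality most common categories and relabels the rest.
    Assumption: ties are broken by first appearance (Counter.most_common order).
    """
    counts = Counter(values)
    keep = {category for category, _ in counts.most_common(max_cardinality)}
    return [value if value in keep else other_label for value in values]


def group_by_explicit_list(values, categories, other_label="others"):
    """Mimic 'categorical_column:(category [,...])'.

    Keeps only the listed categories and relabels every other value.
    """
    keep = set(categories)
    return [value if value in keep else other_label for value in values]


# Hypothetical sample column: yellow x4, red x3, blue x2, green x1.
column_a = ["red"] * 3 + ["blue"] * 2 + ["green"] + ["yellow"] * 4

top_two = group_by_max_cardinality(column_a, 2)
listed = group_by_explicit_list(column_a, ["red", "yellow", "blue"])
```

With this sample, the 'column_a:2' form keeps yellow and red and maps blue and green to 'others', while the explicit list (red, yellow, blue) maps only green to 'others'.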
Randomization
[Optional] Specify whether to randomize the InputTable data. If you use this argument, you must also specify the FactorTable argument.
Default: 'false'
ResponseColumn
Specify the name of the InputTable column that contains the responses.
Family
[Optional] Specify the exponential family distribution to use for the response.
Default: 'GAUSSIAN'
Alpha
[Optional] Specify the mixing parameter for the penalty computation (see the following table). The value of alpha must be in the range [0, 1]. If alpha is in (0, 1), it represents α in the elastic net regularization formula in Generalized Linear Model Functions.
alpha    Regularization Type    Parameter Description
0        Ridge                  ½
(0,1)    Elastic net
1        LASSO
Default: 0
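How Alpha and Lambda interact can be sketched in Python. The exact formula is given in Generalized Linear Model Functions and is not reproduced in this section, so the code below assumes the widely used elastic net form lambda * (alpha * L1 + (1 - alpha) / 2 * L2²). Under that assumption, alpha = 0 leaves only the ridge term with its factor of ½, and alpha = 1 leaves only the LASSO term. The function name elastic_net_penalty is hypothetical, and whether the intercept is excluded from the penalty is not addressed here.

```python
def elastic_net_penalty(coefficients, alpha, lam):
    """Penalty value under the ASSUMED form:
    lam * (alpha * sum|b| + (1 - alpha) / 2 * sum(b^2)).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    l1_norm = sum(abs(b) for b in coefficients)
    l2_norm_squared = sum(b * b for b in coefficients)
    return lam * (alpha * l1_norm + 0.5 * (1.0 - alpha) * l2_norm_squared)


# Hypothetical coefficient vector: L1 norm is 7, squared L2 norm is 25.
beta = [3.0, -4.0]

ridge = elastic_net_penalty(beta, alpha=0.0, lam=1.0)   # 0.5 * 25
lasso = elastic_net_penalty(beta, alpha=1.0, lam=1.0)   # 7
mixed = elastic_net_penalty(beta, alpha=0.5, lam=2.0)   # 2 * (3.5 + 6.25)
none = elastic_net_penalty(beta, alpha=0.5, lam=0.0)    # lam = 0 disables it
```

The last call reflects the Lambda description: a value of zero removes the regularization term regardless of alpha.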
Lambda
[Optional] Specify the parameter that controls the magnitude of the regularization term. A value of zero disables regularization.
Default: 0
StopThreshold
[Optional] Specify the convergence threshold. The threshold must be a nonnegative DOUBLE PRECISION value.
Default: 1.0e-7
MaxIterNum
[Optional] Specify the maximum number of iterations over the data. The value of max_iterations must be an INTEGER in the range [1, 100000].
Default: 10000
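The constraints and argument dependencies described on this page can be collected into a small pre-flight check. The following Python sketch is illustrative only: validate_glml1l2_arguments is a hypothetical helper, the arguments are assumed to be held in a plain dictionary, and Randomization is assumed to be a Python bool rather than the 'true'/'false' strings that the function itself accepts. It checks only rules stated above and uses the documented defaults for omitted arguments.

```python
def validate_glml1l2_arguments(args):
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    features = list(args.get("FeatureColumns", []))
    categorical = list(args.get("CategoricalColumns", []))
    randomize = args.get("Randomization", False)
    factor_table = args.get("FactorTable")

    # FeatureColumns and ResponseColumn are not marked [Optional].
    if not features:
        problems.append("FeatureColumns is required")
    if not args.get("ResponseColumn"):
        problems.append("ResponseColumn is required")

    # FactorTable goes together with CategoricalColumns or Randomization.
    if factor_table and not (categorical or randomize):
        problems.append("FactorTable requires CategoricalColumns or Randomization")
    if (categorical or randomize) and not factor_table:
        problems.append("CategoricalColumns and Randomization require FactorTable")

    # Every categorical column must also be listed in FeatureColumns.
    for spec in categorical:
        column = spec.split(":", 1)[0].strip()
        if column not in features:
            problems.append(f"categorical column {column!r} is not in FeatureColumns")

    # Numeric ranges, with the documented defaults.
    alpha = args.get("Alpha", 0)
    if not 0 <= alpha <= 1:
        problems.append("Alpha must be in [0, 1]")
    if args.get("StopThreshold", 1.0e-7) < 0:
        problems.append("StopThreshold must be nonnegative")
    max_iter = args.get("MaxIterNum", 10000)
    if isinstance(max_iter, bool) or not isinstance(max_iter, int) \
            or not 1 <= max_iter <= 100000:
        problems.append("MaxIterNum must be an integer in [1, 100000]")
    return problems


# Hypothetical argument sets; the table and column names are made up.
ok = validate_glml1l2_arguments({
    "FeatureColumns": ["gpa", "masters"],
    "CategoricalColumns": ["masters:3"],
    "FactorTable": "factor_table",
    "ResponseColumn": "admitted",
    "Alpha": 0.5,
})
bad = validate_glml1l2_arguments({
    "FeatureColumns": ["gpa"],
    "CategoricalColumns": ["masters"],
    "Alpha": 1.5,
    "MaxIterNum": 0,
})
```

Returning a list of problems rather than raising on the first one lets a caller report every violated rule at once.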