Arguments - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
dita:id
B700-1021
lifecycle
previous
Product Category
Software
Argument Category Description
InputTable Required Specifies the name of the table that contains the columns described in the table in Input.
OutputTable Required Specifies the name for the output table of coefficients. This table must not exist. For GLM, the output is written to the screen, and the output table is the table where the coefficients are stored.
InputColumns Optional Specifies the name of the column that contains the dependent variable (Y), followed by the names of the columns that contain the predictor variables (Xi), in this format: 'Y,X1,X2,...,Xp'.

By default, the first column of the input table is Y and the remaining input table columns are Xi, except for the column specified by the Weight argument.
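The default rule above can be sketched in Python. This is only an illustration of how the response and predictor columns are chosen, not the function's implementation; the helper name resolve_columns and its parameters are hypothetical.

```python
def resolve_columns(table_columns, input_columns=None, weight=None):
    """Return (response, predictors) following the documented rule.

    Hypothetical helper for illustration; not part of Aster Analytics.
    """
    if input_columns is not None:
        # Explicit form: 'Y,X1,X2,...,Xp'
        names = [name.strip() for name in input_columns.split(",")]
        return names[0], names[1:]
    # Default: the first table column is Y; the remaining columns are the
    # predictors, except for the column named by the Weight argument.
    response = table_columns[0]
    predictors = [c for c in table_columns[1:] if c != weight]
    return response, predictors


print(resolve_columns(["y", "age", "income", "w"], weight="w"))
print(resolve_columns(["y", "age", "income"], input_columns="income, age"))
```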

CategoricalColumns Optional Specifies columnname-value pairs, each of which contains the name of a categorical input column and the category values in that column that the function is to include in the model that it generates.

Each columnname-value pair has one of these forms:

  • 'columnname:max_cardinality'

    Limits the categories in the column to the max_cardinality most common ones and groups the others together as 'others'. For example, 'column_a:3' specifies that for column_a, the function uses the 3 most common categories and sets the category of the rows that do not belong to those 3 categories to 'others'.

  • 'columnname:(category [, ...])'

    Limits the categories in the column to those that you specify and groups the others together as 'others'. For example, 'column_a : (red, yellow, blue)' specifies that for column_a, the function uses the categories red, yellow, and blue, and sets the category of the rows that do not belong to those categories to 'others'.

  • 'columnname'

    All category values appear in the model.

If you specify the InputColumns argument, then the columns that you specify in the CategoricalColumns argument must also appear in the InputColumns argument.
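The effect of the three CategoricalColumns forms on a column's values can be sketched as follows. The helper group_categories is hypothetical and only mimics the documented grouping into 'others'. How the function breaks ties between equally common categories is not stated in this guide; the sketch simply follows the ordering of Python's Counter.most_common.

```python
from collections import Counter


def group_categories(values, max_cardinality=None, keep=None):
    """Map category values outside the kept set to 'others'.

    Hypothetical helper for illustration; not part of Aster Analytics.
    """
    if keep is None and max_cardinality is not None:
        # 'columnname:max_cardinality' form: keep the most common categories.
        keep = [cat for cat, _ in Counter(values).most_common(max_cardinality)]
    if keep is None:
        # Bare 'columnname' form: every category value is kept.
        return list(values)
    kept = set(keep)
    return [v if v in kept else "others" for v in values]


colors = ["red", "red", "red", "blue", "blue", "green", "yellow"]
print(group_categories(colors, max_cardinality=2))
print(group_categories(colors, keep=["red", "yellow"]))
```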

Family Optional Specifies the exponential family of the response distribution. Supported values are:
  • 'BINOMIAL' (default)
  • 'LOGISTIC' (equivalent to 'BINOMIAL')
  • 'POISSON'
  • 'GAUSSIAN'
  • 'GAMMA'
  • 'INVERSE_GAUSSIAN'
  • 'NEGATIVE_BINOMIAL'
Link Optional Specifies the link function. The default value is 'CANONICAL'. The canonical link functions (default link functions) and the link functions that are allowed for each exponential family are listed in the table in Background.
Weight Optional Specifies the name of an input table column that contains the weights to assign to responses. By default, all observations have equal weight.

You can use non-NULL weights to indicate that different observations have different dispersions (with the weights being inversely proportional to the dispersions). Equivalently, when the weights are positive integers wi, each response yi is the mean of wi unit-weight observations. A binomial GLM uses prior weights to give the number of trials when the response is the proportion of successes. A Poisson GLM rarely uses weights.

If the weight is less than the response value, then the function throws an exception. Therefore, if the response value is greater than 1, you must specify a weight that is greater than or equal to the response value.
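The weight constraint can be checked before calling the function. The sketch below assumes a binomial-style input in which the response column holds a count of successes and the weight column holds the number of trials. The helper check_weights is hypothetical, and ValueError only stands in for whatever exception the function actually throws.

```python
def check_weights(responses, weights):
    """Raise ValueError if any weight is smaller than its response value.

    Hypothetical pre-check for illustration; not part of Aster Analytics.
    """
    for row, (y, w) in enumerate(zip(responses, weights)):
        if w < y:
            raise ValueError(f"row {row}: weight {w} is less than response {y}")


# Assumed layout: successes as the response, trials as the weight.
successes = [3, 0, 5]
trials = [5, 2, 5]
check_weights(successes, trials)  # passes silently: every weight >= response
```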

Threshold Optional Specifies the convergence threshold. The default value is 0.01.
MaxIterNum Optional Specifies the maximum number of iterations that the algorithm runs before quitting if the convergence threshold has not been met. The parameter max_iterations must be a positive INTEGER value. The default value is 25.
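How Threshold and MaxIterNum interact can be illustrated with a small Newton (iteratively reweighted least squares) fit of a logistic model with one predictor. This is a sketch, not the function's algorithm. This guide does not state which quantity is compared to the threshold, so the sketch assumes the largest change in any coefficient between iterations. The function name fit_logistic and the sample data are made up for illustration.

```python
import math


def fit_logistic(xs, ys, threshold=0.01, max_iter_num=25):
    """Fit y ~ b0 + b1*x by Newton's method for the logit link.

    Returns (b0, b1, iterations, converged).
    """
    b0 = b1 = 0.0
    for iteration in range(1, max_iter_num + 1):
        # Accumulate the score vector g and the information matrix H.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: solve the 2x2 system H * delta = g.
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        # Assumed stopping rule: largest coefficient change below threshold.
        if max(abs(d0), abs(d1)) < threshold:
            return b0, b1, iteration, True
    # Iteration limit reached before the threshold was met.
    return b0, b1, max_iter_num, False


xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
print(fit_logistic(xs, ys))                  # stops once the threshold is met
print(fit_logistic(xs, ys, max_iter_num=1))  # stops at the iteration limit
```

A smaller threshold generally costs more iterations; if the limit is reached first, the last flag in the returned tuple is False.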
Intercept Optional Specifies whether the function uses an intercept. For example, in β0 + β1*X1 + β2*X2 + ... + βp*Xp, the intercept is β0. The default value is 'true'.
Step Optional Specifies whether the function uses a step. The default value is 'false'. If the function uses a step, then it runs with the GLM model that has the lowest Akaike information criterion (AIC) score, drops one predictor from the current predictor group, and repeats this process until no predictor remains.
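One reading of the Step procedure is backward elimination: at each round, every model with one predictor dropped from the current group is scored, and the round continues with the lowest-AIC candidate until no predictor remains. The sketch below follows that reading, which is an interpretation of the description above rather than a confirmed account of the function. The names backward_steps and toy_aic are hypothetical, and the AIC scores are made-up numbers used only to exercise the loop; a real run would fit a GLM for each candidate.

```python
def backward_steps(predictors, aic):
    """Return the list of (predictor_tuple, aic_score) models visited."""
    current = tuple(predictors)
    path = [(current, aic(current))]
    while current:
        # Candidates: the current group with exactly one predictor dropped.
        candidates = [current[:i] + current[i + 1:] for i in range(len(current))]
        # Continue with the candidate that has the lowest AIC score.
        current = min(candidates, key=aic)
        path.append((current, aic(current)))
    return path


# Made-up AIC scores for illustration only.
scores = {
    frozenset({"x1", "x2", "x3"}): 120.0,
    frozenset({"x1", "x2"}): 112.0,
    frozenset({"x1", "x3"}): 118.0,
    frozenset({"x2", "x3"}): 125.0,
    frozenset({"x1"}): 115.0,
    frozenset({"x2"}): 130.0,
    frozenset({"x3"}): 131.0,
    frozenset(): 140.0,
}


def toy_aic(preds):
    return scores[frozenset(preds)]


path = backward_steps(["x1", "x2", "x3"], toy_aic)
best = min(path, key=lambda step: step[1])
print(best)
```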