Generalized Linear Model (GLM) Functions (ML Engine) - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 9.02, 9.01, 2.0, 1.3
Published: February 2022
Language: English (United States)
Last Update: 2022-02-10

The GLM, GLML1L2, and GLMPerSegment functions perform generalized linear model regression analysis using a user-specified distribution family. Their output is input to the GLMPredict_MLE, GLML1L2Predict, and GLMPredictPerSegment functions, respectively, which perform generalized linear model prediction on new input data.

The GLM, GLML1L2, and GLMPerSegment functions differ in these ways:

GLM (ML Engine)
  • Description: Unbiased ordinary least squares estimator. Builds a single model.
  • Supported Distribution Families: See Supported Family/Link Function Combinations.
  • Supported Regularization Models: None
  • Output Tables: Model table

GLML1L2 (ML Engine)
  • Description: Biased estimator based on regularization. Builds a single model.
  • Supported Distribution Families: Binomial, Gaussian
  • Supported Regularization Models: Ridge, LASSO, and elastic net
  • Output Tables: Model table; [Optional] Factor table

GLMPerSegment (ML Engine)
  • Description: Biased estimator based on regularization. For a partitioned input table, creates a model for each partition.
  • Supported Distribution Families: Binomial, Gaussian
  • Supported Regularization Models: Ridge, LASSO, and elastic net
  • Output Tables: Model table
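
The practical difference between building a single model and building one model per partition can be sketched outside Vantage. The following Python example is a conceptual analogy only; it does not use Teradata SQL or the ML Engine functions, and the data, segment labels, and helper functions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two segments whose responses follow different linear models.
X = rng.normal(size=(100, 2))
segment = np.repeat(["a", "b"], 50)
y = np.where(segment == "a", X @ [1.0, 2.0], X @ [3.0, -1.0])
y = y + rng.normal(scale=0.1, size=100)

def ols_fit(X, y):
    """Unbiased ordinary least squares fit (single-model case, as in GLM)."""
    Xd = np.column_stack([np.ones(len(X)), X])      # prepend an intercept column
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return beta                                     # [beta0, beta1, beta2]

def ridge_fit(X, y, lam=1.0):
    """Biased, regularized (ridge-style) fit, analogous to GLML1L2/GLMPerSegment."""
    Xd = np.column_stack([np.ones(len(X)), X])
    penalty = lam * np.eye(Xd.shape[1])
    penalty[0, 0] = 0.0                             # leave the intercept unpenalized
    return np.linalg.solve(Xd.T @ Xd + penalty, Xd.T @ y)

# Single model over all rows (GLM / GLML1L2 style).
single_model = ols_fit(X, y)

# One model per partition (GLMPerSegment style).
per_segment = {s: ridge_fit(X[segment == s], y[segment == s])
               for s in np.unique(segment)}
print(single_model)
print(per_segment)
```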

Regularization

Regularization is a technique for reducing overfitting and thus decreasing the variance of trained models. GLM models are fit by minimizing a loss function, such as the sum of squared errors. For example, given a predictor vector X ∈ ℝ^p, a response variable Y ∈ ℝ, and N observation pairs (xᵢ, yᵢ), you can find the model parameters β0 and β with this formula:


\[
(\hat{\beta}_0, \hat{\beta}) = \underset{\beta_0,\, \beta}{\arg\min} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^T \beta \right)^2
\]
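
To make the loss-minimization view concrete, the following sketch (NumPy and SciPy on synthetic data; every name in it is illustrative rather than part of the ML Engine) minimizes the sum of squared errors numerically and checks the result against the closed-form least-squares coefficients.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N, p = 200, 3
X = rng.normal(size=(N, p))                         # each row is a predictor vector in R^p
y = 1.5 + X @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.2, size=N)

def sse(params):
    """Loss: sum of squared errors over the N observation pairs."""
    beta0, beta = params[0], params[1:]
    residuals = y - beta0 - X @ beta
    return np.sum(residuals ** 2)

# Fit by minimizing the loss directly.
fit = minimize(sse, x0=np.zeros(p + 1))

# Closed-form least-squares solution for comparison.
Xd = np.column_stack([np.ones(N), X])
closed_form, *_ = np.linalg.lstsq(Xd, y, rcond=None)

print(fit.x)          # approximately [1.5, 0.5, -1.0, 2.0]
print(closed_form)    # same values
```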

These fits can be regularized by adding a penalty function P(β) to the loss function being minimized. For example:


\[
(\hat{\beta}_0, \hat{\beta}) = \underset{\beta_0,\, \beta}{\arg\min} \left[ \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^T \beta \right)^2 + \lambda P(\beta) \right]
\]

where λ controls the strength of the penalty function.
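
A penalized fit adds λ·P(β) to the loss before minimizing. The sketch below, again a standalone illustration on synthetic data with a ridge-style penalty assumed for P(β), shows how larger values of λ pull the fitted coefficients toward zero.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
N, p = 200, 3
X = rng.normal(size=(N, p))
y = 1.5 + X @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.2, size=N)

def ridge_penalty(beta):
    """P(beta): sum of the squared model parameters."""
    return np.sum(beta ** 2)

def penalized_sse(params, lam, penalty):
    """Sum of squared errors plus lambda times a penalty on beta (not beta0)."""
    beta0, beta = params[0], params[1:]
    residuals = y - beta0 - X @ beta
    return np.sum(residuals ** 2) + lam * penalty(beta)

for lam in (0.0, 10.0, 1000.0):
    fit = minimize(penalized_sse, x0=np.zeros(p + 1), args=(lam, ridge_penalty))
    print(lam, np.round(fit.x[1:], 3))              # coefficients shrink as lambda grows
```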

For logistic regression, the loss function is based on the log likelihood, as follows:


\[
L(\beta_0, \beta) = -\sum_{i=1}^{N} \left[ y_i \left( \beta_0 + x_i^T \beta \right) - \log\!\left( 1 + e^{\beta_0 + x_i^T \beta} \right) \right]
\]
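
Written out directly, this log-likelihood-based loss takes only a few lines. The sketch below is generic (not the ML Engine implementation); it assumes responses coded as 0/1 and uses a numerically stable form of log(1 + e^η).

```python
import numpy as np

def logistic_loss(beta0, beta, X, y):
    """Negative log-likelihood for logistic regression with y in {0, 1}."""
    eta = beta0 + X @ beta                     # linear predictor for each observation
    # log(1 + exp(eta)) computed stably as logaddexp(0, eta)
    return -np.sum(y * eta - np.logaddexp(0.0, eta))

# Tiny usage example with made-up numbers.
X = np.array([[0.5, 1.0], [-1.2, 0.3], [2.0, -0.7]])
y = np.array([1, 0, 1])
print(logistic_loss(0.1, np.array([0.4, -0.2]), X, y))
```
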
These are three popular penalty functions (compared in the code sketch that follows the list):
  • The sum of the absolute values of the model parameters:

    \[
    P(\beta) = \lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert
    \]

    which is the L1 norm of the model parameters. This regularization technique, also called Least Absolute Shrinkage and Selection Operator (LASSO), was introduced by Robert Tibshirani in 1996. LASSO can shrink some parameters to exactly zero; therefore, you can also use it for variable selection.

  • The sum of the squared values of the model parameters:

    \[
    P(\beta) = \lVert \beta \rVert_2^2 = \sum_{j=1}^{p} \beta_j^2
    \]

    which is the squared L2 norm of the model parameters. This regularization technique is also called ridge regression. With ridge regression, parameter values become smaller as λ increases, but never reach zero.

  • Elastic net regularization, which is a linear combination of the L1 and L2 penalties:

    \[
    P(\beta) = \alpha \lVert \beta \rVert_1 + (1 - \alpha) \lVert \beta \rVert_2^2
    \]

    where the mixing parameter α (0 ≤ α ≤ 1) controls the relative weight of the two penalties.
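
In practice the three penalties behave differently: LASSO can drive some coefficients exactly to zero, ridge only shrinks them, and elastic net falls in between. The sketch below illustrates this with scikit-learn on synthetic data; note that scikit-learn's parameterization (alpha for penalty strength, l1_ratio for the L1/L2 mix) differs from the λ and α used above, and this is not the ML Engine implementation.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge, ElasticNet

rng = np.random.default_rng(3)
N, p = 200, 8
X = rng.normal(size=(N, p))
true_beta = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0])   # several true zeros
y = X @ true_beta + rng.normal(scale=0.5, size=N)

models = {
    "lasso (L1)": Lasso(alpha=0.5),
    "ridge (L2)": Ridge(alpha=0.5),
    "elastic net": ElasticNet(alpha=0.5, l1_ratio=0.5),
}
for name, model in models.items():
    model.fit(X, y)
    coefs = np.round(model.coef_, 2)
    print(f"{name:12s} {coefs}")   # L1 and elastic net zero out irrelevant coefficients
```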

References

  • Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01

  • Tibshirani, R., Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J. and Tibshirani, R. J. (2012), Strong rules for discarding predictors in lasso-type problems. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74: 245–266. doi:10.1111/j.1467-9868.2011.01004.x