CrossValidation Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:id
B700-4003
Product Category
Teradata Vantage™

This example calculates the cross-validation error for four GLM models based on the Gaussian family.
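
As a rough illustration of what a k-fold cross-validation error measures, the sketch below computes a 3-fold mean squared error for a one-predictor least-squares line, with and without an intercept. It is not the CrossValidation function's implementation: the data is synthetic, the helper names (fit_line, kfold_mse) are hypothetical, the fold assignment (row index modulo k) and the averaging of per-fold errors are assumptions, and only the identity link is covered.

```python
from statistics import mean


def fit_line(xs, ys, intercept=True):
    """Least-squares fit of y = a + b*x, or y = b*x when intercept is False."""
    if intercept:
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        return my - b * mx, b
    b = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 0.0, b


def kfold_mse(xs, ys, k, intercept=True):
    """Average the held-out mean squared error over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Assumption: rows are assigned to folds by index modulo k.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        a, b = fit_line(
            [xs[i] for i in train_idx],
            [ys[i] for i in train_idx],
            intercept,
        )
        # Error is measured only on rows the model did not see.
        fold_errors.append(mean([(ys[i] - (a + b * xs[i])) ** 2 for i in test_idx]))
    return mean(fold_errors)


# Synthetic data that follows y = 10 + 2x exactly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ys = [10.0 + 2.0 * x for x in xs]

with_intercept = kfold_mse(xs, ys, k=3, intercept=True)
without_intercept = kfold_mse(xs, ys, k=3, intercept=False)
print(with_intercept, without_intercept)
```

Because the synthetic data has a nonzero offset, the model without an intercept shows a larger held-out error, which is the kind of comparison the cverror column supports.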

Input

The input table, housing_train, is from GLM Example: Gaussian Distribution Analysis.

SQL Call

The LinkFunction and Intercept syntax elements specify the four GLM models to validate.

SELECT * FROM CrossValidation (
  ON housing_train AS InputTable
  OUT TABLE CrossValidationErrorTable (glmcvtable)
  USING
    Family ('gaussian')
    FunctionName ('glm')
    InputColumns ('price','lotsize','bedrooms','bathrms',
      'stories','garagepl','driveway','recroom','fullbase','gashw',
      'airco','prefarea','homestyle')
    CategoricalColumns ('driveway','recroom','fullbase','gashw',
      'airco','prefarea','homestyle')
    LinkFunction ('identity','log','identity','log')
    Intercept ('t','f','f','t')
    FoldNum (3)
    CVParams ('LinkFunction','Intercept')
    Metric ('MSE')
) AS dt;
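
The model names in the Output section are consistent with the LinkFunction and Intercept lists being read position by position, so that the i-th value of each list defines model i. That pairing is inferred from the output of this example rather than stated in the text, so treat the following Python sketch as an assumption about the behavior; the variable names are illustrative only.

```python
# Values copied from the LinkFunction and Intercept syntax elements above.
link_functions = ["identity", "log", "identity", "log"]
intercepts = ["t", "f", "f", "t"]

# Assumption: values are paired by position, and the model suffix is the position.
models = {
    f"cv_outputtable_{i}": (link, icpt)
    for i, (link, icpt) in enumerate(zip(link_functions, intercepts))
}

for name, (link, icpt) in models.items():
    print(name, link, icpt)
```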

Output

message
--------------------------------------------------------------------------
Finished. Results can be found in table specified in the argument CVTable

This query returns the contents of the output table, glmcvtable:

SELECT * FROM glmcvtable;

linkfunction  intercept  model               cverror
------------  ---------  ------------------  ---------------------
identity      t          "cv_outputtable_0"  1.15206987686268E 008
log           f          "cv_outputtable_1"  5.33687796406253E 009
log           t          "cv_outputtable_3"  5.33687796401766E 009
identity      f          "cv_outputtable_2"  1.15206981069037E 008

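The cross-validation error (MSE, where lower is better) is about 1.15 × 10^8 for the default link function, identity, and about 5.34 × 10^9 for the log link function, roughly 46 times higher. The identity link therefore performs better on this data. Including or omitting the intercept changes the error only marginally for either link.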
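
To make the comparison concrete, the following Python sketch ranks the four rows of glmcvtable by cverror. The rows are copied from the output above; the space in each exponent (for example, E 008) is assumed to stand for an omitted plus sign, and parse_float is a hypothetical helper written for this illustration.

```python
# Rows copied from the glmcvtable output: linkfunction, intercept, model, cverror.
rows = [
    ("identity", "t", "cv_outputtable_0", "1.15206987686268E 008"),
    ("log", "f", "cv_outputtable_1", "5.33687796406253E 009"),
    ("log", "t", "cv_outputtable_3", "5.33687796401766E 009"),
    ("identity", "f", "cv_outputtable_2", "1.15206981069037E 008"),
]


def parse_float(text):
    """Parse a value such as '1.15E 008', assuming the space means '+'."""
    return float(text.replace("E ", "E+"))


# Lower MSE is better, so sort ascending by cverror.
ranked = sorted(rows, key=lambda row: parse_float(row[3]))
best, worst = ranked[0], ranked[-1]
ratio = parse_float(worst[3]) / parse_float(best[3])

print("lowest error:", best[2], best[0], best[1])
print("worst-to-best error ratio:", round(ratio, 1))
```

Both identity models rank ahead of both log models, and the gap between the two identity models is small compared with the gap between link functions.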
