CrossValidation Function (ML Engine) | Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
Product Category
Teradata Vantage™

Cross-validation (or rotation estimation) is a model-validation technique for assessing how the results of a statistical analysis generalize to an independent data set. Use this technique when your goal is prediction, to estimate how accurately a predictive model performs in practice.

Typically, you train a model on a training set (a data set for which you know the response variable) and validate it on a test set, also called a validation set (a different data set for which you also know the response variable). The CrossValidation function creates multiple test sets by partitioning the training set, which gives insight into how a model might generalize to an independent data set.

The CrossValidation function works as follows:
  1. It partitions the data randomly into k equal-sized subsamples.
  2. It holds out one subsample as the test set and trains the model on the remaining k-1 subsamples.
  3. It applies the trained model to the test set and calculates the error rate.
  4. It repeats steps 2 and 3 until each of the k subsamples has served as the test set exactly once.
Figure: K-Fold Cross-Validation (how the Machine Learning Engine CrossValidation function works)
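
The k-fold procedure described above can be sketched in plain Python. This is only an illustration of the general technique, not the syntax or internals of the ML Engine CrossValidation function. The helper names (`k_fold_indices`, `train_majority`, `cross_validate`) are hypothetical, the "model" is a stand-in that always predicts the most common training label, and averaging the per-fold error rates at the end is a common convention assumed here rather than something this page states.

```python
import random
from collections import Counter


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k folds of near-equal size."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list keeps fold sizes within one of each other.
    return [indices[i::k] for i in range(k)]


def train_majority(train_labels):
    """Hypothetical stand-in model: always predict the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def cross_validate(labels, k, seed=0):
    """Return the error rate measured on each of the k held-out folds."""
    folds = k_fold_indices(len(labels), k, seed)
    error_rates = []
    for test_fold in folds:
        held_out = set(test_fold)
        # Train on every row that is not in the current test fold.
        train_labels = [labels[i] for i in range(len(labels)) if i not in held_out]
        prediction = train_majority(train_labels)
        # Score the trained model on the held-out fold only.
        errors = sum(1 for i in test_fold if labels[i] != prediction)
        error_rates.append(errors / len(test_fold))
    return error_rates


# Toy data (assumed for illustration): 7 "yes" rows and 3 "no" rows.
labels = ["yes"] * 7 + ["no"] * 3
rates = cross_validate(labels, k=5)
mean_error = sum(rates) / len(rates)
print(rates, mean_error)
```

Each row lands in exactly one test fold, so every observation is used for validation once and for training k-1 times. Swapping `train_majority` for a real training routine leaves the loop structure unchanged.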