CrossValidation - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product

Aster Analytics

Release Number

7.00.02

Published

September 2017

Language

English (United States)

Last Update

2018-04-17

dita:id

B700-1022

Product Category

Software

Cross-validation (or rotation estimation) is a model-validation technique for assessing how the results of a statistical analysis generalize to an independent data set. Use this technique when your goal is prediction, to estimate how accurately a predictive model performs in practice.

You train a model on a training set (a data set for which you know the response variable) and validate it on a test set, also called a validation set (a different data set for which you also know the response variable). The CrossValidation function creates multiple test sets by partitioning the training set. Because the model is evaluated on several held-out partitions rather than one, the function gives a more reliable estimate of how the model generalizes to an independent data set.

The CrossValidation function works as follows:

It partitions the data randomly into k equal-sized subsamples.
It holds out one subsample as the test set and trains the model on the remaining k-1 subsamples.
It applies the trained model to the test set and calculates the error rate.
It repeats the preceding two steps until each of the k subsamples has served as the test set exactly once, for k rounds in total.
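
The steps above can be sketched in a few lines of Python. This is a generic illustration of k-fold cross-validation, not the CrossValidation function's implementation or syntax. The helper names (k_fold_indices, train_majority_class, cross_validate), the stand-in "model" that always predicts the most common training label, and the final averaging of the per-fold error rates are all assumptions made for the example.

```python
import random
from collections import Counter


def k_fold_indices(n_rows, k, seed=0):
    """Randomly partition row indices 0..n_rows-1 into k near-equal folds."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and the number of rows")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Slicing with stride k yields folds whose sizes differ by at most one row.
    return [indices[i::k] for i in range(k)]


def train_majority_class(train_labels):
    """Hypothetical stand-in model: always predict the most common label."""
    return Counter(train_labels).most_common(1)[0][0]


def cross_validate(labels, k, seed=0):
    """Return one error rate per fold for the stand-in model."""
    folds = k_fold_indices(len(labels), k, seed)
    error_rates = []
    for test_fold in folds:
        held_out = set(test_fold)
        # Train only on rows outside the held-out fold.
        train_labels = [lab for i, lab in enumerate(labels) if i not in held_out]
        prediction = train_majority_class(train_labels)
        # Error rate = fraction of held-out rows that are predicted wrongly.
        errors = sum(1 for i in test_fold if labels[i] != prediction)
        error_rates.append(errors / len(test_fold))
    return error_rates


# Toy response variable: 14 "yes" rows and 6 "no" rows.
labels = ["yes"] * 14 + ["no"] * 6
rates = cross_validate(labels, k=5)
mean_error = sum(rates) / len(rates)  # averaging is a common summary, assumed here
print(rates, mean_error)
```

Each row appears in exactly one test fold, so every observation is scored by a model that never saw it during training. Replacing train_majority_class with a real training routine leaves the rest of the loop unchanged.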

For the function to work, you must first partition the data set, and you must not train the model on the test set.

K-Fold Cross-Validation

This function does not work in an SSL-enabled cluster.