Summary - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
dita:id
B700-1021
lifecycle
previous
Product Category
Software

Cross-validation, sometimes called rotation estimation, is a model validation technique for assessing how well the results of a statistical analysis generalize to an independent dataset. It is mainly used when the goal is prediction and you want to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is usually given a dataset of known data on which it is trained (the training dataset) and a dataset of previously unseen data against which it is tested (the testing dataset). The goal of cross-validation is to set aside part of the known data as a validation dataset, so that the model can be tested during the training phase. This gives insight into how the model will generalize to an independent dataset and helps to identify and avoid overfitting.

Cross-validation works as follows. The data is randomly partitioned into k equal-sized subsamples. One subsample is set aside as the validation set, and the model is trained on the remaining k-1 subsamples. The trained model is then applied to the validation set, and the error rate is calculated. The process is repeated k times, with each of the k subsamples used as the validation set exactly once. The k error rates are typically averaged to give a single estimate.
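The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not Aster Analytics code: the names `k_fold_indices`, `cross_validate`, `train_majority`, and `predict_majority` are hypothetical, the toy model simply predicts the most common label, and folds may differ in size by one row when the row count is not divisible by k.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Randomly partition the row indices 0..n-1 into k near-equal folds."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Taking every k-th shuffled index gives folds whose sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, k, train, predict, seed=0):
    """Return the k validation error rates, one per fold."""
    folds = k_fold_indices(len(xs), k, seed)
    error_rates = []
    for i, validation in enumerate(folds):
        # Train on every fold except the one held out for validation.
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        model = train([xs[j] for j in training], [ys[j] for j in training])
        # Error rate = fraction of validation rows that are misclassified.
        wrong = sum(predict(model, xs[j]) != ys[j] for j in validation)
        error_rates.append(wrong / len(validation))
    return error_rates


def train_majority(xs, ys):
    """Toy model (assumption for illustration): remember the most common label."""
    return max(sorted(set(ys)), key=ys.count)


def predict_majority(model, x):
    """Toy prediction: always return the remembered label."""
    return model


if __name__ == "__main__":
    xs = list(range(10))
    ys = [0] * 7 + [1] * 3
    rates = cross_validate(xs, ys, k=5, train=train_majority, predict=predict_majority)
    print("per-fold error rates:", rates)
    print("mean error rate:", sum(rates) / len(rates))
```

Any pair of functions with the same `train(xs, ys)` and `predict(model, x)` shape can be passed in place of the toy model. Fixing `seed` makes the random partition reproducible between runs.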

K-fold cross-validation