Cross-validation, sometimes called rotation estimation, is a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is usually given a dataset of known data on which training is run (the training dataset) and a dataset of previously unseen data against which the model is tested (the testing dataset). The goal of cross-validation is to define a dataset to "test" the model during the training phase (the validation dataset), providing insight into how the model will generalize to an independent dataset. Cross-validation helps detect overfitting, i.e., a model that fits its training data well but fails to generalize.
The most common variant, k-fold cross-validation, works as follows: the data are randomly partitioned into k equal-sized subsamples, or folds. One fold is held out as a validation set, and the model is trained on the remaining k − 1 folds. The trained model is then evaluated on the validation set and its error rate recorded. The process is repeated k times, with each of the k folds used exactly once as the validation set, and the k error estimates are typically averaged to produce a single performance estimate.
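The procedure above can be sketched in plain Python. The names `train_fn` and `error_fn` are hypothetical stand-ins for a model's training and evaluation routines, not part of any particular library; this is a minimal illustration, not a production implementation.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Randomly partition indices 0..n-1 into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Slicing with a stride of k deals out the shuffled indices into k folds.
    return [idx[i::k] for i in range(k)]

def cross_validate(data, labels, train_fn, error_fn, k=5):
    """Train on k-1 folds, evaluate on the held-out fold; return per-fold errors."""
    folds = k_fold_indices(len(data), k)
    errors = []
    for fold in folds:
        held_out = set(fold)
        # Training set: everything not in the held-out fold.
        train_X = [data[j] for j in range(len(data)) if j not in held_out]
        train_y = [labels[j] for j in range(len(data)) if j not in held_out]
        model = train_fn(train_X, train_y)
        # Validation set: the held-out fold only.
        test_X = [data[j] for j in fold]
        test_y = [labels[j] for j in fold]
        errors.append(error_fn(model, test_X, test_y))
    return errors

# Toy example: a "model" that always predicts the majority training label.
def train_majority(X, y):
    return max(set(y), key=y.count)

def misclassification_rate(model, X, y):
    return sum(1 for label in y if label != model) / len(y)

data = list(range(20))
labels = [0] * 12 + [1] * 8
errors = cross_validate(data, labels, train_majority, misclassification_rate, k=5)
avg_error = sum(errors) / len(errors)  # the averaged cross-validation estimate
```

Each index appears in exactly one fold, so every observation is used for validation exactly once; the final averaged error is the cross-validation estimate of generalization performance.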