TD_ARIMAESTIMATE Output - Teradata Vantage

Database Unbounded Array Framework Time Series Functions

Deployment: VantageCloud, VantageCore

Edition: Enterprise, IntelliFlex, VMware

Product: Teradata Vantage

Release Number: 17.20

Published: June 2022

Language: English (United States)

Last Update: 2023-12-08

The function's primary result set contains the estimated coefficients and their per-coefficient statistics. Retrieve it by issuing a SELECT against the analytic result table (ART).

The function can output a secondary (ARTFITMETADATA) result set with goodness-of-fit metadata, a tertiary (ARTFITRESIDUALS) result set with residuals from the fitting operation, a quaternary (ARTMODEL) result set with validation model context, and a quinary (ARTVALDATA) result set with the validation series. These result sets are retrieved directly using the TD_EXTRACT_RESULTS function. Additionally, the residuals can be indirectly referenced by passing the ART name into a UAF function that requires the residual series as an input.

RETURNS TABLE Schema for Primary Result Set

Name	Data Type	Description
derived-series-identifier	Varies	The list of fields that identifies the resulting series.
ROW_I	BIGINT	Zero-based index of the coefficient, assigned in the order in which the coefficients appear in the formula. The last index belongs to the constant coefficient, if a constant was requested.
COEFF_NAME	VARCHAR (128)	Name of the coefficient.
COEFF_VALUE	FLOAT	The calculated value of the coefficient determined by the regression process.
STD_ERROR	FLOAT	The standard error associated with the calculated value for that coefficient. Returned only if COEFF_STATS(1) is specified.
ZSTAT_VALUE	FLOAT	The z-statistic associated with the calculated value for that coefficient. It indicates how significant the coefficient is relative to a value of 0. Included only if the MLE, CSS_MLE, or CSS algorithm is used, and returned only if COEFF_STATS(1) is specified.
ZSTAT_PROB	FLOAT	The probability associated with the z-statistic: the probability of obtaining an absolute value of z at least as large as the one calculated for the data, if the coefficient is 0. Included only if the MLE, CSS_MLE, or CSS algorithm is used, and returned only if COEFF_STATS(1) is specified.
TSTAT_VALUE	FLOAT	The t-statistic associated with the calculated value for that coefficient. It indicates how significant the coefficient is relative to a value of 0. For coefficient A: t-statistic(A) = value of A / STD_ERROR(A); for coefficient B: t-statistic(B) = value of B / STD_ERROR(B); and so on. Included only if the OLE algorithm is used, and returned only if COEFF_STATS(1) is specified.
TSTAT_PROB	FLOAT	The probability associated with the t-statistic: the probability of obtaining an absolute value of t at least as large as the one calculated for the data, if the coefficient is 0. Included only if the OLE algorithm is used, and returned only if COEFF_STATS(1) is specified.
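
The relationship between COEFF_VALUE, STD_ERROR, and the statistic columns can be sketched outside the database. The following Python sketch is illustrative only. The function name and the sample rows are hypothetical. It assumes the z-statistic is the same value-over-standard-error ratio that the table gives for the t-statistic, and it uses the standard normal distribution for the two-sided probability. A t-statistic probability would instead require the Student t distribution with the appropriate degrees of freedom, which is not computed here.

```python
import math

def coefficient_stats(coeff_value, std_error):
    """Return (statistic, two-sided normal probability) for one coefficient.

    Assumption: the statistic is COEFF_VALUE / STD_ERROR, and the
    probability is taken from the standard normal distribution.
    """
    stat = coeff_value / std_error
    # Two-sided tail probability P(|Z| >= |stat|) for a standard normal Z.
    prob = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, prob

# Hypothetical rows: (COEFF_NAME, COEFF_VALUE, STD_ERROR)
rows = [("AR1", 0.60, 0.20), ("CONSTANT", 1.50, 1.50)]
for name, value, err in rows:
    stat, prob = coefficient_stats(value, err)
    print(f"{name}: stat={stat:.3f} prob={prob:.4f}")
```

A small probability suggests that the coefficient differs from 0. A probability near 1 suggests that it may not.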

RETURNS TABLE Schema for Secondary Result Set (ARTFITMETADATA)

Name	Data Type	Description
derived-series-identifier	Varies	The list of fields that identifies the resulting series.
ROW_I	BIGINT	Index associated with the result series in this layer.
NUM_SAMPLES	INTEGER	The number of sample points used to fit the model.
VAR_COUNT	INTEGER	Number of explanatory variables, including the constant, in the original regression.
R_SQUARE	FLOAT	The calculated R-squared value from the original and calculated values.
R_ADJ_SQUARE	FLOAT	The calculated adjusted R-squared value from the original and calculated values.
STD_ERROR	FLOAT	The standard error or deviation associated with the model.
STD_ERROR_DF	INTEGER	The degrees of freedom associated with the standard error calculation.
ME	FLOAT	The Mean Error.
MAE	FLOAT	The Mean Absolute Error.
MSE	FLOAT	The Mean Squared Error.
MPE	FLOAT	The Mean Percent Error.
MAPE	FLOAT	The Mean Absolute Percent Error.
F_STAT_CALC	FLOAT	The calculated F-statistic value for the ordinary least squares (OLS) regression.
P_VALUE	FLOAT	The p-value corresponding to the calculated test statistic.
NUM_DF	INTEGER	The numerator degrees of freedom of the F-statistic, associated with the explained portion.
DENOM_DF	INTEGER	The denominator degrees of freedom of the F-statistic, associated with the unexplained portion.
SIGNIFICANCE_LAYER	FLOAT	Level of significance for the test.
F_CRITICAL	FLOAT	The F critical value taken from the F-distribution tables.
F_CRITCAL_P	FLOAT	The p-value corresponding to the calculated critical value.
NULL_HYPOTH	VARCHAR(25)	The result of the test. ACCEPT means the null hypothesis is accepted, and there is no serial correlation evident. REJECT means the null hypothesis is rejected, and there is evidence of serial correlation.
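
The error measures in this result set (ME, MAE, MSE, MPE, MAPE) follow standard definitions, which the Python sketch below illustrates on made-up data. The function name and the sample values are hypothetical. The sketch assumes the error is the actual value minus the calculated value and that percent errors are taken relative to the actual value and scaled by 100. The sign and scaling conventions that the function itself uses are not stated here, so treat this as an approximation and not as a reproduction of the function's internals.

```python
def fit_metrics(actual, calc):
    """Compute ME, MAE, MSE, MPE, and MAPE for paired actual/calculated values.

    Assumptions: error = actual - calculated; percent errors are relative
    to the actual value and scaled by 100. Actual values must be nonzero.
    """
    n = len(actual)
    errors = [a - c for a, c in zip(actual, calc)]
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
        "MPE": 100.0 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }

# Hypothetical actual and calculated values for three sample points.
metrics = fit_metrics([10.0, 20.0, 40.0], [12.0, 18.0, 40.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```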

RETURNS TABLE Schema for Tertiary Result Set (ARTFITRESIDUALS)

Name	Data Type	Description
derived-series-identifier	Varies	The list of fields that identifies the resulting series.
ROW_I	Varies	Indexing column for the multivariate output array containing the residuals. Its data type depends on the OUTPUT_FMT (INDEX_STYLE) input. If NUMERICAL_SEQUENCE is used, the data type is BIGINT and the value increments by 1 for each row, starting from 0. If FLOW_THROUGH is used, the value may be of another data type, based on the data type of the passed-in ROW_AXIS.
ACTUAL_VALUE	FLOAT	The actual value of the response variable.
CALC_VALUE	FLOAT	The calculated value of the response variable using the model.
RESIDUAL	FLOAT	The difference between the calculated response value and the actual response value.

RETURNS TABLE Schema for Quaternary Result Set (ARTMODEL)

Name	Data Type	Description
derived-series-identifier	Varies	The list of fields that identifies the resulting series.
ROW_I	BIGINT	The model row number. Used to support model sizes larger than 32000 bytes.
MODEL_DATA	VARBYTE (32000)	Model context in binary form.
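
Because MODEL_DATA holds at most 32000 bytes per row, a larger model spans several rows. The Python sketch below shows one plausible way a client could reassemble the model context after fetching the rows. It assumes that the chunks are meant to be concatenated in ascending ROW_I order, which the schema suggests but does not state explicitly. The function name and the sample rows are hypothetical.

```python
def assemble_model(rows):
    """Concatenate MODEL_DATA chunks in ascending ROW_I order.

    rows: iterable of (ROW_I, MODEL_DATA) pairs, where MODEL_DATA is bytes.
    Assumption: the full model is the chunks joined in ROW_I order.
    """
    ordered = sorted(rows, key=lambda row: row[0])
    return b"".join(chunk for _, chunk in ordered)

# Hypothetical rows, fetched out of order.
rows = [(1, b"\x03\x04"), (0, b"\x01\x02"), (2, b"\x05")]
model = assemble_model(rows)
print(len(model), "bytes")
```

Sorting on the client avoids relying on the order in which the rows happen to be returned.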

RETURNS TABLE Schema for Quinary Result Set (ARTVALDATA)

Name	Data Type	Description
derived-series-identifier	Varies	The list of fields that identifies the resulting series.
ROW_I	Varies	Indexing column for the multivariate output array containing the validation series. Its data type depends on the OUTPUT_FMT (INDEX_STYLE) input. If NUMERICAL_SEQUENCE is used, the data type is BIGINT and the value increments by 1 for each row, starting from 0. If FLOW_THROUGH is used, the value is of another data type, based on the data type of the passed-in ROW_AXIS.
VAL_ACTUAL_VALUE	FLOAT	The actual value from the portion of the series that was reserved for validation and not used for fitting.
VAL_CALC_VALUE	FLOAT	The fitted value for the portion of the series that was reserved for validation and not used for fitting.
VAL_RESIDUAL	FLOAT	The residual for the portion of the series that was reserved for validation and not used for fitting.