TD_ARIMAESTIMATE Syntax Elements - Teradata Vantage

Database Unbounded Array Framework Time Series Functions

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Teradata Vantage
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2023-12-08
SERIES_SPEC

An input source containing univariate series instances. How the input is processed depends on whether the second input is present. If the second input is absent, TD_ARIMAESTIMATE estimates the model coefficients for the passed-in series instances. If the second input is present, TD_ARIMAESTIMATE applies a previous model, whose binary model string is referenced by the second input, to the passed-in series instances. This creates the ART context necessary for a subsequent call to TD_ARIMAFORECAST.

Entering a single series with INPUT_FMT(MODE()) causes a SQL syntax error.

See Series Specifications.

SERIES_SPEC or ART_SPEC
[Optional second input] This secondary input is only used when you want to apply an ARIMA model generated by a previous TD_ARIMAESTIMATE call or a TD_ARIMAVALIDATE call. In either case, the SERIES_SPEC or ART_SPEC refers to the binary string generated during the TD_ARIMAESTIMATE or TD_ARIMAVALIDATE call.

The SERIES_SPEC references an input source containing the ARTMODEL binary string previously generated with a call to either TD_ARIMAESTIMATE with FIT_PERCENTAGE(100) or TD_ARIMAVALIDATE.

The ART_SPEC references an ART input source produced by TD_ARIMAESTIMATE (with FIT_PERCENTAGE(100)) or TD_ARIMAVALIDATE, and includes the ARTMODEL layer. The ART_SPEC is required to have the parameters TABLE_NAME and LAYER(ARTMODEL). No other ART_SPEC parameters should be used.
ART_SPEC( TABLE_NAME( [database-name .] table-name ), LAYER(layer-name) )
When INPUT_FMT(MODE()) is passed in, the following behaviors occur:
  • MATCH mode: If no matching identifiers are found, then an empty table (no rows) is returned.
  • MANY2ONE or ONE2ONE: If the primary input is an empty table, then the function returns an empty table. If the secondary input is an empty table, then the function returns the SQL error “Empty secondary input in ONE2ONE/MANY2ONE input mode”.

    If either of the input tables does not exist (was never created), then a SQL error is returned.

FUNC_PARAMS
Name Data Type Description
NONSEASONAL (MODEL_ORDER) Integer list The non-seasonal values for the model.
  • p-value: The order of the non-seasonal auto-regression (AR) component.
  • d-value: The order of the non-seasonal differences between consecutive components.
  • q-value: The order of the non-seasonal moving average (MA) component.

See order limitation in ALGORITHM.

SEASONAL (MODEL_ORDER) Integer list [Dependencies: Optional for seasonal modeling. This can be used with all algorithms except ALGORITHM(OLE). Using ALGORITHM(OLE) causes a SQL error.] The seasonal values for the model.
  • P-value: The order of the seasonal auto-regression (SAR) component.
  • D-value: The order of the seasonal differences between consecutive components.
  • Q-value: The order of the seasonal moving average (SMA) component.

Default value is {0,0,0}, indicating no seasonal modeling.

See order limitation in ALGORITHM.

SEASONAL (PERIOD) Integer [Dependencies: Required when using seasonal modeling.] The number of periods per season. Default value is 1, indicating the series is one period.

See order limitation in ALGORITHM.
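As a rough illustration of what the d-value, D-value, and PERIOD control, the following Python sketch applies the textbook definition of regular and seasonal differencing. It is not taken from the function's implementation; the helper names and the sample data are hypothetical, and the order in which seasonal and regular differences are applied is an assumption (for plain differencing the result is the same either way).

```python
def difference(series, lag=1):
    """Return the lag-differenced series: x[t] - x[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def apply_differencing(series, d=0, D=0, period=1):
    """Apply D seasonal differences (lag = period), then d regular differences (lag = 1)."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=period)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Hypothetical quarterly series with a repeating pattern and an upward drift.
quarterly = [10, 12, 14, 11, 13, 15, 17, 14, 16, 18, 20, 17]

regular = apply_differencing(quarterly, d=1)               # differences of consecutive values
seasonal = apply_differencing(quarterly, D=1, period=4)    # differences one season apart
```

With this sample, one seasonal difference at period 4 yields a constant series, which is the kind of pattern a seasonal D-value is meant to remove.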

LAGS (AR) Integer list [Optional] Used to support non-contiguous models. Position-sensitive list that specifies the lags to be associated with the non-seasonal auto-regressive (AR) regression terms. Default is LAGS (AR(1, 2, ..., p)). The p-length-lag-list is the lag values for the non-seasonal auto-regression component.
LAGS (SAR) Integer list [Dependencies: Only used optionally with seasonal modeling.] Used to support non-contiguous models. Position-sensitive list that specifies the lags associated with the seasonal auto-regressive (SAR) terms. Default is LAGS (SAR(1, 2, ..., P)). The P-length-lag-list is the lag values for the seasonal auto-regression component.
LAGS (MA) Integer list [Optional] Used to support non-contiguous models. Position-sensitive list that specifies the lags associated with the non-seasonal moving average (MA) terms. Default is LAGS (MA(1, 2, ..., q)). The q-length-lag-list is the values for the moving average component.
LAGS (SMA) Integer list [Dependencies: Only used optionally with seasonal modeling.] Used to support non-contiguous models. Position-sensitive list that specifies the lags associated with the seasonal moving average (SMA) terms. Default is LAGS (SMA(1, 2, ..., Q)). The Q-length-lag-list is the values for the seasonal moving average component.
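The default lag lists described above can be sketched in a few lines of Python. This is only an illustration of the documented defaults, not the function's implementation; the helper name `resolve_lags` is hypothetical, and the length check assumes that an explicit list must contain exactly as many entries as the component's order (as the "p-length" wording suggests).

```python
def resolve_lags(order, lags=None):
    """Return the lag list for one component (AR, MA, SAR, or SMA).

    order: the component's order (p, q, P, or Q).
    lags:  optional explicit, position-sensitive lag list.
    """
    if lags is None:
        # Documented default: contiguous lags 1, 2, ..., order.
        return list(range(1, order + 1))
    if len(lags) != order:
        # Assumption: an explicit list must have exactly `order` entries.
        raise ValueError(f"expected {order} lags, got {len(lags)}")
    return list(lags)


default_ar = resolve_lags(3)             # contiguous default for p = 3
sparse_ar = resolve_lags(2, [1, 12])     # non-contiguous model: lags 1 and 12
```

Here `sparse_ar` corresponds to a non-contiguous AR component of order 2 that uses lags 1 and 12 instead of the default lags 1 and 2.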
INIT FLOAT list [Dependencies: Only used optionally with all algorithms except ALGORITHM(OLE). Using ALGORITHM(OLE) causes a SQL error.] Position-sensitive list that specifies the initial values to be associated with the non-seasonal AR regression coefficients, followed by the non-seasonal MA coefficients, the seasonal SAR regression coefficients and the SMA coefficients. The formula is as follows:
p+q+P+Q+CONSTANT-length-init-list

Default is an appropriately sized list of zeros: {0,0,…,0}.

FIXED FLOAT list [Dependencies: Only used optionally with all algorithms except ALGORITHM(OLE). Using ALGORITHM(OLE) causes a SQL error.] Position-sensitive list that specifies the fixed values to be associated with the non-seasonal AR regression coefficients, followed by the non-seasonal MA coefficients, the SAR coefficients and the SMA coefficients.
If an intercept is needed, one more value is added at the end to specify the intercept coefficient initial value. The formula is as follows:
p+q+P+Q+CONSTANT-length-fixed-list

Default is an appropriately sized list of -1000 values, such as {-1000,-1000,…,-1000}. A value of -1000 indicates that the coefficient is not fixed and should be computed.
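The sizing rule for the INIT and FIXED lists (p+q+P+Q+CONSTANT) and their documented defaults can be sketched as follows. The helper names are hypothetical and the code only mirrors the description above; it assumes the intercept, when requested, occupies the last position.

```python
NOT_FIXED = -1000.0  # documented marker: coefficient is not fixed and should be computed


def coefficient_list_length(p, q, P=0, Q=0, constant=0):
    """Length of the INIT and FIXED lists: p + q + P + Q + CONSTANT."""
    return p + q + P + Q + constant


def default_init(p, q, P=0, Q=0, constant=0):
    """Documented INIT default: a list of zeros of the required length."""
    return [0.0] * coefficient_list_length(p, q, P, Q, constant)


def default_fixed(p, q, P=0, Q=0, constant=0):
    """Documented FIXED default: every position marked as not fixed."""
    return [NOT_FIXED] * coefficient_list_length(p, q, P, Q, constant)


# Hypothetical model: p = 2, q = 1, no seasonal terms, with an intercept.
fixed = default_fixed(2, 1, constant=1)
fixed[1] = 0.5  # pin the second AR coefficient; positions follow the AR, MA, SAR, SMA order
```

In this sketch `fixed` has four entries, and only the second AR coefficient is held at 0.5 while the remaining coefficients are left to be estimated.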

CONSTANT Integer Indicator for the TD_ARIMAESTIMATE function to calculate an intercept. A value of 1 indicates intercept should be calculated. A value of 0 indicates no intercept should be calculated.
ALGORITHM Enum, String The method to estimate the coefficients.
  • ALGORITHM (OLE): Use the sum of ordinary least squares approach.
  • ALGORITHM (MLE): Use maximum likelihood approach.
  • ALGORITHM (CSS_MLE): Use the conditional sum-of-squares to determine a start value and then do maximum likelihood.
  • ALGORITHM (CSS): Use the conditional sum-of-squares approach.
Note the following:
  • MLE and CSS_MLE may cause error 9153, “UAF error: non-finite finite-difference value”. This can happen when the model is not stationary. To avoid the error, use the CSS algorithm or a different MODEL_ORDER chosen based on PACF and ACF results.
  • MLE, CSS_MLE, and CSS may cause warning 9155, “UAF Function Warning: BFGS approximation may not converge”. Do not trust parameter estimates that did not converge.
  • MLE and CSS_MLE may cause error 9153, “UAF error: maximum lag for MLE/MLE_CSS is 100”. This occurs when the value of r in the following equation is greater than 100:
    r = max (P * s + p, Q * s + q + 1)
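The lag limit above can be checked before submitting a model. The sketch below simply evaluates the documented equation; the helper names are hypothetical, and it assumes that s is the SEASONAL (PERIOD) value, which this page does not state explicitly.

```python
MAX_MLE_LAG = 100  # documented limit for MLE and CSS_MLE


def max_lag(p, q, P=0, Q=0, s=1):
    """Evaluate r = max(P * s + p, Q * s + q + 1)."""
    return max(P * s + p, Q * s + q + 1)


def within_mle_lag_limit(p, q, P=0, Q=0, s=1):
    """Return True when r does not exceed the documented limit of 100."""
    return max_lag(p, q, P, Q, s) <= MAX_MLE_LAG


monthly_r = max_lag(p=2, q=1, P=1, Q=1, s=12)   # max(14, 14)
weekly_r = max_lag(p=1, q=0, P=2, Q=0, s=52)    # max(105, 1)
```

Under these assumptions the monthly model stays well under the limit, while the weekly model with P = 2 and a period of 52 would exceed it and is a candidate for ALGORITHM (CSS) or a smaller model order.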
MAX_ITERATIONS Integer [Dependencies: Only used optionally with ALGORITHM (MLE) processing.] The maximum number of iterations that can be used to estimate the ARIMA parameters. When not present, the default is 100 iterations.
COEFF_STATS Integer [Optional] Indicator to return coefficient statistical columns TSTAT_VALUE and TSTAT_PROB. A value of 1 means return the columns. A value of 0 means do not return the columns. Default is 0.
FIT_PERCENTAGE Integer [Optional] Percentage of passed-in sample points that are used for the model fitting and parameter estimation. The default value is 100, meaning 100%.
FIT_METRICS Integer [Optional] Indicator to generate the secondary result set that contains the model metadata statistics. A value of 1 means generate the secondary result set. A value of 0 means do not generate the secondary result set. The default value is 0.

The generated result set is retrieved by issuing the TD_EXTRACT_RESULTS function that references the ARTFITMETADATA layer on the analytical result table containing the results.

RESIDUALS Integer [Optional] Indicator to generate the tertiary result set that contains the model residuals. A value of 1 means generate the tertiary result set. A value of 0 means do not generate the tertiary result set. The default value is 0.

The generated result set is retrieved by issuing a TD_EXTRACT_RESULTS function, which references the ARTFITRESIDUALS layer on the analytical result table containing the results.

The following parameters are ignored when two input sources are passed to the function:
  • ALGORITHM
  • CONSTANT
  • FIXED
  • INIT
  • LAGS
  • NONSEASONAL
  • SEASONAL

However, the ALGORITHM, CONSTANT and NONSEASONAL parameters must be included in the TD_ARIMAESTIMATE statement, as they are mandatory.
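The interaction between ignored and mandatory parameters can be summarized with a small check. This is a sketch built only from the two lists above; the function name `review_func_params` is hypothetical and it inspects parameter names only, not their values.

```python
# Parameters the text lists as ignored when two input sources are passed.
IGNORED_WITH_TWO_INPUTS = {
    "ALGORITHM", "CONSTANT", "FIXED", "INIT", "LAGS", "NONSEASONAL", "SEASONAL",
}
# Parameters the text describes as mandatory in every TD_ARIMAESTIMATE statement.
MANDATORY = {"ALGORITHM", "CONSTANT", "NONSEASONAL"}


def review_func_params(param_names, two_inputs):
    """Return (missing mandatory names, names that will be ignored), both sorted."""
    names = set(param_names)
    missing = sorted(MANDATORY - names)
    ignored = sorted(names & IGNORED_WITH_TWO_INPUTS) if two_inputs else []
    return missing, ignored


missing, ignored = review_func_params(
    ["NONSEASONAL", "CONSTANT", "ALGORITHM", "FIT_METRICS"], two_inputs=True
)
```

In this example nothing is missing, and the three mandatory parameters are reported as ignored: they must appear in the statement even though their values have no effect when a previous model is applied.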

INPUT_FMT
  • When only the primary input is present, no INPUT_FMT options are available.
  • When both the primary input and the secondary input are present, the following INPUT_FMT options are available:
    • ONE2ONE: Both the primary and secondary series specifications include a WHERE filter clause identifying one series instance to serve as the input series for the mathematical operation.
    • MANY2ONE: The primary input SERIES_SPEC references MANY series instances. The secondary input SERIES_SPEC includes a WHERE filter clause identifying a single series instance. When applying the mathematical operation, the secondary input single series is reused as many times as necessary to match the number of series instances found in the primary input.
    • MATCH: A series instance in the primary input whose SERIES_ID matches the SERIES_ID of a series instance in the secondary input has the mathematical operation applied to produce a result series. Instances in one input that have no corresponding SERIES_ID partner in the other input are skipped.
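The pairing rules for the three modes, together with the empty-input behaviors described earlier for INPUT_FMT(MODE()), can be sketched in Python. This is a conceptual model only: inputs are represented as hypothetical dictionaries keyed by SERIES_ID (after any WHERE filter has been applied), and the checks on the number of instances are assumptions derived from the mode descriptions.

```python
def pair_series(primary, secondary, mode):
    """Return (series_id, primary_series, secondary_model) tuples for the given mode."""
    if mode == "MATCH":
        # Unmatched identifiers are skipped; no matches gives an empty result.
        return [(sid, primary[sid], secondary[sid]) for sid in primary if sid in secondary]
    if mode in ("ONE2ONE", "MANY2ONE"):
        if not primary:
            return []  # documented: an empty primary input returns an empty table
        if not secondary:
            raise ValueError("Empty secondary input in ONE2ONE/MANY2ONE input mode")
        if len(secondary) != 1 or (mode == "ONE2ONE" and len(primary) != 1):
            # Assumption: the WHERE filters must narrow these inputs to one instance.
            raise ValueError(f"{mode} expects a single series instance here")
        model = next(iter(secondary.values()))
        # The single secondary instance is reused for every primary instance.
        return [(sid, series, model) for sid, series in primary.items()]
    raise ValueError(f"unknown mode: {mode}")


matched = pair_series({1: "a", 2: "b"}, {2: "m2", 3: "m3"}, "MATCH")
reused = pair_series({1: "a", 2: "b"}, {9: "m"}, "MANY2ONE")
```

In the MATCH example only SERIES_ID 2 appears in both inputs, so a single pair is produced; in the MANY2ONE example the one secondary model is paired with both primary series.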
OUTPUT_FMT
[Optional] Specify the INDEX_STYLE of the output format. Options are NUMERICAL_SEQUENCE and FLOW_THROUGH. The default is NUMERICAL_SEQUENCE.
The OUTPUT_FMT options only apply to the ARTFITRESIDUALS layer and ARTVALDATA layer.