CCM Syntax Elements - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 9.02, 9.01, 2.0, 1.3
Published: February 2022
Language: English (United States)
Last Update: 2022-02-10

dita:id

B700-4003

Tip: Read the description of the EmbeddingDimensions syntax element first.

SequenceIDColumn

Specify the name of the InputTable column that contains the time sequence identifiers. A time sequence is a sample of the time series.

TimeColumn

Specify the name of the InputTable column that contains the time stamps.

CauseColumns

Specify the names of the InputTable columns that contain values to evaluate as potential causes.

To have the function select the best embedding dimension, CauseColumns and EffectColumns must specify the same single column.

EffectColumns

Specify the names of the InputTable columns that contain values to evaluate as potential effects.

To have the function select the best embedding dimension, CauseColumns and EffectColumns must specify the same single column.

LibrarySize

[Optional] If the function selects the best embedding dimension, you must omit this syntax element.

Specify the sizes of the libraries that the function uses. Each library contains randomly selected points along the potential effect time series. The function uses the libraries to predict values of the cause time series. If the correlation between the predicted values of the cause time series and the actual values increases as the size of the library increases, a causal relationship is said to exist.
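The convergence test described above can be sketched in Python. This is an illustration only, not the function's implementation: the helper names pearson and skill_increases are hypothetical, and the sketch assumes that prediction skill is measured with the Pearson correlation and that "increases" means the correlation at the largest library size exceeds the correlation at the smallest.

```python
import math


def pearson(predicted, actual):
    """Pearson correlation between predicted and actual values.

    Assumes both sequences have the same length and nonzero variance.
    """
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    spread_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (spread_p * spread_a)


def skill_increases(correlation_by_size):
    """Return True if correlation is higher at the largest library size
    than at the smallest (a simplified stand-in for a convergence test).

    correlation_by_size maps library_size -> correlation.
    """
    sizes = sorted(correlation_by_size)
    return correlation_by_size[sizes[-1]] > correlation_by_size[sizes[0]]
```

In practice you would compare the correlations across all requested library sizes, and account for their uncertainty, rather than rely on two points.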

Each library_size must be a positive INTEGER.

If the function does not select the best embedding dimension, and you omit this syntax element, the function uses two library sizes: 100 and dimension * time_step + 1 (dimension and time_step are specified by the syntax elements EmbeddingDimensions and TimeStep, respectively).

If you specify a single library_size, the function uses two library sizes: library_size and dimension * time_step + 1.
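The defaulting rules above can be summarized in a short Python sketch. It is an illustration only, not Teradata code: the name effective_library_sizes and its parameters are hypothetical, and the sketch assumes that two or more specified sizes are used exactly as given, which the text does not state explicitly. It covers only the case where the function does not select the best embedding dimension.

```python
def effective_library_sizes(library_sizes=None, dimension=2, time_step=1):
    """Return the library sizes implied by the documented defaults."""
    minimum = dimension * time_step + 1
    if library_sizes is None:
        # LibrarySize omitted: 100 and dimension * time_step + 1.
        sizes = [100, minimum]
    elif len(library_sizes) == 1:
        # Single library_size: that size and dimension * time_step + 1.
        sizes = [library_sizes[0], minimum]
    else:
        # Assumption: several sizes are used as given.
        sizes = list(library_sizes)
    for size in sizes:
        if not isinstance(size, int) or size <= 0:
            raise ValueError("each library_size must be a positive integer")
    return sizes
```

For example, with the default EmbeddingDimensions (2) and TimeStep (1), omitting LibrarySize gives the sizes 100 and 3.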

EmbeddingDimensions

[Optional] Specify the number of past values to use when predicting a given value of the time series. If you specify only one dimension, the function uses it. If you specify multiple dimensions, the function selects the best one.

Each dimension must be a positive INTEGER. Default: 2

PredictStep

[Optional] If the function selects the best embedding dimension, specify how many time steps into the future it predicts from past observations.

The predict_step must be a positive INTEGER. Default: 1

TimeStep

[Optional] Specify the number of time steps between past values to use when predicting a given value of the time series.

The time_step must be a positive INTEGER. Default: 1
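To see how the dimension and time_step values interact, the following Python sketch builds lagged vectors from a series. It is a minimal illustration that assumes the standard time-delay embedding convention (the current value followed by dimension - 1 earlier values, spaced time_step apart); the source does not state how the function builds its vectors internally, and the name delay_embed is hypothetical.

```python
def delay_embed(series, dimension=2, time_step=1):
    """Build time-delay vectors, newest value first.

    Each vector is [x[t], x[t - time_step], ..., x[t - (dimension - 1) * time_step]].
    """
    if dimension <= 0 or time_step <= 0:
        raise ValueError("dimension and time_step must be positive integers")
    span = (dimension - 1) * time_step  # how far back the oldest value lies
    return [
        [series[t - k * time_step] for k in range(dimension)]
        for t in range(span, len(series))
    ]
```

Under this convention, a series of length n yields n - (dimension - 1) * time_step vectors, so larger dimensions or time steps leave fewer points available for a library.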

BootstrapIterations

[Optional] Specify the number of bootstrap iterations to use when predicting a given value of the time series. The function uses the bootstrap process to estimate the uncertainty associated with the predicted values. If the function selects the best embedding dimension, it uses 1 iteration and ignores this syntax element, even if you specify it.

The iterations must be a positive INTEGER. Default: 100
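The general idea of a bootstrap estimate can be sketched in Python as follows. This is a generic illustration, not the function's algorithm: the name bootstrap_estimate, the statistic argument, and the seed parameter are hypothetical, and the sketch assumes simple resampling with replacement.

```python
import random
import statistics


def bootstrap_estimate(values, statistic, iterations=100, seed=None):
    """Resample values with replacement and summarize a statistic.

    Returns (mean, spread) of the statistic over all iterations, where
    spread is the population standard deviation of the estimates.
    """
    if iterations <= 0:
        raise ValueError("iterations must be a positive integer")
    rng = random.Random(seed)
    estimates = []
    for _ in range(iterations):
        # One bootstrap sample: same size as the input, drawn with replacement.
        sample = [rng.choice(values) for _ in values]
        estimates.append(statistic(sample))
    return statistics.mean(estimates), statistics.pstdev(estimates)
```

More iterations give a steadier estimate of the spread at the cost of more computation. With a single iteration the spread is zero by construction, so one iteration yields no uncertainty estimate.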

PointSelectRule

[Optional] Specify the rule that the function uses to select the nearest points when it selects the best embedding dimension:

Option	Description
'DistanceOnly' (Default)	Function determines nearest points based only on computed distance.
'DistanceAndTime'	Function determines nearest points based on both computed distance and time, which matches procedure described in documentation for R package multispatialCCM (version 1.0).

ExecutionMode

[Optional] Specify the execution mode:

Option	Description
'single' (Default)	Function runs on single node and outputs results similar to those produced by R package multispatialCCM (version 1.0). Use 'single' unless InputTable is large or 'single' is slow.
'distribute'	Function distributes data and runs across multiple vworkers. To parallelize run, function uses an algorithm different from that of R, and its results can be different from those of R.

SelfPredict

[Optional] Specify whether the function tries to predict each attribute using the attribute itself. If an attribute can predict its own time series well, the signal-to-noise ratio is too low for the CCM algorithm to work effectively.

If the function selects the best embedding dimension, this syntax element must specify 'true'.

Default: 'false'
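Several syntax elements carry extra requirements when the function selects the best embedding dimension. The Python sketch below gathers them into one check. It is an illustration only: the name selection_mode_problems and its parameters are hypothetical, and it assumes that selection happens exactly when EmbeddingDimensions lists more than one dimension, as the EmbeddingDimensions description suggests.

```python
def selection_mode_problems(cause_columns, effect_columns,
                            embedding_dimensions=(2,),
                            library_sizes=None, self_predict=False):
    """Return a list of rule violations for best-dimension selection.

    An empty list means the arguments satisfy the documented rules,
    or that the function is not selecting a dimension at all.
    """
    problems = []
    # Assumption: more than one dimension means the function selects the best one.
    if len(embedding_dimensions) <= 1:
        return problems
    if len(cause_columns) != 1 or list(cause_columns) != list(effect_columns):
        problems.append(
            "CauseColumns and EffectColumns must specify the same single column")
    if library_sizes is not None:
        problems.append("LibrarySize must be omitted")
    if not self_predict:
        problems.append("SelfPredict must be 'true'")
    return problems
```

BootstrapIterations is not checked because the function ignores it in this case.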

Seed

[Optional] Specify the random seed the algorithm uses for repeatable results. The seed must be a LONG value.

For repeatable results, use both the Seed and UniqueID syntax elements. For more information, see Nondeterministic Results and UniqueID Syntax Element.
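The role of a seed can be illustrated with a small Python sketch of a random library draw. It is not the function's sampling procedure: the name draw_library is hypothetical, the sketch assumes sampling without replacement, and it does not model the UniqueID syntax element.

```python
import random


def draw_library(points, library_size, seed=None):
    """Draw library_size points at random, without replacement.

    The same seed and the same input always give the same library.
    """
    rng = random.Random(seed)
    return rng.sample(points, library_size)
```

Calling draw_library twice with the same seed returns identical libraries, while omitting the seed generally gives a different library on each call.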