CCM Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17
dita:id: B700-1022
Product Category: Software

Tip: Read the description of the EmbeddingDimensions argument first.
InputTable
Specifies the name of the table that contains the input data.
SequenceIdColumn
Specifies the name of the input table column that contains the time sequence identifiers. A time sequence is a sample of the time series.
TimeColumn
Specifies the name of the input table column that contains the time stamps.
CauseColumns
Specifies the names of the input table columns that contain values to be evaluated as potential causes.

To have the function select the best embedding dimension, CauseColumns and EffectColumns must specify the same single column.

EffectColumns
Specifies the names of the input table columns that contain values to be evaluated as potential effects.

To have the function select the best embedding dimension, CauseColumns and EffectColumns must specify the same single column.

LibrarySize
[Optional] If the function selects the best embedding dimension, you must omit this argument.

Specifies the sizes of the libraries that the function uses. Each library contains randomly selected points along the potential effect time series. The function uses the libraries to predict values of the cause time series. If the correlation between the predicted values of the cause time series and the actual values increases as the size of the library increases, a causal relationship is said to exist.

Each library_size must be a positive INTEGER.

If the function does not select the best embedding dimension, and you omit this argument, the function uses two library sizes: 100 and dimension * time_step + 1 (dimension and time_step are specified by the arguments EmbeddingDimensions and TimeStep, respectively).

If you specify a single library_size, the function uses two library sizes: library_size and dimension * time_step + 1.
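
The defaulting rules above can be sketched in Python as follows. This is an illustration only, not the function's implementation: the helper name effective_library_sizes is hypothetical, and passing two or more explicit sizes through unchanged is an assumption, because the text describes only the cases of zero or one library_size.

```python
def effective_library_sizes(library_sizes, dimension, time_step):
    """Return the library sizes that would be used under the rules above.

    Hypothetical helper for illustration; not part of Aster Analytics.
    """
    # Second library size described in the text: dimension * time_step + 1.
    derived = dimension * time_step + 1
    sizes = list(library_sizes or [])
    if any(not isinstance(s, int) or s <= 0 for s in sizes):
        raise ValueError("each library_size must be a positive integer")
    if not sizes:
        # LibrarySize omitted: 100 and the derived size.
        return [100, derived]
    if len(sizes) == 1:
        # Single library_size: that size and the derived size.
        return [sizes[0], derived]
    # Assumption: two or more explicit sizes are used as given.
    return sizes
```

For example, with the default EmbeddingDimensions (2) and TimeStep (1) and no LibrarySize, the sketch yields the sizes 100 and 3.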

EmbeddingDimensions
[Optional] Specifies the number of past values to use when predicting a given value of the time series.

Each dimension must be a positive INTEGER. Default: 2.

If you specify only one dimension, the function uses it. If you specify multiple dimensions, the function selects the best one.

PredictStep
[Optional] Specifies how many time steps into the future the function predicts from past observations. The function uses this argument only if it selects the best embedding dimension.

The predict_step must be a positive INTEGER. Default: 1.

TimeStep
[Optional] Specifies the number of time steps between past values to use when predicting a given value of the time series.

The time_step must be a positive INTEGER. Default: 1.
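
How EmbeddingDimensions and TimeStep interact can be pictured with a time-delay embedding, the construction commonly used in CCM. The sketch below is a minimal Python illustration under assumptions: the helper name delay_embed is hypothetical, and counting the current value as the first of the dimension values is an assumption, not something this guide states about the function's internals.

```python
def delay_embed(series, dimension=2, time_step=1):
    """Build lagged vectors (x[t], x[t - time_step], ...) of length `dimension`.

    Hypothetical helper for illustration; not part of Aster Analytics.
    """
    if dimension < 1 or time_step < 1:
        raise ValueError("dimension and time_step must be positive integers")
    # Earliest index that has enough history for a full vector.
    first = (dimension - 1) * time_step
    return [
        tuple(series[t - k * time_step] for k in range(dimension))
        for t in range(first, len(series))
    ]


# With dimension 2 and time_step 2, each vector pairs a value with the
# value observed two time steps earlier.
vectors = delay_embed([1, 2, 3, 4, 5], dimension=2, time_step=2)
```

A larger time_step spreads the lagged values further apart in time, and a larger dimension uses more of them per prediction.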

BootstrapIterations
[Optional] Specifies the number of bootstrap iterations to use when predicting a given value of the time series. The function uses the bootstrap process to estimate the uncertainty associated with the predicted values.

The iterations must be a positive INTEGER. Default: 100.

If the function selects the best embedding dimension, it sets iterations to 1 and ignores any value that you specify for this argument.
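
The general idea of using bootstrap iterations to estimate uncertainty can be sketched as below. This is a generic resampling example in Python, not the procedure that CCM runs internally; the helper name bootstrap_estimate and its parameters are hypothetical. The seed parameter plays the same role as the Seed argument: it makes the random draws repeatable.

```python
import random
import statistics


def bootstrap_estimate(values, statistic, iterations=100, seed=None):
    """Return (mean, standard deviation) of `statistic` over bootstrap resamples.

    Hypothetical helper for illustration; not part of Aster Analytics.
    """
    rng = random.Random(seed)  # a fixed seed gives repeatable resamples
    n = len(values)
    estimates = []
    for _ in range(iterations):
        # Draw n values with replacement and recompute the statistic.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    # A single iteration gives one estimate and no measure of spread.
    spread = statistics.stdev(estimates) if iterations > 1 else 0.0
    return statistics.mean(estimates), spread


estimate, spread = bootstrap_estimate([1.0, 2.0, 3.0, 4.0], statistics.mean,
                                      iterations=100, seed=7)
```

With iterations set to 1, as happens when the function selects the best embedding dimension, a sketch like this returns a single estimate with no spread.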

PointSelectRule
[Optional] Specifies the rule for selecting the nearest points if the function is to select the best embedding dimension:
  • 'DistanceOnly' (Default)

    The function determines the nearest points based only on computed distance.

  • 'DistanceAndTime'

    The function determines the nearest points based on both computed distance and time, which matches the procedure described in the documentation for the R package multispatialCCM (version 1.0).

Mode
[Optional] Specifies the execution mode:
  • 'single' (Default)

    The function runs on a single node and outputs results similar to those produced by the R package multispatialCCM (version 1.0).

    Use 'single' unless the input table is large or 'single' is slow.

  • 'distribute'

    The function distributes the data and runs across multiple vworkers. To parallelize the run, the function uses an algorithm different from that of the R package, so its results can differ from those of the R package.

SelfPredict
[Optional] Specifies whether the function tries to predict each attribute using the attribute itself. If an attribute can predict its own time series well, the signal-to-noise ratio is too low for the CCM algorithm to work effectively. Default: 'false'.

If the function selects the best embedding dimension, this argument must specify 'true'.
Seed
[Optional] Specifies the random seed used to initialize the algorithm. The seed must be a LONG value.
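
Several arguments are constrained when the function selects the best embedding dimension, which happens when EmbeddingDimensions lists more than one dimension. The Python sketch below gathers those constraints in one place as a pre-flight check. It is an illustration only: the helper name check_ccm_arguments and its parameter names are hypothetical, and it does not call or reproduce the CCM function itself.

```python
def check_ccm_arguments(cause_columns, effect_columns,
                        embedding_dimensions=(2,),
                        library_sizes=None, self_predict=False):
    """Return a list of problems found in a planned set of CCM arguments.

    Hypothetical helper for illustration; not part of Aster Analytics.
    """
    problems = []
    if any(d <= 0 for d in embedding_dimensions):
        problems.append("each dimension must be a positive integer")
    # Multiple dimensions mean the function selects the best one.
    selecting_best = len(embedding_dimensions) > 1
    if selecting_best:
        same_single_column = (len(cause_columns) == 1
                              and list(cause_columns) == list(effect_columns))
        if not same_single_column:
            problems.append("CauseColumns and EffectColumns must specify "
                            "the same single column")
        if library_sizes:
            problems.append("LibrarySize must be omitted")
        if not self_predict:
            problems.append("SelfPredict must specify 'true'")
    return problems


# Three dimensions with a valid selection setup: no problems reported.
ok = check_ccm_arguments(["temp"], ["temp"], (2, 3, 4), None, True)
```

The column name temp is a made-up example. BootstrapIterations is not checked because, in this mode, the function ignores it rather than rejecting it.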