1.1 - 8.10 - CCM Example: Simulated Data - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

This example uses a simulated data set to show how to use the CCM function to:
  1. Identify the optimal value for EmbeddingDimensions.
  2. Check for a causal relationship between two time series.

Input

InputTable: ccmexample
 seqid t   a           b
 ----- --- ----------- -----------
     1   1 0.439016523 0.844698604
     1   2 0.79590473  0.416313404
     1   3 0.457454911 0.80120226
     1   4 0.83460391  0.462840003
     1   5 0.453855618 0.866674285
     1   6 0.847111468 0.420195438
     1   7 0.464311363 0.840110673
     1   8 0.854059164 0.474302962
     1   9 0.440280147 0.863834294
     1  10 0.809889391 0.425037187
     2   1 0.773946283 0.63958518
     2   2 0.508680994 0.850617675
   ... ... ...         ...
     2  10 0.766623465 0.940756652
     3   1 0.813294227 0.789552227
     3   2 0.247789031 0.549992501
   ... ... ...         ...
     3  10 0.793969139 0.608697646
   ... ... ...         ...
   ... ... ...         ...
    10   1 0.503156674 0.794651776
    10   2 0.823104545 0.580508316
   ... ... ...         ...
    10  10 0.812508422 0.406036663
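
The downloadable SQL script is the authoritative way to create ccmexample. For readers who only want to see the table's shape, the following is a minimal sketch that loads the ten rows of sequence 1 shown above. The column data types and the primary index are assumptions, not taken from the script, and the remaining sequences are not reproduced because their rows are elided here.

```sql
-- Sketch only: column types and PRIMARY INDEX are assumed, not from the official script.
CREATE MULTISET TABLE ccmexample (
  seqid INTEGER,
  t     INTEGER,
  a     DOUBLE PRECISION,
  b     DOUBLE PRECISION
) PRIMARY INDEX (seqid);

-- Rows for seqid 1, copied from the InputTable listing.
INSERT INTO ccmexample VALUES (1, 1, 0.439016523, 0.844698604);
INSERT INTO ccmexample VALUES (1, 2, 0.79590473, 0.416313404);
INSERT INTO ccmexample VALUES (1, 3, 0.457454911, 0.80120226);
INSERT INTO ccmexample VALUES (1, 4, 0.83460391, 0.462840003);
INSERT INTO ccmexample VALUES (1, 5, 0.453855618, 0.866674285);
INSERT INTO ccmexample VALUES (1, 6, 0.847111468, 0.420195438);
INSERT INTO ccmexample VALUES (1, 7, 0.464311363, 0.840110673);
INSERT INTO ccmexample VALUES (1, 8, 0.854059164, 0.474302962);
INSERT INTO ccmexample VALUES (1, 9, 0.440280147, 0.863834294);
INSERT INTO ccmexample VALUES (1, 10, 0.809889391, 0.425037187);
```

A table loaded this way holds a single sequence, so CCM results from it would not match the outputs in this example, which were produced from all ten sequences.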

Step 1 SQL Call: Identify Optimal Value for EmbeddingDimensions

To identify the optimal EmbeddingDimensions value, the CauseColumns and EffectColumns syntax elements must specify the same column, the SelfPredict syntax element must have the value 'true', and the LibrarySize syntax element must be omitted.

SELECT * FROM CCM (
  ON ccmexample AS InputTable
  USING
  SequenceIDColumn ('seqid')
  TimeColumn ('t')
  CauseColumns ('b')
  EffectColumns ('b')
  EmbeddingDimensions (2,3,4,5,6,7,8,9,10)
  SelfPredict ('true')
) AS dt;

Step 1 Output

 cause effect library_size correlation       jaccard_index lower_bound       upper_bound       effect_size effect_size_sd embedding_dimension 
 ----- ------ ------------ ----------------- ------------- ----------------- ----------------- ----------- -------------- ------------------- 
 b     b                60 0.663700687953145          NULL 0.663700687953145 0.663700687953145         0.0            0.0                   2

Step 2 SQL Call: Check for Causal Relationship Between Two Time Series

The EmbeddingDimensions syntax element is set to the optimal value identified in Step 1, which is reported in the embedding_dimension column of the Step 1 output.

SELECT * FROM CCM (
  ON ccmexample AS InputTable
  USING
  SequenceIDColumn ('seqid')
  TimeColumn ('t')
  CauseColumns ('a','b')
  EffectColumns ('a','b')
  EmbeddingDimensions ('2')
) AS dt;

Step 2 Output

 cause effect library_size correlation       jaccard_index lower_bound        upper_bound       effect_size       effect_size_sd     
 ----- ------ ------------ ----------------- ------------- ------------------ ----------------- ----------------- ------------------ 
 b     a                 3 0.199821148005778          NULL  0.154320762926768 0.244475768800156 0.411670632601633 0.0258400884667059
 b     a               100 0.547088662768567          NULL    0.5336993960053 0.560203717679341              NULL               NULL
 a     b                 3 0.133354380025576          NULL 0.0884539033592824 0.177714016526658 0.157908314401436 0.0250404016758242
 a     b               100 0.284031327946218          NULL   0.26695597109749 0.300928402624589              NULL               NULL
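
In the Step 2 output, effect_size and effect_size_sd are populated only on the library_size 3 rows (about 0.41 for b as cause of a, and about 0.16 for a as cause of b). To compare the two directions side by side, one option is to wrap the same call and keep only those rows. The following is a sketch of that idea; the column selection, the filter, and the ordering are illustrative choices, not part of the original example.

```sql
-- Sketch: the Step 2 call, reduced to the rows that carry an effect size.
SELECT cause, effect, library_size, effect_size, effect_size_sd
FROM CCM (
  ON ccmexample AS InputTable
  USING
  SequenceIDColumn ('seqid')
  TimeColumn ('t')
  CauseColumns ('a','b')
  EffectColumns ('a','b')
  EmbeddingDimensions ('2')
) AS dt
WHERE effect_size IS NOT NULL   -- drops the library_size 100 rows, where effect_size is NULL
ORDER BY effect_size DESC;      -- larger effect size first
```

Applied to the output shown above, this filter would keep the two library_size 3 rows, with the b-to-a row listed first.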

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.