Validate ML Engine Analytic Functions | Teradata Vantage - Validating Machine Learning Engine Analytic Functions - Teradata Vantage

Configuring Teradata Vantage™ After Installation

Product
Teradata Vantage
Release Number
2.2
Published
January 2021
Language
English (United States)
Last Update
2021-05-08
dita:mapPath
gta1588198788318.ditamap
dita:ditavalPath
dby1605127568110.ditaval
dita:id
B700-4014
Product Category
Analytical Ecosystem
Software
Teradata Vantage

For information about Machine Learning Engine analytic functions, see Teradata Vantage™ Machine Learning Engine Analytic Function Reference, B700-4003.

  1. From a client application (such as BTEQ or Teradata Studio), log on to the Vantage system as a user with the necessary permissions to access Machine Learning Engine. For example:
    .logon systemname.your_company.com/hal
  2. If you are running the commands from the BTEQ command line, set SIDETITLES and FOLDLINE to ON, so wide output is readable:
    .SET FOLDLINE ON ALL
    .SET SIDETITLES ON ALL
  3. Run the following Attribution query. This query uses the Machine Learning Engine version of Attribution, which is called with the syntax: Attribution_MLE
    When using functions that exist on both Advanced SQL Engine and Machine Learning Engine, you must use the suffix _MLE to use the Machine Learning Engine version. (The functions nPath® and NTree are exceptions. For these functions, you must use the suffix @coprocessor to use the Machine Learning Engine version.)
    SELECT * FROM Attribution_MLE (
    ON attribution_1 AS input1
    PARTITION BY user_id
    ORDER BY time_stamp
    ON attribution_2 AS input2
    PARTITION BY user_id
    ORDER BY time_stamp
    ON conversion_event_table AS ConversionEventTable DIMENSION
    ON excluding_event_table AS ExcludedEventTable DIMENSION
    ON optional_event_table AS OptionalEventTable DIMENSION
    ON model1_table AS FirstModelTable DIMENSION
    ON model2_table AS SecondModelTable DIMENSION
    USING
    EventColumn ('event')
    TimeColumn ('time_stamp')
    WindowSize ('rows:10&seconds:20')
    ) AS dt ;
    
    The output is similar to the following:
    user_id event         time_stamp                 attribution         time_to_conversion 
    ------- ------------- -------------------------- ------------------- ------------------ 
          1 impression    2001-09-27 23:00:01.000000 0.28571428571428575              -19.0
          1 impression    2001-09-27 23:00:05.000000                 0.0                  ?
          1 impression    2001-09-27 23:00:19.000000  0.7142857142857143               -1.0
          1 socialnetwork 2001-09-27 23:00:20.000000                   ?                  ?
          1 direct        2001-09-27 23:00:21.000000                 0.5               -2.0
          1 referral      2001-09-27 23:00:22.000000                 0.5               -1.0
          1 paidsearch    2001-09-27 23:00:23.000000                   ?                  ?
  4. When you use Machine Learning Engine functions that do not have counterparts on Advanced SQL Engine, a suffix is not required. Create the input tables:
    create multiset table kmodes_input   (model varchar(30), mpg double PRECISION, cyl integer, disp double PRECISION, hp double PRECISION, drat double PRECISION, wt double PRECISION, qsec double PRECISION, vs varchar(30), am varchar(30), gear integer, carb integer);
    
    insert into kmodes_input values('mazda rx4',0.150884825,6,-0.570619819,-0.53509284,0.567513685,-0.610399567,-0.777165145,'s','manual',4,4);
    
    insert into kmodes_input values('mazda rx4 wag',0.150884825,6,-0.570619819,-0.53509284,0.567513685,-0.349785269,-0.46378082,'s','manual',4,4);
    
    insert into kmodes_input values('datsun 710',0.449543447,4,-0.990182091,-0.783040459,0.473999587,-0.917004624,0.426006817,'v','manual',4,1);
    
    insert into kmodes_input values('hornet 4 drive',0.217253407,6,0.220093694,-0.53509284,-0.96611753,-0.002299538,0.890487156,'v','automatic',3,1);
    
    insert into kmodes_input values('hornet sportabout',-0.230734526,8,1.043081228,0.412942174,-0.835197792,0.227654255,-0.46378082,'s','automatic',3,2);
    
    insert into kmodes_input values('valiant',-0.3302874,6,-0.046166978,-0.60801861,-1.564607761,0.248094592,1.326986752,'v','automatic',3,1);
    
    insert into kmodes_input values('duster 360',-0.960788935,8,1.043081228,1.433902959,-0.722980874,0.360516446,-1.124126363,'s','automatic',3,4);
    
    insert into kmodes_input values('merc 240d',0.715017777,4,-0.677930938,-1.235180235,0.174754472,-0.027849959,1.203871481,'v','automatic',4,2);
    
    insert into kmodes_input values('merc 230',0.449543447,4,-0.725535119,-0.753870151,0.604919325,-0.068730634,2.826754593,'v','automatic',4,2);
    
    insert into kmodes_input values('merc 280',-0.147773797,6,-0.509299179,-0.345485837,0.604919325,0.227654255,0.252526208,'v','automatic',4,4);
    
    insert into kmodes_input values('merc 280c',-0.380063837,6,-0.509299179,-0.345485837,0.604919325,0.227654255,0.588295128,'v','automatic',4,4);
    
    insert into kmodes_input values('merc 450se',-0.612353876,8,0.363713088,0.485867945,-0.98482035,0.871524874,-0.251127171,'s','automatic',3,3);
    
    insert into kmodes_input values('merc 450sl',-0.463024565,8,0.363713088,0.485867945,-0.98482035,0.524039143,-0.139204198,'s','automatic',3,3);
    
    insert into kmodes_input values('merc 450slc',-0.811459624,8,0.363713088,0.485867945,-0.98482035,0.575139986,0.084641749,'s','automatic',3,3);
    
    insert into kmodes_input values('cadillac fleetwood',-1.607882616,8,1.946753815,0.850496796,-1.246659826,2.077504765,0.073449451,'s','automatic',3,4);
    
    insert into kmodes_input values('lincoln continental',-1.607882616,8,1.849931752,0.996348337,-1.115740088,2.255335698,-0.016088927,'s','automatic',3,4);
    
    insert into kmodes_input values('chrysler imperial',-0.894420352,8,1.688561647,1.215125648,-0.685575235,2.174596366,-0.239934874,'s','automatic',3,4);
    
    insert into kmodes_input values('fiat 128',2.042389431,4,-1.226589294,-1.176839619,0.90416444,-1.039646647,0.907275602,'v','manual',4,1);
    
    insert into kmodes_input values('honda civic',1.710546517,4,-1.25079481,-1.381031775,2.493904115,-1.637526508,0.375641479,'v','manual',4,2);
    
    insert into kmodes_input values('toyota corolla',2.291271616,4,-1.287909934,-1.191424773,1.166003916,-1.4126828,1.147909994,'v','manual',4,1);
    
    insert into kmodes_input values('toyota corona',0.233845553,4,-0.892553178,-0.724699843,0.193457291,-0.76881218,1.20946763,'v','automatic',3,1);
    
    insert into kmodes_input values('dodge challenger',-0.761683187,8,0.704204008,0.048313323,-1.564607761,0.309415603,-0.54772305,'s','automatic',3,2);
    
    insert into kmodes_input values('amc javelin',-0.811459624,8,0.591244935,0.048313323,-0.835197792,0.22254417,-0.307088658,'s','automatic',3,2);
    
    insert into kmodes_input values('camaro z28',-1.126710392,8,0.962396176,1.433902959,0.24956575,0.636460997,-1.364760755,'s','automatic',3,4);
    
    insert into kmodes_input values('pontiac firebird',-0.147773797,8,1.365821438,0.412942174,-0.96611753,0.641571082,-0.446992374,'s','automatic',3,2);
    
    insert into kmodes_input values('fiat x1-9',1.196190002,4,-1.224168743,-1.176839619,0.90416444,-1.310481114,0.588295128,'v','manual',4,1);
    
    insert into kmodes_input values('porsche 914-2',0.980492108,4,-0.890939476,-0.812210767,1.55876313,-1.100967659,-0.642857578,'s','manual',5,2);
    
    insert into kmodes_input values('lotus europa',1.710546517,4,-1.094265808,-0.491337378,0.324377029,-1.741772228,-0.530934604,'v','manual',5,2);
    
    insert into kmodes_input values('ford pantera l',-0.71190675,8,0.970464681,1.711020886,1.166003916,-0.048290296,-1.874010283,'s','manual',5,4);
    
    insert into kmodes_input values('ferrari dino',-0.064813069,6,-0.691647397,0.412942174,0.043834734,-0.457097039,-1.314395417,'s','manual',5,6);
    
    insert into kmodes_input values('maserati bora',-0.844643915,8,0.567039419,2.746566825,-0.105787824,0.360516446,-1.818048797,'s','manual',5,8);
    
    insert into kmodes_input values('volvo 142e',0.217253407,4,-0.885291523,-0.549677994,0.960272899,-0.44687687,0.420410668,'v','manual',4,2);
    
    
  5. As an example, run this KModes query:
    SELECT * FROM KModes (
    ON kmodes_input AS InputTable
    OUT TABLE OutputTable (kmodes_clusters1)
    USING
    NumClusters (3)
    TargetColumns ('mpg:carb')
    NumericAsCategorical('cyl','gear','carb')
    ) AS dt ;
    The output should be similar to what is shown in the table, but displayed vertically.
    Because KModes is a random algorithm, the numbers in the output will not be exactly the same.
    set_id summary between_cluster_error total_within_cluster_error pseudo_f
    0

    Number of Clusters: 3

    Number of Iterations: 3

    Model Converged: true

    Number of Data Points: 32.0

    194.95312046325273 115.04687954956962 16.380686190908744
PostrequisiteClean up the test tables:
drop table kmodes_input;
drop table kmodes_clusters1;