CoxPH Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:id
B700-4003
Product Category
Teradata Vantage™

Input

The InputTable, lungcancer, contains data from a randomized trial of two treatment regimens for lung cancer, used here for survival analysis. The table has three categorical predictors and three numerical predictors:

Predictors
Predictor  Description                                                    Possible Values
---------  -------------------------------------------------------------  ----------------------------------------------------
trt        Treatment plan (categorical)                                   standard, test
celltype   Cancerous cell type (categorical)                              squamous, smallcell, adeno, large
prior      Whether the patient has undergone prior therapy (categorical)  yes, no
karno      Karnofsky score assigned by patient (numerical)                [0, 100], where 100 is perfect health and 0 is death
diagtime   Months from diagnosis to randomization (numerical)             Nonnegative number
age        Patient age, in years (numerical)                              Nonnegative number

In addition to a column for each predictor, the InputTable has these columns:

Column    Description                         Possible Values
--------  ----------------------------------  --------------------------------
id        Patient identifier                  Positive integer
status    Censoring status or survival event  0 (survival/right censorship), 1
time_int  Survival time in months             Nonnegative number
lungcancer
 id  trt      celltype time_int status karno diagtime age prior
 --- -------- -------- -------- ------ ----- -------- --- -----
   1 standard squamous       72      1    60        7  69 no
   2 standard squamous      411      1    70        5  64 yes
   3 standard squamous      228      1    60        3  38 no
   4 standard squamous      126      1    60        9  63 yes
   5 standard squamous      118      1    70       11  65 yes
   6 standard squamous       10      1    20        5  49 no
   7 standard squamous       82      1    40       10  69 yes
   8 standard squamous      110      1    80       29  68 no
   9 standard squamous      314      1    50       18  43 no
  10 standard squamous      100      0    70        6  70 no
 ... ...      ...           ...    ...   ...      ... ... ...

SQL Call

SELECT * FROM CoxPH (
  ON lungcancer AS InputTable
  OUT TABLE CoefficientTable (lungcancer_coef)
  OUT TABLE LinearPredictorTable (lungcancer_lp)
  USING
  TargetColumns ('trt', 'celltype', 'karno', 'diagtime', 'age', 'prior')
  CategoricalColumns ('trt','celltype','prior')
  TimeIntervalColumn ('time_int')
  EventColumn ('status')
) AS dt;

Output

Coefficients are estimated at the 95% confidence level. The coefficients of karno and of the celltype categories large and squamous are significant.

 predictor             category  coefficient exp_coef std_error z_score   p_value  significance           
 --------------------- --------- ----------- -------- --------- --------- -------- ---------------------- 
 karno                 NULL        -0.032815 0.967717  0.005508  -5.95802      0.0 ***                   
 diagtime              NULL           8.1E-5 1.000081  0.009136  0.008901 0.992898                       
 age                   NULL        -0.008706 0.991331    0.0093  -0.93615 0.349196                       
 trt                   standard          0.0      1.0       0.0      NULL     NULL                       
 trt                   test         0.294603 1.342593   0.20755  1.419433 0.155773                       
 celltype              adeno             0.0      1.0       0.0      NULL     NULL                       
 celltype              large       -0.794775 0.451683  0.302878 -2.624078 0.008688 **                    
 celltype              smallcell   -0.334506 0.715692  0.275978 -1.212075 0.225483                       
 celltype              squamous    -1.196066 0.302381  0.300917 -3.974739   7.0E-5 ***                   
 prior                 no                0.0      1.0       0.0      NULL     NULL                       
 prior                 yes          0.071594 1.074219  0.232305  0.308187  0.75794                       
 Iteration #           NULL              5.0     NULL      NULL      NULL     NULL NULL                  
 Convergence           NULL             NULL     NULL      NULL      NULL     NULL yes                   
 Likelihood ratio test NULL          62.1039     NULL      NULL      NULL      0.0 on 8 degree of freedom
 Wald test             NULL          62.3673     NULL      NULL      NULL      0.0 on 8 degree of freedom
 Score test            NULL          66.7375     NULL      NULL      NULL      0.0 on 8 degree of freedom
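
The significance statement above can be checked against the coefficient table that the SQL call names, lungcancer_coef. The following is a minimal sketch, not part of the CoxPH function itself. It assumes a 0.05 threshold (corresponding to the 95% level) and assumes that reference-category rows can be excluded by their std_error of 0.0:

```sql
-- Sketch: list the predictors whose p-value is below 0.05.
-- The 0.05 cutoff is an assumption chosen to match the 95% level;
-- it is not a CoxPH argument.
SELECT predictor, category, coefficient, exp_coef, p_value
FROM lungcancer_coef
WHERE std_error > 0      -- skip reference categories, which carry no estimate
  AND p_value < 0.05
ORDER BY p_value;
```

With the coefficients shown in this example, the rows that satisfy the filter are karno, celltype = squamous, and celltype = large.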

The coefficients are output in the table lungcancer_coef, which is later used for prediction. Because celltype, trt, and prior are categorical variables, one category of each serves as the reference for the others. The reference categories trt = standard, celltype = adeno, and prior = no therefore show default values rather than estimates: a coefficient of 0.0, an exp_coef of 1.0, a std_error of 0.0, and no z_score or p_value.

SELECT * FROM lungcancer_coef;
 id predictor category  coefficient           exp_coef            std_error            z_score             p_value               significance 
 -- --------- --------- --------------------- ------------------- -------------------- ------------------- --------------------- ------------ 
  6 celltype  adeno                       0.0                 1.0                  0.0                 NaN                   NaN             
  7 celltype  large       -0.7947747198519008 0.45168297867006735   0.3028777154344899 -2.6240779012472593  0.008688391049300193 **          
  4 trt       standard                    0.0                 1.0                  0.0                 NaN                   NaN             
  1 karno     NULL       -0.03281532619416623   0.967717255116806 0.005507756886462303  -5.958020092503375 2.5531213809770748E-9 ***         
  2 diagtime  NULL       8.132050870746436E-5  1.0000813238153097 0.009136062247771979 0.00890104582281016     0.992898086742232             
  9 celltype  squamous     -1.196066374179321  0.3023813305507525   0.3009169944930761  -3.974738536100997  7.045661593763075E-5 ***         
 10 prior     no                          0.0                 1.0                  0.0                 NaN                   NaN             
  3 age       NULL      -0.008706474945498829  0.9913313166507652  0.00930029912031481 -0.9361499918299531     0.349195966726048             
  8 celltype  smallcell   -0.3345059114259308   0.715691614057431  0.27597778619114416 -1.2120754936205251   0.22548348359761572             
 11 prior     yes          0.0715936019179391  1.0742186949258055  0.23230538406730558 0.30818744130870585    0.7579397080883754             
  5 trt       test        0.29460282149804123  1.3425930036967717  0.20754960360351904  1.4194333132084391   0.15577272598042358
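
The exp_coef and z_score columns follow the usual Cox model relationships: exp_coef is the exponential of the coefficient (the hazard ratio), and z_score is the coefficient divided by its standard error. The values in this table are consistent with that. For trt = test, for example, exp(0.2946) ≈ 1.3426 and 0.2946 / 0.2075 ≈ 1.419. The sketch below recomputes both columns for comparison; hazard_ratio and z_recomputed are illustrative aliases, not columns of the output table:

```sql
-- Sketch: recompute exp_coef and z_score from coefficient and std_error.
SELECT id, predictor, category,
       exp_coef,
       EXP(coefficient)        AS hazard_ratio,  -- expected to agree with exp_coef
       z_score,
       coefficient / std_error AS z_recomputed   -- expected to agree with z_score
FROM lungcancer_coef
WHERE std_error > 0   -- reference categories have std_error = 0.0; skip them to avoid dividing by zero
ORDER BY id;
```

A hazard ratio below 1, such as 0.9677 for karno, means the estimated hazard decreases as the predictor increases. A ratio above 1, such as 1.3426 for trt = test, means the estimated hazard is higher than for the reference category.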

The following query returns the linear predictor table, lungcancer_lp:

SELECT * FROM lungcancer_lp;
 linear_predictor    event time_interval 
 ------------------- ----- ------------- 
  -2.980877152089424     1             8
 -2.8463872419210983     1             1
  -3.257006840737693     1             8
  -1.564001681356389     1            13
  -2.117355347321969     1            12
  -2.054218838259662     1            21
 -1.7084899854095867     1            16
  -1.624947005974881     1            25
 -1.4221870363135072     1            20
 -2.9462665286870284     1            25
  -2.434789369837646     1            24
  -2.442649811965947     1            33
 -3.4071856019850353     1            44
 -1.8065272956361809     1            49
 -3.1102722260025875     1            52
 -2.9549730036325275     1            61
 -2.0484512112822597     1            52
  -2.886352701880789     0            97
 -3.7651634735077617     1            72
 -3.3599858929260527     1           117
  -2.297446672118591     1            80
 -3.5271609138650435     1           133
  -2.818648125234491     1            92
  -2.937273645905877     1           177
  -3.013183095954195     1           100
  -3.811385585389109     1           260
 -3.0601432402647024     1           132
 -2.6683655462084883     1           384
 -2.7287087995359043     1           144
  -3.375114709644063     1           587
  -3.316471618690865     1           164
 -2.6015691736492057     1             7
 -2.8870579190955397     1           216
 -1.2699424346183548     1             7
  -4.298710491359266     1           283
 -2.8178380916784085     1            15
  -3.978353399821406     1           411
 -1.7110222308341465     1            19
 -1.6610798950650794     1             2
 -2.8425763656271723     1            27
 -2.2785835678485507     1            10
 -2.6699318353835886     1            31
 -1.3552645753895942     1            18
  -2.823220614819555     1            43
 -2.8951404973349955     1            22
 -2.2137115942293226     1            51
 -2.8342764932252105     1            30
 -1.8847239276909267     1            59
 -3.4363210445728862     1            54
  -3.450139582091726     0            83
 -3.0370193861803756     1            82
 -2.5310485172903783     1            87
  -4.410974471254024     1           110
 -2.8679895746862467     1            95
  -3.418898204826858     1           122
 -3.3686819810181166     1            99
  -3.959395659462589     1           162
 -2.5770263684479797     0           103
  -4.287792882930374     0           182
 -3.7380938713677208     1           111
 -3.1516017880658023     1           242
  -3.516783762453739     1           139
  -3.702454766498651     1           357
 -2.5750431462691723     1           151
 -3.5616972880689657     1           991
  -2.922722393567932     0           231
  -1.951682941267624     1             4
  -3.209747337387348     1           314
 -1.1142628039753721     1             8
  -3.500889234864567     1           553
 -1.9203842320696556     1             8
 -2.0503896154212327     1             1
  -2.626858615890042     1            12
 -2.7906627979890093     1            13
 -1.8844799661648044     1            20
 -1.3140440529068012     1            21
 -1.5402360819811227     1            24
  -4.201703680382252     0            25
 -2.5329144196993636     1            36
 -2.1033504598175616     1            25
 -0.7384496289941964     1            48
 -1.9351993949733006     1            29
  -2.860557560516451     1            52
 -1.6185130359819047     1            45
   -3.26154098159325     1            56
 -3.0427429703022666     1            53
 -1.5661928658002042     1            80
  -2.283526034810728     1            73
 -2.7985058367034155     1            84
 -3.9937336361923395     1           105
   -4.10210453090363     0           100
 -2.9559095024448387     1           117
  -4.048590221892262     1           112
 -2.7792013156024873     1           153
 -2.5507339721338984     1           140
  -3.905555769218769     1           201
  -3.666312258829045     1           156
  -3.610205583394069     1           340
 -3.7043968401282226     1           200
 -2.2388339734516705     1           392
 -3.4955880322321278     1           228
 -1.3585942469553143     1             3
 -2.8044262148432018     1           287
 -1.5226604910727102     1             7
  -4.411894665650751     1           467
 -3.8385618776411796     1            11
 -2.1569888069227905     1            10
 -2.0327330032017357     1            15
 -1.8410289119460175     1            18
  -1.752265323100532     1            19
 -1.5467567930465507     1            18
 -3.3613322859196995     1            31
 -3.7461497822435597     1            30
 -1.8519265713353317     1            35
 -3.8698851343798704     1            42
 -1.4593764113618188     1            51
  -3.214831245857281     1            54
 -2.8866779839156185     1            51
 -2.0362578443173707     1            90
  -2.392288492922404     1            63
  -3.986571951714659     1           118
  -3.944356484072399     0            87
 -3.6411683808994146     1           126
  -2.920920961645429     1            95
  -3.182033889501686     1           162
 -2.9635981580693187     1            99
 -3.1809210711807268     1           186
 -3.7504402607706178     1           103
  -3.481046559567376     1           250
  -3.025899263972221     1           111
  -4.394481715759753     1           389
  -2.125731119668893     0           123
  -4.252423109190747     1           999
  -4.269892009987132     1           143
  -3.307521182219244     1           231
  -3.311226366963812     1           278
 -3.6909935833097514     1           378
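
Each linear_predictor value appears to be the sum, over all predictors, of coefficient times value, where a categorical predictor contributes the coefficient of the patient's category. For patient 1 (standard, squamous, karno 60, diagtime 7, age 69, prior no), that sum is (-0.032815 × 60) + (0.000081 × 7) + (-0.008706 × 69) + 0 + (-1.196066) + 0 ≈ -3.765, which matches the row with time_interval 72. The sketch below reproduces the column from the input and coefficient tables. It assumes only the column names shown in this example; the aliases p and c are illustrative:

```sql
-- Sketch: recompute the linear predictor for every patient.
-- Numeric predictors contribute coefficient * value; categorical predictors
-- contribute the coefficient of the matching category (0.0 for the reference).
SELECT p.id,
       p.time_int,
       p.status,
       SUM(CASE
             WHEN c.predictor = 'karno'    THEN c.coefficient * p.karno
             WHEN c.predictor = 'diagtime' THEN c.coefficient * p.diagtime
             WHEN c.predictor = 'age'      THEN c.coefficient * p.age
             WHEN c.predictor = 'trt'      AND c.category = p.trt      THEN c.coefficient
             WHEN c.predictor = 'celltype' AND c.category = p.celltype THEN c.coefficient
             -- "prior" is quoted as a precaution in case the name collides with a keyword
             WHEN c.predictor = 'prior'    AND c.category = p."prior"  THEN c.coefficient
             ELSE 0
           END) AS linear_predictor
FROM lungcancer AS p
CROSS JOIN lungcancer_coef AS c
GROUP BY p.id, p.time_int, p.status
ORDER BY p.id;
```

Because floating-point addition order can differ, recomputed values may differ from lungcancer_lp in the last few digits.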
