Input
The input table, lungcancer, contains data from a randomized trial of two treatment regimens for lung cancer and is used here for survival analysis. It has three categorical predictors and three numerical predictors:
Predictor | Description | Possible Values |
---|---|---|
trt | Treatment plan (categorical) | standard, test |
celltype | Cancerous cell type (categorical) | adeno, large, smallcell, squamous |
prior | Whether the patient has undergone prior therapy (categorical) | no, yes |
karno | Karnofsky score assigned to patient (numerical) | [0, 100], where 100 is perfect health and 0 is death |
diagtime | Months from diagnosis to randomization (numerical) | Nonnegative number |
age | Patient age, in years (numerical) | Nonnegative number |
In addition to a column for each predictor, the input table has these columns:
Column | Description | Possible Values |
---|---|---|
id | Patient identifier | Positive integer |
status | Censoring status or survival event | 0 or 1 |
time_int | Survival time in months | Nonnegative number |
id | trt | celltype | time_int | status | karno | diagtime | age | prior |
---|---|---|---|---|---|---|---|---|
1 | standard | squamous | 72 | 1 | 60 | 7 | 69 | no |
2 | standard | squamous | 411 | 1 | 70 | 5 | 64 | yes |
3 | standard | squamous | 228 | 1 | 60 | 3 | 38 | no |
4 | standard | squamous | 126 | 1 | 60 | 9 | 63 | yes |
5 | standard | squamous | 118 | 1 | 70 | 11 | 65 | yes |
6 | standard | squamous | 10 | 1 | 20 | 5 | 49 | no |
7 | standard | squamous | 82 | 1 | 40 | 10 | 69 | yes |
8 | standard | squamous | 110 | 1 | 80 | 29 | 68 | no |
9 | standard | squamous | 314 | 1 | 50 | 18 | 43 | no |
10 | standard | squamous | 100 | 0 | 70 | 6 | 70 | no |
... | ... | ... | ... | ... | ... | ... | ... | ... |
SQL Call
SELECT * FROM CoxPH (
  ON lungcancer AS InputTable
  OUT TABLE CoefficientTable (lungcancer_coef)
  OUT TABLE LinearPredictorTable (lungcancer_lp)
  USING
  FeatureColumns ('trt', 'celltype', 'karno', 'diagtime', 'age', 'prior')
  CategoricalColumns ('trt', 'celltype', 'prior')
  TimeIntervalColumn ('time_int')
  EventColumn ('status')
) AS dt;
Output
Coefficients are estimated with 95% confidence intervals. The coefficients for karno and for the large and squamous categories of celltype are statistically significant.
predictor | category | coefficient | exp_coef | std_error | z_score | p_value | significance |
---|---|---|---|---|---|---|---|
karno | | -0.032815 | 0.967717 | 0.005508 | -5.95802 | 0 | *** |
diagtime | | 8.1e-05 | 1.000081 | 0.009136 | 0.008901 | 0.992898 | |
age | | -0.008706 | 0.991331 | 0.0093 | -0.93615 | 0.349196 | |
trt | standard | 0 | 1 | 0 | | | |
trt | test | 0.294603 | 1.342593 | 0.20755 | 1.419433 | 0.155773 | |
celltype | adeno | 0 | 1 | 0 | | | |
celltype | large | -0.794775 | 0.451683 | 0.302878 | -2.624078 | 0.008688 | ** |
celltype | smallcell | -0.334506 | 0.715692 | 0.275978 | -1.212075 | 0.225483 | |
celltype | squamous | -1.196066 | 0.302381 | 0.300917 | -3.974739 | 7e-05 | *** |
prior | no | 0 | 1 | 0 | | | |
prior | yes | 0.071594 | 1.074219 | 0.232305 | 0.308187 | 0.75794 | |
Iteration # | 5 | | | | | | |
Convergence | yes | | | | | | |
Likelihood ratio test | 62.1039 | 0 | on 8 degree of freedom | | | | |
Wald test | 62.3673 | 0 | on 8 degree of freedom | | | | |
Score test | 66.7375 | 0 | on 8 degree of freedom | | | | |
The coefficients are output in the table lungcancer_coef, which is later used for prediction. Because trt, celltype, and prior are categorical variables, one category of each serves as the reference for the others. The reference categories (trt = standard, celltype = adeno, and prior = no) therefore have no estimated coefficients; their rows show the default values coefficient = 0, exp_coef = 1, and std_error = 0.
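As an illustration of how reference categories affect interpretation, the following Python sketch computes hazard ratios between celltype categories from the coefficients in the output table. It assumes the usual Cox model reading, in which the hazard ratio between two categories is the exponential of the difference of their coefficients; the helper name `hazard_ratio` is hypothetical and not part of CoxPH.

```python
import math

# Coefficients copied from the celltype rows of the output table.
# adeno is the reference category, so its coefficient is 0.
celltype_coef = {
    "adeno": 0.0,
    "large": -0.794775,
    "smallcell": -0.334506,
    "squamous": -1.196066,
}

def hazard_ratio(category, baseline="adeno"):
    """Hazard ratio of `category` relative to `baseline`, other predictors held equal."""
    return math.exp(celltype_coef[category] - celltype_coef[baseline])

# Against the reference category this reproduces the exp_coef column.
print(round(hazard_ratio("squamous"), 6))

# Two non-reference categories can be compared through their coefficient difference.
print(round(hazard_ratio("squamous", "large"), 4))
```

Relative to the reference category, the result is simply exp_coef, so a value below 1 (as for squamous) indicates a lower estimated hazard than adeno.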
This query returns the following table:
SELECT * FROM lungcancer_coef ORDER BY 1;
id | predictor | category | coefficient | exp_coef | std_error | z_score | p_value | significance |
---|---|---|---|---|---|---|---|---|
1 | karno | | -0.0328153261941663 | 0.967717255116806 | 0.00550775688646227 | -5.95802009250341 | 2.55312138097707e-09 | *** |
2 | diagtime | | 8.13205087074416e-05 | 1.00008132381531 | 0.00913606224777197 | 0.00890104582280767 | 0.992898086742234 | |
3 | age | | -0.00870647494549903 | 0.991331316650765 | 0.00930029912031493 | -0.936149991829963 | 0.349195966726043 | |
4 | trt | standard | 0 | 1 | 0 | NaN | NaN | |
5 | trt | test | 0.294602821498042 | 1.34259300369677 | 0.207549603603519 | 1.41943331320844 | 0.155772725980423 | |
6 | celltype | adeno | 0 | 1 | 0 | NaN | NaN | |
7 | celltype | large | -0.794774719851903 | 0.451682978670067 | 0.30287771543449 | -2.62407790124726 | 0.00868839104930008 | ** |
8 | celltype | smallcell | -0.334505911425932 | 0.71569161405743 | 0.275977786191144 | -1.21207549362053 | 0.225483483597614 | |
9 | celltype | squamous | -1.19606637417932 | 0.302381330550752 | 0.300916994493076 | -3.974738536101 | 7.04566159376308e-05 | *** |
10 | prior | no | 0 | 1 | 0 | NaN | NaN | |
11 | prior | yes | 0.0715936019179389 | 1.07421869492581 | 0.232305384067305 | 0.308187441308705 | 0.757939708088376 |
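The columns of lungcancer_coef are related to one another in the standard way for Wald statistics. The Python sketch below recomputes exp_coef, z_score, and p_value for the karno row from its coefficient and std_error, and adds a 95% confidence interval for the hazard ratio. That CoxPH derives its columns with exactly these formulas is an assumption, although the karno row is consistent with them; the function name `wald_summary` and the `hr_ci_95` key are hypothetical.

```python
import math

def wald_summary(coef, std_error, z_crit=1.959964):
    """Wald statistics for one coefficient; z_crit is the 97.5% normal quantile."""
    z_score = coef / std_error
    # Two-sided p-value under a standard normal:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_score) / math.sqrt(2))
    return {
        "exp_coef": math.exp(coef),
        "z_score": z_score,
        "p_value": p_value,
        # 95% confidence interval for the hazard ratio exp(coef)
        "hr_ci_95": (
            math.exp(coef - z_crit * std_error),
            math.exp(coef + z_crit * std_error),
        ),
    }

# coefficient and std_error copied from the karno row of lungcancer_coef
karno = wald_summary(-0.0328153261941663, 0.00550775688646227)
for name, value in karno.items():
    print(name, value)
```

For karno the whole interval lies below 1, which agrees with the row being flagged as significant.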
This query returns the following table:
SELECT * FROM lungcancer_lp ORDER BY 1;
linear_predictor | event | time_internal |
---|---|---|
-4.41189466565077 | 1 | 467 |
-4.41097447125404 | 1 | 110 |
-4.39448171575977 | 1 | 389 |
-4.29871049135928 | 1 | 283 |
-4.28779288293039 | 0 | 182 |
-4.26989200998715 | 1 | 143 |
-4.25242310919077 | 1 | 999 |
-4.20170368038227 | 0 | 25 |
-4.10210453090365 | 0 | 100 |
-4.04859022189228 | 1 | 112 |
... | ... | ... |
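The linear_predictor values can be reproduced from lungcancer_coef. The Python sketch below assumes the linear predictor is the plain (uncentered) sum of coefficient times value for the numerical predictors plus the coefficient of each observed category; the function name `linear_predictor` and the dictionary layout are hypothetical. For the input row with id 8 (standard, squamous, karno 80, diagtime 29, age 68, prior no) it gives about -4.41097, which agrees with the lungcancer_lp row whose time_internal is 110.

```python
# Coefficients copied from lungcancer_coef.
NUMERIC_COEF = {
    "karno": -0.0328153261941663,
    "diagtime": 8.13205087074416e-05,
    "age": -0.00870647494549903,
}

# Reference categories (standard, adeno, no) have coefficient 0.
CATEGORY_COEF = {
    ("trt", "standard"): 0.0,
    ("trt", "test"): 0.294602821498042,
    ("celltype", "adeno"): 0.0,
    ("celltype", "large"): -0.794774719851903,
    ("celltype", "smallcell"): -0.334505911425932,
    ("celltype", "squamous"): -1.19606637417932,
    ("prior", "no"): 0.0,
    ("prior", "yes"): 0.0715936019179389,
}

def linear_predictor(row):
    """Sum of coefficient * value over numeric predictors plus category coefficients."""
    lp = sum(coef * row[name] for name, coef in NUMERIC_COEF.items())
    lp += sum(
        coef
        for (name, category), coef in CATEGORY_COEF.items()
        if row[name] == category
    )
    return lp

# Input rows with id 8 and id 10 from the lungcancer sample.
patient_8 = {"trt": "standard", "celltype": "squamous", "karno": 80,
             "diagtime": 29, "age": 68, "prior": "no"}
patient_10 = {"trt": "standard", "celltype": "squamous", "karno": 70,
              "diagtime": 6, "age": 70, "prior": "no"}

print(linear_predictor(patient_8))
print(linear_predictor(patient_10))
```

A lower linear predictor corresponds to a lower estimated hazard, so sorting lungcancer_lp in ascending order lists the lowest-risk patients first.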