Input
The input is the diabetes data set from "Least Angle Regression," by Bradley Efron and others.
The InputTable, diabetes, has one response (vector y) and ten baseline predictors measured on 442 diabetes patients. The baseline predictors are age, sex, body mass index (bmi), mean arterial pressure (map) and six blood serum measurements (tc, ldl, hdl, tch, ltg, glu).
The column id is the row identifier, y is the response, and the other columns are predictors.
- The value of the Normalize argument is irrelevant.
- If the value of the Intercept argument is 'true', then the intercept is considered to be constant along the entire path (which is typically not true).
id | age | sex | bmi | map | tc | ldl | hdl | tch | ltg | glu | y |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 0.0380759 | 0.0506801 | 0.0616962 | 0.0218724 | -0.0442235 | -0.0348208 | -0.0434008 | -0.00259226 | 0.0199084 | -0.0176461 | 151 |
2 | -0.00188202 | -0.0446416 | -0.0514741 | -0.0263278 | -0.00844872 | -0.0191633 | 0.0744116 | -0.0394934 | -0.0683297 | -0.092204 | 75 |
3 | 0.0852989 | 0.0506801 | 0.0444512 | -0.00567061 | -0.0455994 | -0.0341945 | -0.0323559 | -0.00259226 | 0.00286377 | -0.0259303 | 141 |
4 | -0.0890629 | -0.0446416 | -0.011595 | -0.0366564 | 0.0121906 | 0.0249906 | -0.0360376 | 0.0343089 | 0.022692 | -0.00936191 | 206 |
5 | 0.00538306 | -0.0446416 | -0.0363847 | 0.0218724 | 0.00393485 | 0.0155961 | 0.00814208 | -0.00259226 | -0.0319914 | -0.0466409 | 135 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SQL Call
SELECT * FROM LAR (
  ON diabetes AS InputTable
  OUT TABLE OutputTable (diabetes_lars)
  USING
  TargetColumns ('y', 'age', 'sex', 'bmi', 'map', 'tc', 'ldl', 'hdl', 'tch', 'ltg', 'glu')
  FitMethod ('lar')
  Intercept ('true')
  L2Normalization ('true')
  MaxIterNum (20)
) AS dt;
Output
message |
---|
Successful. Result has been stored in the table specified in the argument OutputTable. |
This query returns the following table:
SELECT * FROM diabetes_lars WHERE steps <> 0 ORDER BY steps;
steps | var_id | var_name | max_abs_corr | step_length | intercept | age | sex | bmi | map | tc | ldl | hdl | tch | ltg | glu |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 3 | bmi | 949.435 | 60.1193 | 152.133 | 0 | 0 | 60.1193 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2 | 9 | ltg | 889.316 | 513.224 | 152.133 | 0 | 0 | 361.895 | 0 | 0 | 0 | 0 | 0 | 301.775 | 0 |
3 | 4 | map | 452.901 | 175.553 | 152.133 | 0 | 0 | 434.758 | 79.2364 | 0 | 0 | 0 | 0 | 374.916 | 0 |
4 | 7 | hdl | 316.074 | 259.367 | 152.133 | 0 | 0 | 505.66 | 191.27 | 0 | 0 | -114.101 | 0 | 439.665 | 0 |
5 | 2 | sex | 130.131 | 88.6592 | 152.133 | 0 | -74.9165 | 511.348 | 234.155 | 0 | 0 | -169.711 | 0 | 450.667 | 0 |
6 | 10 | glu | 88.7824 | 43.6779 | 152.133 | 0 | -111.979 | 512.044 | 252.527 | 0 | 0 | -196.045 | 0 | 452.393 | 12.0781 |
7 | 5 | tc | 68.9652 | 135.984 | 152.133 | 0 | -197.757 | 522.265 | 297.16 | -103.946 | 0 | -223.926 | 0 | 514.75 | 54.7677 |
8 | 8 | tch | 19.9813 | 54.0156 | 152.133 | 0 | -226.134 | 526.885 | 314.389 | -195.106 | 0 | -152.477 | 106.343 | 529.916 | 64.4874 |
9 | 6 | ldl | 5.47747 | 5.56726 | 152.133 | 0 | -227.176 | 526.391 | 314.95 | -237.341 | 33.6284 | -134.599 | 111.384 | 545.483 | 64.6067 |
10 | 1 | age | 5.08918 | 73.5291 | 152.133 | -10.0122 | -239.819 | 519.84 | 324.39 | -792.184 | 476.746 | 101.045 | 177.064 | 751.279 | 67.6254 |
The following figure represents the results and shows how the standardized coefficients evolved during the model-building process. The x-axis represents the ratio of the norm of the current beta to the norm of the full beta, so it runs from 0 (no predictors in the model) to 1 (the final step). The y-axis represents the standardized coefficients, which are the coefficients estimated when standardized predictors are used. The numbers at the top of the graph represent the steps of the model-building process. The numbers on the right represent the predictor IDs (the var_id values in the output table).
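The x-axis values can be recomputed from the coefficient columns of the output table. The following Python sketch does this for the ten rows shown above. It assumes that the norm is the L1 norm (the sum of absolute coefficients), which the text does not state, and that the step 10 row is the "full beta". The names PATH, l1_norm, and norm_ratios are illustrative and are not part of the LAR function.

```python
# Coefficient rows copied from the diabetes_lars output shown above (steps 1-10).
# Column order: age, sex, bmi, map, tc, ldl, hdl, tch, ltg, glu.
PATH = [
    (1, "bmi", [0, 0, 60.1193, 0, 0, 0, 0, 0, 0, 0]),
    (2, "ltg", [0, 0, 361.895, 0, 0, 0, 0, 0, 301.775, 0]),
    (3, "map", [0, 0, 434.758, 79.2364, 0, 0, 0, 0, 374.916, 0]),
    (4, "hdl", [0, 0, 505.66, 191.27, 0, 0, -114.101, 0, 439.665, 0]),
    (5, "sex", [0, -74.9165, 511.348, 234.155, 0, 0, -169.711, 0, 450.667, 0]),
    (6, "glu", [0, -111.979, 512.044, 252.527, 0, 0, -196.045, 0, 452.393, 12.0781]),
    (7, "tc", [0, -197.757, 522.265, 297.16, -103.946, 0, -223.926, 0, 514.75, 54.7677]),
    (8, "tch", [0, -226.134, 526.885, 314.389, -195.106, 0, -152.477, 106.343, 529.916, 64.4874]),
    (9, "ldl", [0, -227.176, 526.391, 314.95, -237.341, 33.6284, -134.599, 111.384, 545.483, 64.6067]),
    (10, "age", [-10.0122, -239.819, 519.84, 324.39, -792.184, 476.746, 101.045, 177.064, 751.279, 67.6254]),
]


def l1_norm(beta):
    """Sum of absolute coefficient values (assumed norm; not stated in the text)."""
    return sum(abs(b) for b in beta)


def norm_ratios(path):
    """Return (step, var_name, |beta_step| / |beta_full|) for each row of the path."""
    full = l1_norm(path[-1][2])  # the last row is treated as the full beta
    return [(step, name, l1_norm(beta) / full) for step, name, beta in path]


RATIOS = norm_ratios(PATH)

if __name__ == "__main__":
    for step, name, ratio in RATIOS:
        print(f"step {step:2d} ({name}): {ratio:.4f}")
```

Under these assumptions, each printed ratio is the horizontal position at which the corresponding step would be marked in the figure, with step 10 at 1.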