Input
- Input: admissions_test, which has admissions information for 20 students
- model: glm_admissions_model, output by GLM Example 1: Logistic Regression Analysis with Intercept
id | masters | gpa | stats | programming | admitted |
---|---|---|---|---|---|
50 | yes | 3.95 | Beginner | Beginner | 0 |
51 | yes | 3.76 | Beginner | Beginner | 0 |
52 | no | 3.7 | Novice | Beginner | 1 |
53 | yes | 3.5 | Beginner | Novice | 1 |
54 | yes | 3.5 | Beginner | Advanced | 1 |
55 | no | 3.6 | Beginner | Advanced | 1 |
56 | no | 3.82 | Advanced | Advanced | 1 |
57 | no | 3.71 | Advanced | Advanced | 1 |
58 | no | 3.13 | Advanced | Advanced | 1 |
59 | no | 3.65 | Novice | Novice | 1 |
60 | no | 4 | Advanced | Novice | 1 |
61 | yes | 4 | Advanced | Advanced | 1 |
62 | no | 3.7 | Advanced | Advanced | 1 |
63 | no | 3.83 | Advanced | Advanced | 1 |
64 | yes | 3.81 | Advanced | Advanced | 1 |
65 | yes | 3.9 | Advanced | Advanced | 1 |
66 | no | 3.87 | Novice | Beginner | 1 |
67 | yes | 3.46 | Novice | Beginner | 0 |
68 | no | 1.87 | Advanced | Novice | 1 |
69 | no | 3.96 | Advanced | Advanced | 1 |
SQL Call
SELECT * FROM GLMPredict@coprocessor ( ON admissions_test PARTITION BY ANY ON glm_admissions_model AS model DIMENSION USING Accumulate ('id', 'masters', 'gpa', 'stats', 'programming', 'admitted') Family ('LOGISTIC') LinkFunction ('LOGIT') ) AS dt;
Output
This query returns the following table:
SELECT * FROM glmpredict_admissions ORDER BY 1;
id | masters | gpa | stats | programming | admitted | fitted_value |
---|---|---|---|---|---|---|
50 | yes | 3.95 | Beginner | Beginner | 0 | 0.35076568493861 |
51 | yes | 3.76 | Beginner | Beginner | 0 | 0.355711270634992 |
52 | no | 3.7 | Novice | Beginner | 1 | 0.758307972010008 |
53 | yes | 3.5 | Beginner | Novice | 1 | 0.556015248287644 |
54 | yes | 3.5 | Beginner | Advanced | 1 | 0.769476126790063 |
55 | no | 3.6 | Beginner | Advanced | 1 | 0.968031451283364 |
56 | no | 3.82 | Advanced | Advanced | 1 | 0.945773239222729 |
57 | no | 3.71 | Advanced | Advanced | 1 | 0.946412421748131 |
58 | no | 3.13 | Advanced | Advanced | 1 | 0.949666663764171 |
59 | no | 3.65 | Novice | Novice | 1 | 0.874190783966478 |
60 | no | 4 | Advanced | Novice | 1 | 0.8650601595144 |
61 | yes | 4 | Advanced | Advanced | 1 | 0.650621000328301 |
62 | no | 3.7 | Advanced | Advanced | 1 | 0.946470175529378 |
63 | no | 3.83 | Advanced | Advanced | 1 | 0.945714776638306 |
64 | yes | 3.81 | Advanced | Advanced | 1 | 0.655525617391707 |
65 | yes | 3.9 | Advanced | Advanced | 1 | 0.653206427288345 |
66 | no | 3.87 | Novice | Beginner | 1 | 0.754740354720256 |
67 | yes | 3.46 | Novice | Beginner | 0 | 0.260036221097295 |
68 | no | 1.87 | Advanced | Novice | 1 | 0.890966489834395 |
69 | no | 3.96 | Advanced | Advanced | 1 | 0.94494933637213 |
Categorizing fitted_value Column
The fitted_value column gives the probability that a student belongs to one of the output classes.
The fitted_value probability of > = 0.5 implies class 1 (student admitted) and a probability of < 0.5 implies class 0 (student rejected).
The following code adds a fitted_category column to glmpredict_admissions and populates it:
ALTER table glmpredict_admissions ADD fitted_category int; UPDATE glmpredict_admissions SET fitted_category = 1 WHERE fitted_value > 0.4999; UPDATE glmpredict_admissions SET fitted_category = 0 WHERE fitted_value < 0.4999;
This query returns the following table:
SELECT * FROM glmpredict_admissions ORDER BY 1;
id | masters | gpa | stats | programming | admitted | fitted_value | fitted_category |
---|---|---|---|---|---|---|---|
50 | yes | 3.95 | Beginner | Beginner | 0 | 0.35076568493861 | 0 |
51 | yes | 3.76 | Beginner | Beginner | 0 | 0.355711270634992 | 0 |
52 | no | 3.7 | Novice | Beginner | 1 | 0.758307972010008 | 1 |
53 | yes | 3.5 | Beginner | Novice | 1 | 0.556015248287644 | 1 |
54 | yes | 3.5 | Beginner | Advanced | 1 | 0.769476126790063 | 1 |
55 | no | 3.6 | Beginner | Advanced | 1 | 0.968031451283364 | 1 |
56 | no | 3.82 | Advanced | Advanced | 1 | 0.945773239222729 | 1 |
57 | no | 3.71 | Advanced | Advanced | 1 | 0.946412421748131 | 1 |
58 | no | 3.13 | Advanced | Advanced | 1 | 0.949666663764171 | 1 |
59 | no | 3.65 | Novice | Novice | 1 | 0.874190783966478 | 1 |
60 | no | 4 | Advanced | Novice | 1 | 0.8650601595144 | 1 |
61 | yes | 4 | Advanced | Advanced | 1 | 0.650621000328301 | 1 |
62 | no | 3.7 | Advanced | Advanced | 1 | 0.946470175529378 | 1 |
63 | no | 3.83 | Advanced | Advanced | 1 | 0.945714776638306 | 1 |
64 | yes | 3.81 | Advanced | Advanced | 1 | 0.655525617391707 | 1 |
65 | yes | 3.9 | Advanced | Advanced | 1 | 0.653206427288345 | 1 |
66 | no | 3.87 | Novice | Beginner | 1 | 0.754740354720256 | 1 |
67 | yes | 3.46 | Novice | Beginner | 0 | 0.260036221097295 | 0 |
68 | no | 1.87 | Advanced | Novice | 1 | 0.890966489834395 | 1 |
69 | no | 3.96 | Advanced | Advanced | 1 | 0.94494933637213 | 1 |
Prediction Accuracy
This query returns the following table:
SELECT (SELECT COUNT(id) FROM glmpredict_admissions WHERE admitted = fitted_category)/(SELECT count(id) FROM glmpredict_admissions) AS prediction_accuracy;
prediction_accuracy |
---|
1.00000000000000000000 |