The example uses the Gaussian modelto evaluate the prediction for new houses, comparing the prediction with the original price information with root mean square error evaluation (RMSE).
Input
- Input table: housing_test, which contains test data for 54 houses
- Model table: glm_housing_model from the output section of GLM Example 3: Gaussian Distribution Analysis with Default Options
sn | price | lotsize | bedrooms | bathrms | stories | driveway | recroom | fullbase | gashw | airco | garagepl | prefarea | homestyle |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
13 | 27000 | 1700 | 3 | 1 | 2 | yes | no | no | no | no | 0 | no | Classic |
16 | 37900 | 3185 | 2 | 1 | 1 | yes | no | no | no | yes | 0 | no | Classic |
25 | 42000 | 4960 | 2 | 1 | 1 | yes | no | no | no | no | 0 | no | Classic |
38 | 67000 | 5170 | 3 | 1 | 4 | yes | no | no | no | yes | 0 | no | Eclectic |
53 | 68000 | 9166 | 2 | 1 | 1 | yes | no | yes | no | yes | 2 | no | Eclectic |
104 | 132000 | 3500 | 4 | 2 | 2 | yes | no | no | yes | no | 2 | no | bungalow |
111 | 43000 | 5076 | 3 | 1 | 1 | no | no | no | no | no | 0 | no | Classic |
117 | 93000 | 3760 | 3 | 1 | 2 | yes | no | no | yes | no | 2 | no | Eclectic |
132 | 44500 | 3850 | 3 | 1 | 2 | yes | no | no | no | no | 0 | no | Classic |
140 | 43000 | 3750 | 3 | 1 | 2 | yes | no | no | no | no | 0 | no | Classic |
142 | 40000 | 2650 | 3 | 1 | 2 | yes | no | yes | no | no | 1 | no | Classic |
157 | 60000 | 2953 | 3 | 1 | 2 | yes | no | yes | no | yes | 0 | no | Eclectic |
161 | 63900 | 3162 | 3 | 1 | 2 | yes | no | no | no | yes | 1 | no | Eclectic |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
SQL Call
The canonical link specifies the default family link, which is "identity" for the Gaussian distribution.
SELECT * FROM GLMPredict@coprocessor ( ON housing_test PARTITION BY ANY ON glm_housing_model AS model DIMENSION USING Accumulate ('sn', 'price') Family ('GAUSSIAN') LinkFunction ('CANONICAL') ) AS dt;
Output
This query returns the following table:
SELECT * FROM glmpredict_housing ORDER BY 1;
The fitted_value column gives the predicted home price.
sn | price | fitted_value |
---|---|---|
13 | 27000 | 37345.844 |
16 | 37900 | 43687.13175 |
25 | 42000 | 40902.028 |
38 | 67000 | 72487.6705 |
53 | 68000 | 79238.6937 |
104 | 132000 | 111528.007 |
111 | 43000 | 39102.8812 |
117 | 93000 | 66936.951 |
132 | 44500 | 41819.8865 |
140 | 43000 | 41611.7915 |
... | ... | ... |
RMSE
This query returns the following table:
SELECT SQRT(AVG(POWER(glmpredict_housing.price - glmpredict_housing.fitted_value, 2))) AS RMSE FROM glmpredict_housing;
rmse |
---|
10246.7521984348 |