This example is like GLMPerSegment Example: Regression, Family ('GAUSSIAN'), except that the SQL call specifies AttributeTable and ParameterTable and FitMethod is BFGS (by default). When the function creates a model for each partition, it uses the attributes and parameters that AttributeTable and ParameterTable specify for that partition.
Input
All input tables are partitioned by homestyle.
- housing_train, as in DecisionForest Example: TreeType ('classification'), OutOfBag ('false')
This table has data from three different homestyles—eclectic, classic, and bungalow.
- AttributeTable: att_table
homestyle attribute_col value_col --------- ------------- --------- eclectic airco 0 eclectic driveway 0 eclectic fullbase 0 eclectic gashw 0 eclectic prefarea 0 eclectic recroom 0 classic bathrms 1 classic bedrooms 1 classic garagepl 1 classic lotsize 1 bungalow bathrms 1 bungalow bedrooms 1 bungalow fullbase 0 bungalow recroom 0
- ParameterTable: param_table
homestyle param_col value_col --------- --------- --------- eclectic Alpha 0.5 eclectic Lambda 0.2 classic Alpha 0.3 classic Lambda 1.0 bungalow Alpha 0.1 bungalow Lambda 10.0
SQL Call
CREATE MULTISET TABLE glm_ps_housing_2 AS ( SELECT * FROM GlmPerSegment ( ON housing_train PARTITION BY homestyle ON att_table AS AttributeTable PARTITION BY homestyle ON param_table AS ParameterTable PARTITION BY homestyle USING TargetColumns ('lotsize','bedrooms','bathrms','stories', 'garagepl','driveway','recroom','fullbase','gashw','airco','prefarea') CategoricalColumns ('driveway','recroom','fullbase', 'gashw','airco','prefarea') ResponseColumn ('price') Family ('GAUSSIAN') FeatureScale ('True') ) AS dt ) WITH DATA;
Output
Because the SQL call includes FeatureScale ('True'), the output table has mean and std (standard deviation) columns with a value for each numeric predictor. When you use this output table as input to the GLMPredictPerSegment function, GLMPredictPerSegment uses the mean and std values to scale the corresponding predictor values in its input table.
SELECT * FROM glm_ps_housing_2 ORDER BY homestyle, attribute;
homestyle | attribute | category | estimate | mean | std | information -----------+-----------------+----------+-------------------+-------------------+--------------------+------------- eclectic | (Intercept) | | 41932.7503791151 | | | p eclectic | driveway | yes | 19406.3776174878 | | | p eclectic | recroom | yes | 3918.45364226178 | | | p eclectic | fullbase | yes | 4545.07406239906 | | | p eclectic | gashw | yes | 4856.35157149518 | | | p eclectic | airco | yes | 10244.2578007323 | | | p eclectic | prefarea | yes | 6610.84767566899 | | | p eclectic | driveway | no | | | | p eclectic | recroom | no | | | | p eclectic | fullbase | no | | | | p eclectic | gashw | no | | | | p eclectic | airco | no | | | | p eclectic | prefarea | no | | | | p eclectic | Family | | | | | Gaussian eclectic | Method | | | | | BFGS eclectic | Regularization | | | | | ElasticNet eclectic | Alpha | | 0.5 | | | eclectic | Lambda | | 0.2 | | | eclectic | Encoder | | | | | Onehot eclectic | Iterations # | | 24 | | | eclectic | Converged | | | | | true eclectic | Rows # | | 296 | | | eclectic | Features # | | 7 | | | eclectic | Feature scaling | | | | | true eclectic | RMSE | | 16375.5651548663 | | | eclectic | AIC | | 89151514700.0955 | | | eclectic | BIC | | 89151514725.928 | | | bungalow | (Intercept) | | 21315.5815022484 | | | p bungalow | bedrooms | | 669.587694923986 | 3.44642857142857 | 0.389987244897958 | p bungalow | bathrms | | 1301.04427557357 | 1.875 | 0.395089285714286 | p bungalow | recroom | yes | 6462.92072448963 | | | p bungalow | fullbase | yes | 8058.44829186865 | | | p bungalow | recroom | no | | | | p bungalow | fullbase | no | | | | p bungalow | Family | | | | | Gaussian bungalow | Method | | | | | BFGS bungalow | Regularization | | | | | ElasticNet bungalow | Alpha | | 0.1 | | | bungalow | Lambda | | 10 | | | bungalow | Encoder | | | | | Onehot bungalow | Iterations # | | 33 | | | bungalow | Converged | | | | | true bungalow | Rows # | | 56 | | | bungalow | Features # | | 5 | | | bungalow | Feature scaling | | | | | true bungalow | RMSE | | 110241.184596537 | | | bungalow | AIC | | 1072221848376.08 | | | bungalow | BIC | | 1072221848386.2 | | | classic | (Intercept) | | 31247.4811161212 | | | p classic | lotsize | | 1022.19977367751 | 3959.31428571429 | 2569760.57265306 | p classic | bedrooms | | 459.266928439589 | 2.68571428571429 | 0.558367346938777 | p classic | bathrms | | -18.0372043832458 | 1.05 | 0.0617857142857143 | p classic | garagepl | | 825.54626546107 | 0.321428571428571 | 0.346683673469388 | p classic | Family | | | | | Gaussian classic | Method | | | | | BFGS classic | Regularization | | | | | ElasticNet classic | Alpha | | 0.3 | | | classic | Lambda | | 1 | | | classic | Iterations # | | 22 | | | classic | Converged | | | | | true classic | Rows # | | 140 | | | classic | Features # | | 5 | | | classic | Feature scaling | | | | | true classic | RMSE | | 22448.7745665889 | | | classic | AIC | | 44540024959.1824 | | | classic | BIC | | 44540024973.8906 | | |
Download a zip file of all examples and a SQL script file that creates their input tables.