Input
- InputTable: boston_sparse, created by inputting boston (InputTable for XGBoost Example: Regression, Dense Format) to the Unpivoting function:
CREATE MULTISET TABLE boston_sparse AS ( SELECT * FROM Unpivoting ( ON boston USING TargetColumns ('crim','zn','indus','chas','nox','rm', 'age','dis','rad','tax','ptratio','black','lstat') AttributeColumn ('attribute') ValueColumn ('value1') InputTypes ('false') Accumulate ('id', 'medv') ) AS dt ) WITH data;
id medv attribute value1 --- ---- --------- ------ 265 36 crim 1 265 36 ptratio 13 265 36 lstat 8 265 36 age 92 265 36 black 388 265 36 dis 2 265 36 nox 1 265 36 chas 0 265 36 zn 20 244 24 lstat 5 244 24 rad 6 244 24 black 375 244 24 rm 6 244 24 age 8 244 24 dis 7 244 24 zn 30 265 36 indus 4 265 36 rm 7 265 36 rad 5 265 36 tax 264
- AttributeTable: sparse_boston_attributes
attributename attributetype ------------- ------------- dis 1 rad 1 ptratio 1 age 1 chas 1 nox 1 zn 1 crim 1 indus 1 tax 1 rm 1 black 1
SQL Call
SELECT * FROM XGBoost ( ON boston AS InputTable ON sparse_boston_attributes AS AttributeTable OUT TABLE OutputTable (xgboost_regression_model) USING ResponseColumn ('medv') PredictionType ('regression') LossFunction ('mse') AttributeNameColumn ('attribute') AttributeValueColumn ('value1') IterNum (5) MaxDepth (10) MinNodeSize (1) RegularizationLambda ('1') ShrinkageFactor ('0.1') IDColumn ('id') NumBoostedTrees ('1') ) AS dt;
Output
message ---------------------------------------------------------------- Parameters: Number of boosting iterations : 5 Number of boosted trees : 2 Number of total trees (all subtrees): 10 Prediction Type : REGRESSION LossFunction : MSE Regularization : 1.0 Shrinkage : 0.1 MaxDepth : 10 MinNodeSize : 1 Variance : 0.0 Seed : null ColumnSubSampling Features: 12 XGBoost model created in table specified in OutputTable argument
This query returns the following table:
SELECT tree_id, iter, class_num, CAST (tree AS VARCHAR(30)) AS tree, CAST(region_prediction AS VARCHAR(30)) AS region_pred FROM xgboost_regression_model ORDER BY 1,2;
The column tree is shown truncated to 80 characters.
tree_id iter class_num tree ------- ---- --------- -------------------------------------------------------- 0 4 0 {"sum_":-7.259742420000081,"sumSq_":20959.84902615756,"s 0 2 0 {"sum_":-4.656698140000127,"sumSq_":20907.042531013678," 0 3 0 {"sum_":-5.843054700000081,"sumSq_":20932.994992147258," 0 1 0 {"sum_":-1.297999986085152E-4,"sumSq_":20895.56742119848 1 3 0 {"sum_":-11.721289920000102,"sumSq_":21945.56968946764," 1 1 0 {"sum_":-2.065200004339185E-4,"sumSq_":21818.30039547843 1 2 0 {"sum_":-9.681986649999978,"sumSq_":21913.84304678934,"s 0 5 0 {"sum_":-8.692206329999848,"sumSq_":20986.778941833785," 1 5 0 {"sum_":-14.901439650000153,"sumSq_":22007.848794685153, -1 -1 -1 {"classifier":"REGRESSION","lossType":"MSE","numIteratio 1 4 0 {"sum_":-13.322867280000047,"sumSq_":21976.772296942378,
Download a zip file of all examples and a SQL script file that creates their input tables.