1.1 - 8.10 - XGBoost Example: Sparse Input Format - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
1.1
8.10
Release Date
October 2019
Content Type
Programming Reference
Publication ID
B700-4003-079K
Language
English (United States)

Input

  • InputTable: sparse_housing_train_binary, created by inputting housing_train_binary (InputTable for XGBoost Example: Binary Classification) to the Unpivoting function:
    DROP TABLE sparse_housing_train_binary;
    CREATE MULTISET TABLE sparse_housing_train_binary AS ( 
      SELECT sn, homestyle, attribute, value_col 
        FROM Unpivoting (
          ON housing_train_binary
          USING
          TargetColumns (
            'price','lotsize','bedrooms','bathrms','stories',
            'garagepl','driveway','recroom','fullbase','gashw','airco','prefarea'
          )
          Accumulate('sn','homestyle')
      ) AS dt
    ) WITH DATA;
  • AttributeTable: sparse_housing_attributes, created and populated with this code:
    DROP TABLE sparse_housing_attributes;
    
    CREATE MULTISET TABLE sparse_housing_attributes (
      attributename VARCHAR(30), attributetype INTEGER
    );
    
    INSERT INTO sparse_housing_attributes VALUES
    ('price',1),
    ('lotsize',1),
    ('bedrooms',1),
    ('bathrms',1),
    ('stories',1),
    ('garagepl',1),
    ('driveway',0),
    ('recroom',0),
    ('fullbase',0),
    ('gashw',0),
    ('airco',0),
    ('prefarea',0);
    
InputTable: sparse_housing_train_binary
sn homestyle attribute value_col
1 Classic airco no
1 Classic bathrms 3
1 Classic bedrooms 1
1 Classic driveway yes
1 Classic fullbase yes
1 Classic garagepl 1
1 Classic gashw no
1 Classic lotsize 5850.0
1 Classic prefarea no
1 Classic price 42000.0
1 Classic recroom no
1 Classic stories 2
2 Classic airco no
2 Classic bathrms 1
2 Classic bedrooms 2
2 Classic driveway yes
... ... ... ...
AttributeTable: sparse_housing_attributes
price lotsize bedrooms bathrms stories garagepl driveway recroom fullbase gashw airco prefarea
1 1 1 1 1 1 0 0 0 0 0 0

SQL Call

DROP TABLE sparse_housing_xgboost_model ;

SELECT * FROM XGBoost (
  ON sparse_housing_train_binary AS InputTable
  ON sparse_housing_attributes AS AttributeTable
  OUT TABLE OutputTable (sparse_housing_xgboost_model)
  USING
  ResponseColumn ('homestyle')
  PredictionType ('classification') 
  AttributeNameColumn ('attribute')
  AttributeValueColumn ('value_col')
  LossFunction ('binomial')
  IterNum (10)
  MaxDepth (10)
  MinNodeSize (1)
  RegularizationLambda (1)
  ShrinkageFactor (0.1)
  IDColumn ('sn')
  NumBoostedTrees (2)
) AS dt;

Output

 message                                                          
 ---------------------------------------------------------------- 
 Parameters:                                                     
 	Number of boosting iterations : 10                             
 	Number of boosted trees : 2                                    
 	Number of total trees (all subtrees): 20                       
 	Prediction Type : CLASSIFICATION                               
 	LossFunction : BINOMIAL                                        
 	Regularization : 1.0                                           
 	Shrinkage : 0.1                                                
 	MaxDepth : 10                                                  
 	MinNodeSize : 1                                                
 	Variance : 0.0                                                 
 	Seed : 1                                                       
 	ColumnSubSampling Features: 12                                 
 XGBoost model created in table specified in OutputTable argument
SELECT tree_id, iter, class_num, CAST (tree AS VARCHAR(30)) AS tree,
  CAST(region_prediction AS VARCHAR(30)) AS region_pred
    FROM sparse_housing_xgboost_model ORDER BY 1,2;

For simplicity, the last two output columns show only the first 30 characters of each value.

 tree_id iter class_num tree                           region_pred                    
 ------- ---- --------- ------------------------------ ------------------------------ 
      -1   -1        -1 {"classifier":"CLASSIFICATION"                               
       0    1         0 {"sum_":1.2000001063938726E-6, {"1280":-0.16304731,"1792":0.0
       0    2         0 {"sum_":-0.15353446000003235," {"1536":0.0422824,"1537":0.030
       0    3         0 {"sum_":-0.8592291000000191,"s {"1536":0.056829624,"1537":0.0
       0    4         0 {"sum_":-0.12501246000002553," {"1536":0.054619756,"1537":0.0
       0    5         0 {"sum_":0.6978361400000178,"su {"1536":0.037036486,"1537":0.0
       0    6         0 {"sum_":1.24652972000003,"sumS {"1536":0.045559715,"1537":0.0
       0    7         0 {"sum_":1.678402750000007,"sum {"1536":0.03432062,"1537":0.04
       0    8         0 {"sum_":2.0121404699999954,"su {"1536":0.033232495,"1537":0.0
       0    9         0 {"sum_":2.4126530099999894,"su {"1792":0.04346112,"1536":0.03
       0   10         0 {"sum_":2.598574500000006,"sum {"1792":0.044621654,"1793":0.0
       1    1         0 {"sum_":-2.4000000620194584E-6 {"1920":0.07176345,"1664":0.07
       1    2         0 {"sum_":1.2315773000000787,"su {"1536":0.07146741,"1024":-0.1
       1    3         0 {"sum_":0.6282604200000356,"su {"133":-0.05381508,"137":-0.05
       1    4         0 {"sum_":-0.5197018499999984,"s {"10":-0.070136935,"273":-0.08
       1    5         0 {"sum_":-0.619627470000029,"su {"520":-0.061639633,"521":-0.0
       1    6         0 {"sum_":-0.6662540199999845,"s {"525":-0.06725458,"527":-0.06
       1    7         0 {"sum_":-0.9892758200000177,"s {"256":-0.059067618,"1536":0.0
       1    8         0 {"sum_":-1.5109153599999874,"s {"528":-0.040441662,"273":-0.0
       1    9         0 {"sum_":-2.2053655599999726,"s {"256":-0.05370407,"515":-0.03
       1   10         0 {"sum_":-2.737184110000009,"su {"256":-0.037839036,"258":-0.0

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.