AdaBoost Example 1 - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:mapPath
blj1506016597986.ditamap
dita:ditavalPath
blj1506016597986.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

This example uses home sales data to create a model that predicts home style, which can be input to the AdaBoostPredict Example 1.

Input

  • AttributeTable: housing_train_att, created by inputting housing_train (the raw data for DecisionForest Example 1: Classification Tree without Out-of-Bag Error) to the Unpivoting function:
    CREATE MULTISET TABLE housing_train_att AS (
      SELECT * FROM Unpivoting (
        ON housing_train
        USING
        TargetColumns ('price', 'lotsize', 'bedrooms', 'bathrms',
          'stories', 'driveway', 'recroom', 'fullbase', 'gashw', 'airco',
          'garagepl', 'prefarea')
        Accumulate ('sn')
      ) AS dt
    ) WITH DATA;
  • CategoricalAttributeTable: housing_cat, created and populated with this code:
    CREATE MULTISET TABLE housing_cat (attribute VARCHAR);
    
    INSERT INTO housing_cat VALUES ('driveway');
    INSERT INTO housing_cat VALUES ('recroom');
    INSERT INTO housing_cat VALUES ('fullbase');
    INSERT INTO housing_cat VALUES ( 'gashw');
    INSERT INTO housing_cat VALUES ('airco');
    INSERT INTO housing_cat VALUES ('prefarea');
  • ResponseTable: housing_train_response, created by selecting the columns sn and homestyle from housing_train:
    CREATE MULTISET TABLE housing_train_response AS (
      SELECT sn, homestyle AS response
      FROM housing_train
    ) WITH DATA;

This query returns the following table:

SELECT * FROM housing_train_att ORDER BY 1, 2;
housing_train_att
sn attribute value_col
1 airco no
1 bathrms 1
1 bedrooms 3
1 driveway yes
1 fullbase yes
1 garagepl 1
1 gashw no
1 lotsize 5850.0
1 prefarea no
1 price 42000.0
1 recroom no
1 stories 2
2 airco no
2 bathrms 1
2 bedrooms 2
2 fullbase no
2 garagepl 0
2 gashw no
2 lotsize 4000.0
2 prefarea no
2 price 38500.0
2 recroom no
2 stories 1
... ... ...

This query returns the following table:

SELECT * FROM housing_cat ORDER BY 1;
housing_cat
attribute
airco
driveway
fullbase
gashw
prefarea
recroom

This query returns the following table:

SELECT * FROM housing_train_response ORDER BY 1;
housing_train_response
sn response
1 Classic
2 Classic
3 Classic
4 Eclectic
5 Eclectic
6 Eclectic
7 Eclectic
8 Eclectic
9 Eclectic
10 Eclectic
... ...

SQL Call

This call creates the model, abd_model, using the default values for the optional arguments.

SELECT * FROM AdaBoost (
  ON housing_train_att AS AttributeTable
  ON housing_cat AS CategoricalAttributeTable
  ON housing_train_response AS ResponseTable
  OUT TABLE OutputTable (abd_model)
  USING
  IdColumns ('sn')
  AttributeNameColumns ('attribute')
  AttributeValueColumn ('value_col')
  ResponseColumn ('response')
  IterNum (20)
  NumSplits (10)
  MaxDepth (3)
  MinNodeSize (100)
) AS dt ;

Output

Because the argument IterNum has the value 20, the function builds 20 classification trees.

message
Running 20 round AdaBoost, computing 20 classification trees.
AdaBoost model created.

This query returns the following table:

SELECT * FROM abd_model ORDER BY 1,3;
abd_model
classifier_id classifier_weight node_id node_label node_majorfreq attribute split_value left_bucket right_bucket left_label right_label left_majorfreq right_majorfreq
1 2.82867818889083 0 Eclectic 0.601626016260163 price 49500     Classic Eclectic 0.241869918699188 0.601626016260163
1 2.82867818889083 2 Eclectic 0.601626016260163 price 90000     Eclectic bungalow 0.538617886178861 0.113821138211382
1 2.82867818889083 5 Eclectic 0.538617886178861 price 55000     Eclectic Eclectic 0.0792682926829268 0.459349593495934
2 3.284500712218 0 Eclectic 0.598193473193469 price 55000     Classic Eclectic 0.359382284382285 0.568648018648015
2 3.284500712218 1 Classic 0.359382284382285 gashw NaN yes no Eclectic Classic 0.00303030303030304 0.357867132867133
2 3.284500712218 2 Eclectic 0.568648018648017 garagepl 1     Eclectic Eclectic 0.257575757575758 0.311072261072259
2 3.284500712218 4 Classic 0.357867132867133 fullbase NaN yes no Classic Classic 0.0294871794871793 0.328379953379953
2 3.284500712218 5 Eclectic 0.257575757575758 lotsize 16200     Eclectic bungalow 0.257575757575758 0.000757575757575758
2 3.284500712218 6 Eclectic 0.311072261072261 stories 4     Eclectic Eclectic 0.295221445221445 0.0158508158508158
3 2.87425191371606 0 Eclectic 0.458459069137445 price 95000     Eclectic bungalow 0.384960363716916 0.398822168720203
3 2.87425191371606 1 Eclectic 0.384960363716916 price 51500     Classic Eclectic 0.142718762142357 0.348728479658944
3 2.87425191371606 3 Classic 0.142718762142357 garagepl 2     Classic Eclectic 0.13649642855846 0.0144927536231885
... ... ... ... ... ... ...   ... ... ... ... ...