DecisionForest Example: TreeType ('classification'), OutOfBag ('false')

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

This example uses home sales data to build a decision forest of classification trees that predicts home style. The resulting model can be used as input to the DecisionForestPredict_MLE Example: Omit Responses. By default, the function does not output the out-of-bag estimate of the error rate.

Input

The following table describes the home sales data contained in the InputTable. There are six numerical predictors and six categorical predictors. The response variable is homestyle.

Input Data Descriptions

 Column    Description
 --------- ------------------------------------------------------------------------
 price     Sale price in U.S. dollars (numeric)
 lotsize   Lot size in square feet (numeric)
 bedrooms  Number of bedrooms (numeric)
 bathrms   Number of full bathrooms (numeric)
 stories   Number of stories, excluding basement (numeric)
 driveway  Whether the house has a driveway: yes or no (categorical)
 recroom   Whether the house has a recreation room: yes or no (categorical)
 fullbase  Whether the house has a full finished basement: yes or no (categorical)
 gashw     Whether the house uses gas to heat water: yes or no (categorical)
 airco     Whether the house has central air conditioning: yes or no (categorical)
 garagepl  Number of garage places (numeric)
 prefarea  Whether the house is in a preferred neighborhood: yes or no (categorical)
 homestyle Style of home (response variable)

InputTable: housing_train

 sn price lotsize bedrooms bathrms stories driveway recroom fullbase gashw airco garagepl prefarea homestyle
 -- ----- ------- -------- ------- ------- -------- ------- -------- ----- ----- -------- -------- ---------
  1 42000    5850        3       1       2 yes      no      yes      no    no           1 no       Classic
  2 38500    4000        2       1       1 yes      no      no       no    no           0 no       Classic
  3 49500    3060        3       1       1 yes      no      no       no    no           0 no       Classic
  4 60500    6650        3       1       2 yes      yes     no       no    no           0 no       Eclectic
  5 61000    6360        2       1       1 yes      no      no       no    no           0 no       Eclectic
  6 66000    4160        3       1       1 yes      yes     yes      no    yes          0 no       Eclectic
  7 66000    3880        3       2       2 yes      no      yes      no    no           2 no       Eclectic
  8 69000    4160        3       1       3 yes      no      no       no    no           0 no       Eclectic
  9 83800    4800        3       1       1 yes      yes     yes      no    no           0 no       Eclectic
 10 88500    5500        3       2       4 yes      yes     no       no    yes          1 no       Eclectic
 11 90000    7200        3       2       1 yes      no      yes      no    yes          3 no       Eclectic
 12 30500    3000        2       1       1 no       no      no       no    no           0 no       Classic
 14 36000    2880        3       1       1 no       no      no       no    no           0 no       Classic
 15 37000    3600        2       1       1 yes      no      no       no    no           0 no       Classic
 ...

SQL Call

This call uses the default values for the MaxDepth, MinNodeSize, and Variance syntax elements and builds 50 trees on two worker nodes. It sets both seed values (MtrySeed and Seed) to 100 for repeatability. Because TreeType is 'classification' and there are 12 predictor variables, Mtry is 3, that is, round(sqrt(12)). By default, OutOfBag is 'false' and IDColumn is the first InputTable column (here, sn).
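
As a quick check of the arithmetic above, the following Python sketch recomputes Mtry from the predictor lists in the call and the number of trees per worker. The helper name default_mtry_classification is hypothetical, and the even split of trees across workers is an assumption; it agrees with the "Each worker is computing 25 trees" message in the output, but the source does not describe how trees are assigned in general.

```python
import math

# Predictor lists copied from the NumericInputs and CategoricalInputs
# syntax elements of the SQL call.
numeric_inputs = ["price", "lotsize", "bedrooms", "bathrms", "stories", "garagepl"]
categorical_inputs = ["driveway", "recroom", "fullbase", "gashw", "airco", "prefarea"]


def default_mtry_classification(num_predictors: int) -> int:
    """Hypothetical helper: the round(sqrt(p)) rule stated in the text."""
    return round(math.sqrt(num_predictors))


num_predictors = len(numeric_inputs) + len(categorical_inputs)
mtry = default_mtry_classification(num_predictors)

# Assumption: NumTrees is divided evenly among the worker nodes.
num_trees, num_workers = 50, 2
trees_per_worker = num_trees // num_workers

print(f"predictors={num_predictors}, sqrt={math.sqrt(num_predictors):.3f}, mtry={mtry}")
print(f"trees per worker={trees_per_worker}")
```

The square root of 12 is about 3.46, which rounds down to the Mtry value of 3 passed in the call.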

SELECT * FROM DecisionForest (
  ON housing_train AS InputTable
  OUT TABLE OutputTable (rft_model)
  OUT TABLE OutputMessageTable (rf_monitortable)
  USING
  TreeType ('classification')
  ResponseColumn ('homestyle')
  NumericInputs ('price','lotsize','bedrooms','bathrms','stories','garagepl')
  CategoricalInputs
   ('driveway','recroom','fullbase','gashw','airco','prefarea')
  MaxDepth (12)
  MinNodeSize (1)
  NumTrees (50)
  Variance (0.0)
  Mtry ('3')
  MtrySeed ('100')
  Seed ('100')
) AS dt;

Output

 message                                          
 ------------------------------------------------ 
 Computing 50 classification trees.              
 Each worker is computing 25 trees.              
 Each tree will contain approximately 246 points.
 Poisson sampling parameter: 1.00                
 Query finished in 1.997 seconds.                
 Decision forest created.
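
The Poisson sampling parameter in the messages suggests that each tree is trained on a random resample of the rows. The Python sketch below illustrates one common technique of that kind, the Poisson bootstrap, in which each row is included k times with k drawn from a Poisson distribution. It is an illustration only: the source does not describe the function's internal sampling, and the row count of 246, the function names, and the use of Knuth's method for the draws are all assumptions.

```python
import math
import random


def poisson_draw(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    # Keep multiplying uniform draws until the product drops to e^-lam or below.
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def poisson_bootstrap_counts(n_rows: int, lam: float, seed: int) -> list:
    """Return, for each row, how many times it appears in the resample."""
    rng = random.Random(seed)
    return [poisson_draw(lam, rng) for _ in range(n_rows)]


# Assumed values for illustration: 246 rows available, parameter 1.0, seed 100.
counts = poisson_bootstrap_counts(n_rows=246, lam=1.0, seed=100)
sample_size = sum(counts)
left_out = sum(1 for c in counts if c == 0)

print(f"resample size: {sample_size}")
print(f"rows never drawn: {left_out}")
```

With a parameter of 1.0, the expected resample size equals the number of rows, so the total varies around 246 rather than matching it exactly, which fits the word "approximately" in the message. Under this scheme, rows with a count of 0 (about 37% on average) are the ones an out-of-bag error estimate would use.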

The following query displays the first 50 characters of each tree in the model table, rft_model:

SELECT task_index, tree_num, CAST (tree AS VARCHAR(50))
  FROM rft_model ORDER BY 1;

 task_index tree_num tree                                               
 ---------- -------- -------------------------------------------------- 
          0       20 {"responseCounts_":{"classic":73,"bungalow":37,"ec
          0       23 {"responseCounts_":{"classic":60,"bungalow":48,"ec
          0       13 {"responseCounts_":{"classic":73,"bungalow":40,"ec
          0       11 {"responseCounts_":{"classic":69,"bungalow":32,"ec
          0        7 {"responseCounts_":{"classic":63,"bungalow":45,"ec
          0        2 {"responseCounts_":{"classic":70,"bungalow":26,"ec
          0       14 {"responseCounts_":{"classic":60,"bungalow":42,"ec
          0       17 {"responseCounts_":{"classic":71,"bungalow":34,"ec
          0       21 {"responseCounts_":{"classic":71,"bungalow":42,"ec
          0       22 {"responseCounts_":{"classic":67,"bungalow":43,"ec
          0        8 {"responseCounts_":{"classic":63,"bungalow":41,"ec
          0        5 {"responseCounts_":{"classic":71,"bungalow":42,"ec
          0        4 {"responseCounts_":{"classic":73,"bungalow":24,"ec
          0        6 {"responseCounts_":{"classic":81,"bungalow":52,"ec
          0        3 {"responseCounts_":{"classic":84,"bungalow":37,"ec
          0        9 {"responseCounts_":{"classic":70,"bungalow":38,"ec
          0       19 {"responseCounts_":{"classic":75,"bungalow":24,"ec
          0       12 {"responseCounts_":{"classic":70,"bungalow":27,"ec
          0       24 {"responseCounts_":{"classic":79,"bungalow":40,"ec
          0       16 {"responseCounts_":{"classic":57,"bungalow":31,"ec
          0        0 {"responseCounts_":{"classic":69,"bungalow":36,"ec
          0       10 {"responseCounts_":{"classic":76,"bungalow":40,"ec
          0       18 {"responseCounts_":{"classic":59,"bungalow":41,"ec
          0       15 {"responseCounts_":{"classic":71,"bungalow":36,"ec
          0        1 {"responseCounts_":{"classic":70,"bungalow":37,"ec
          3       12 {"responseCounts_":{"classic":80,"bungalow":15,"ec
          3       22 {"responseCounts_":{"classic":59,"bungalow":21,"ec
          3        3 {"responseCounts_":{"classic":64,"bungalow":24,"ec
          3        6 {"responseCounts_":{"classic":88,"bungalow":27,"ec
          3        5 {"responseCounts_":{"classic":79,"bungalow":19,"ec
          3        2 {"responseCounts_":{"classic":74,"bungalow":23,"ec
          3        0 {"responseCounts_":{"classic":68,"bungalow":23,"ec
          3       15 {"responseCounts_":{"classic":73,"bungalow":26,"ec
          3       23 {"responseCounts_":{"classic":73,"bungalow":16,"ec
          3       24 {"responseCounts_":{"classic":77,"bungalow":19,"ec
          3       21 {"responseCounts_":{"classic":83,"bungalow":25,"ec
          3       14 {"responseCounts_":{"classic":84,"bungalow":16,"ec
          3       11 {"responseCounts_":{"classic":73,"bungalow":28,"ec
          3       19 {"responseCounts_":{"classic":75,"bungalow":17,"ec
          3        9 {"responseCounts_":{"classic":76,"bungalow":29,"ec
          3        7 {"responseCounts_":{"classic":79,"bungalow":24,"ec
          3       17 {"responseCounts_":{"classic":68,"bungalow":22,"ec
          3       18 {"responseCounts_":{"classic":81,"bungalow":21,"ec
          3       20 {"responseCounts_":{"classic":68,"bungalow":19,"ec
          3       16 {"responseCounts_":{"classic":81,"bungalow":17,"ec
          3        4 {"responseCounts_":{"classic":72,"bungalow":17,"ec
          3       13 {"responseCounts_":{"classic":61,"bungalow":27,"ec
          3        1 {"responseCounts_":{"classic":79,"bungalow":26,"ec
          3       10 {"responseCounts_":{"classic":69,"bungalow":21,"ec
          3        8 {"responseCounts_":{"classic":82,"bungalow":17,"ec
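
Because the CAST truncates each tree to 50 characters, the displayed values are not complete JSON, and a JSON parser would reject them. As a minimal sketch, the following Python snippet uses a regular expression to pull the fully visible class counts out of one truncated prefix copied from the output above. The function name visible_response_counts is hypothetical, and the approach recovers only the counts that appear in full before the cut.

```python
import re

# Matches a quoted key followed by an integer that is terminated by "," or "}",
# so a number cut off by the 50-character truncation is not reported.
COUNT_PATTERN = re.compile(r'"(\w+)":(\d+)(?=[,}])')


def visible_response_counts(tree_prefix: str) -> dict:
    """Hypothetical helper: class counts fully visible in a truncated tree string."""
    return {label: int(count) for label, count in COUNT_PATTERN.findall(tree_prefix)}


# Prefix copied from the output row with task_index 0 and tree_num 20.
prefix = '{"responseCounts_":{"classic":73,"bungalow":37,"ec'
print(visible_response_counts(prefix))
```

For this row the snippet reports 73 for classic and 37 for bungalow; the third class is cut off and is left out. To inspect whole trees, select the tree column without the CAST, or with a longer VARCHAR length.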

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.