DecisionForestEvaluator Example 2: Variable Importance - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22

This example calculates the overall importance of each variable by summing its importance across the trees in the model and dividing by 50, the number of trees.

Input

SQL Call

SELECT variable_col, SUM(importance)/50
FROM DecisionForestEvaluator (
  ON rft_model
) AS dt
GROUP BY variable_col
ORDER BY 2 DESC;

Output

The output lists the variables in descending order of average importance. The top three variables for modeling and prediction are price, lotsize, and bedrooms.

variable_col  sum(importance) / 50
------------  --------------------
price         0.530036819315194
lotsize       0.40869314472933
bedrooms      0.216136248043658
stories       0.176956469036925
bathrms       0.171395287455378
garagepl      0.16108831869553
fullbase      0.0853787807623518
airco         0.0720778853448971
recroom       0.0607107804514478
driveway      0.0336033805550212
gashw         0.0161230714649009
prefarea      0.00464901131486607
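
If only the highest-ranked variables are needed, the call in this example could be adapted along the following lines. This is a sketch rather than a documented form of the function: the column alias avg_importance is a hypothetical name introduced here for readability, and the query assumes that the Teradata SQL TOP n option is accepted in the environment where the function is invoked. The divisor 50 is the number of trees used in this example and would need to change for a model with a different number of trees.

```sql
-- Sketch: return only the three most important variables.
-- avg_importance is a hypothetical alias, not part of the function's output.
SELECT TOP 3
       variable_col,
       SUM(importance) / 50 AS avg_importance  -- 50 = number of trees in this example
FROM DecisionForestEvaluator (
  ON rft_model
) AS dt
GROUP BY variable_col
ORDER BY avg_importance DESC;
```

Under those assumptions, the query should return the price, lotsize, and bedrooms rows from the output above. Naming the computed column also lets the ORDER BY clause refer to it by name instead of by position.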