DecisionForestEvaluator Example 2: Variable Importance

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Release Date: May 2019
Content Type: Programming Reference
Publication ID: B700-4003-098K
Language: English (United States)

This example calculates the overall importance of each variable by summing its importance across the 50 trees in the model and dividing the sum by 50.

Input

SQL Call

SELECT variable_col, SUM(importance)/50
  FROM DecisionForestEvaluator (
   ON rft_model
  ) AS dt GROUP BY variable_col ORDER BY 2 DESC;
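The aggregation in the query can be sketched outside the database as follows. This is a minimal illustration, not the function's implementation: it assumes the evaluator returns one (variable, importance) row per tree in which a variable is used, and the helper name average_importance and the four-tree toy rows are hypothetical. It also shows why dividing the sum by the tree count can differ from a plain per-variable average: a variable that is missing from some trees is treated as contributing zero for those trees.

```python
from collections import defaultdict


def average_importance(rows, num_trees):
    """Sum each variable's importance over all rows, divide by num_trees,
    and return (variable, average) pairs sorted by average, descending."""
    totals = defaultdict(float)
    for variable, importance in rows:
        totals[variable] += importance
    averaged = {name: total / num_trees for name, total in totals.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-tree rows for a toy forest of 4 trees (not real output).
toy_rows = [
    ("price", 0.50), ("lotsize", 0.25),   # tree 1
    ("price", 0.50), ("bedrooms", 0.25),  # tree 2
    ("price", 0.75), ("lotsize", 0.50),   # tree 3
    ("price", 0.25), ("lotsize", 0.25),   # tree 4
]

ranking = average_importance(toy_rows, num_trees=4)
for variable, score in ranking:
    print(variable, score)
```

In the toy data, bedrooms appears in only one tree, so its sum is divided by 4 rather than by 1. That mirrors SUM(importance)/50 in the query, which divides by the total number of trees regardless of how many rows a variable has.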

Output

The output is sorted by average importance in descending order. The three most important variables for modeling and prediction are price, lotsize, and bedrooms.

variable_col  sum(importance) / 50
------------  --------------------
price         0.530036819315194
lotsize       0.40869314472933
bedrooms      0.216136248043658
stories       0.176956469036925
bathrms       0.171395287455378
garagepl      0.16108831869553
fullbase      0.0853787807623518
airco         0.0720778853448971
recroom       0.0607107804514478
driveway      0.0336033805550212
gashw         0.0161230714649009
prefarea      0.00464901131486607