This example calculates the overall importance of each variable by summing its importance across all trees in the forest and dividing by the number of trees (50).
Input
- Input table: rft_model, output by any of the DecisionForest Examples
SQL Call
SELECT variable_col, SUM(importance)/50 FROM DecisionForestEvaluator ( ON rft_model ) AS dt GROUP BY variable_col ORDER BY 2 DESC;
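The aggregation performed by this query can be sketched outside the database as follows. This is only an illustration of the arithmetic, not the function's implementation. It assumes the evaluator emits one (variable, importance) row per tree in which a variable is used. The function name `average_importance` and the toy rows are hypothetical and are not taken from rft_model.

```python
from collections import defaultdict


def average_importance(rows, n_trees):
    """Sum importance per variable, divide by the tree count, sort descending.

    rows    -- iterable of (variable, importance) pairs (assumed row shape)
    n_trees -- number of trees in the forest (50 in the example query)
    """
    totals = defaultdict(float)
    for variable, importance in rows:
        totals[variable] += importance  # SUM(importance) ... GROUP BY variable_col

    averaged = {variable: total / n_trees for variable, total in totals.items()}

    # ORDER BY 2 DESC: sort on the computed average, largest first
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data for a two-tree forest (not from the example's model).
toy_rows = [
    ("price", 0.6),
    ("lotsize", 0.4),
    ("price", 0.4),
    ("bedrooms", 0.2),
]

ranking = average_importance(toy_rows, n_trees=2)
for variable, score in ranking:
    print(variable, score)
```

Dividing by the fixed tree count rather than averaging over the rows present means that a variable missing from some trees contributes zero for those trees. Under the row-shape assumption above, that makes the result differ from a plain AVG(importance) for such variables.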
Output
The output lists the variables in descending order of average importance. The top three variables for modeling and prediction are price, lotsize, and bedrooms.
variable_col | sum(importance) / 50 |
---|---|
price | 0.530036819315194 |
lotsize | 0.40869314472933 |
bedrooms | 0.216136248043658 |
stories | 0.176956469036925 |
bathrms | 0.171395287455378 |
garagepl | 0.16108831869553 |
fullbase | 0.0853787807623518 |
airco | 0.0720778853448971 |
recroom | 0.0607107804514478 |
driveway | 0.0336033805550212 |
gashw | 0.0161230714649009 |
prefarea | 0.00464901131486607 |