Tutorial - Decision Tree Analysis - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.5
Published: February 2018
Language: English (United States)
Last Update: 2018-05-04
dita:id: B035-2302
Product Category: Software

In this example, a standard Gain Ratio tree was built to predict credit card ownership (ccacct) from 20 numeric and categorical input variables. Notice that the tree initially built contained 33 nodes but was pruned back to only 11, counting the root node. This yielded a relatively simple tree structure and a Model Accuracy of 95.72% on the training data.
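
To make the splitting criterion concrete, the sketch below computes a gain ratio as it is usually defined for C4.5-style trees: information gain divided by split information. It is a minimal illustration and not Teradata Warehouse Miner's implementation; the function names and the eight-row toy sample are hypothetical, and only the column names ckacct and ccacct come from this tutorial.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: parent entropy minus weighted entropy of the children.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy sample, NOT rows from twm_customer_analysis.
ckacct = [1, 1, 1, 1, 0, 0, 0, 0]
ccacct = [1, 1, 1, 0, 0, 0, 0, 1]
print(round(gain_ratio(ckacct, ccacct), 4))
```

Dividing by split information penalizes variables that fragment the data into many small groups, which is why a gain ratio is often preferred over raw information gain for categorical inputs such as city_name.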

  1. Parameterize a Decision Tree as follows:
    • Available Tables — twm_customer_analysis
    • Dependent Variable — ccacct
    • Independent Variables
      • income
      • age
      • years_with_bank
      • nbr_children
      • gender
      • marital_status
      • city_name
      • state_code
      • female
      • single
      • married
      • separated
      • ckacct
      • svacct
      • avg_ck_bal
      • avg_sv_bal
      • avg_ck_tran_amt
      • avg_ck_tran_cnt
      • avg_sv_tran_amt
      • avg_sv_tran_cnt
    • Tree Splitting — Gain Ratio
    • Minimum Split Count — 2
    • Maximum Nodes — 1000
    • Maximum Depth — 10
    • Bin Numeric Variables — Disabled
    • Pruning Method — Gain Ratio
    • Include Lift Table — Enabled
    • Response Value — 1
  2. Run the analysis.
  3. Click Results when it completes.

    For this example, the Decision Tree analysis generated the following pages. Click a page name to display its contents.

    Decision Tree Report
      Total observations     747
      Nodes before pruning   33
      Nodes after pruning    11
      Model Accuracy         95.72%

    Variables: Dependent
      Dependent Variable
        ccacct

    Variables: Independent
      Independent Variables
        income
        ckacct
        avg_sv_bal
        avg_sv_tran_cnt

    Confusion Matrix
                   Actual Non-Response   Actual Response   Correct         Incorrect
    Predicted 0    340 / 45.52%          0 / 0.00%         340 / 45.52%    0 / 0.00%
    Predicted 1    32 / 4.28%            375 / 50.20%      375 / 50.20%    32 / 4.28%
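
The Model Accuracy figure can be checked against the confusion matrix counts. The short Python sketch below is an illustration only (the variable names are assumptions, not part of Teradata Warehouse Miner); it divides the correctly classified observations by the total.

```python
# Counts copied from the Confusion Matrix above.
pred0_actual0 = 340  # predicted 0, actual non-response (correct)
pred0_actual1 = 0    # predicted 0, actual response (incorrect)
pred1_actual0 = 32   # predicted 1, actual non-response (incorrect)
pred1_actual1 = 375  # predicted 1, actual response (correct)

total = pred0_actual0 + pred0_actual1 + pred1_actual0 + pred1_actual1
correct = pred0_actual0 + pred1_actual1
accuracy = correct / total

print(f"{total} observations, accuracy {accuracy:.2%}")
# prints "747 observations, accuracy 95.72%"
```
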

    Cumulative Lift Table
                                            Captured            Cumulative  Cumulative    Cumulative Captured  Cumulative
    Decile  Count   Response  Response (%)  Response (%)  Lift  Response    Response (%)  Response (%)         Lift
    1       5.00    5.00      100.00        1.33          1.99  5.00        100.00        1.33                 1.99
    2       0.00    0.00      0.00          0.00          0.00  5.00        100.00        1.33                 1.99
    3       0.00    0.00      0.00          0.00          0.00  5.00        100.00        1.33                 1.99
    4       0.00    0.00      0.00          0.00          0.00  5.00        100.00        1.33                 1.99
    5       0.00    0.00      0.00          0.00          0.00  5.00        100.00        1.33                 1.99
    6       402.00  370.00    92.04         98.67         1.83  375.00      92.14         100.00               1.84
    7       0.00    0.00      0.00          0.00          0.00  375.00      92.14         100.00               1.84
    8       0.00    0.00      0.00          0.00          0.00  375.00      92.14         100.00               1.84
    9       0.00    0.00      0.00          0.00          0.00  375.00      92.14         100.00               1.84
    10      340.00  0.00      0.00          0.00          0.00  375.00      50.20         100.00               1.00
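
The lift columns can be reproduced from the Count and Response columns alone. The sketch below assumes the usual definitions: Response (%) is responders divided by count, Captured Response (%) is responders divided by all responders, and Lift is the response rate divided by the overall response rate. The helper name lift_rows and its dictionary keys are hypothetical; only the three non-empty deciles from the table are used.

```python
# (decile, count, responders) for the non-empty rows of the lift table.
rows = [(1, 5, 5), (6, 402, 370), (10, 340, 0)]

total_count = sum(count for _, count, _ in rows)  # all observations
total_resp = sum(resp for _, _, resp in rows)     # all responders
base_rate = total_resp / total_count              # overall response rate


def lift_rows(rows):
    """Per-decile and cumulative lift statistics, as percentages and ratios."""
    out = []
    cum_count = cum_resp = 0
    for decile, count, resp in rows:
        cum_count += count
        cum_resp += resp
        out.append({
            "decile": decile,
            "response_pct": 100 * resp / count,
            "captured_pct": 100 * resp / total_resp,
            "lift": (resp / count) / base_rate,
            "cum_response_pct": 100 * cum_resp / cum_count,
            "cum_captured_pct": 100 * cum_resp / total_resp,
            "cum_lift": (cum_resp / cum_count) / base_rate,
        })
    return out


for row in lift_rows(rows):
    print({k: round(v, 2) for k, v in row.items()})
```

Because the pruned tree has few leaves, the scores fall into only three distinct groups, which is why seven of the ten deciles are empty and simply carry the cumulative values forward.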