Tree Scoring - INPUT - Analysis Parameters - Teradata Warehouse Miner

Teradata® Warehouse Miner™ User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.6
Published: November 2018
Language: English (United States)
Last Update: 2018-12-07
Document ID: B035-2302
Product Category: Software
  1. On the Tree Scoring dialog box, click INPUT.
  2. Click Analysis Parameters.
    Tree Scoring > Input > Analysis Parameters

  3. On this screen, select:
    • Scoring Method
      • Score — Option to create a score table only.
      • Evaluate — Option to perform model evaluation only. Not available for Decision Tree models built using the Regression Trees option.
      • Evaluate and Score — Option to create a score table and perform model evaluation. Not available for Decision Tree models built using the Regression Trees option.
    • Scoring Options
      • Use Dependent variable for predicted value column name — Option to give the predicted value column in the score table the same name as the dependent variable. This is the default option.
      • Predicted Value Column Name — If the option above is not selected, enter the name of the score table column that will contain the estimated value of the dependent variable.
      • Include Confidence Factor — If this option is selected, the confidence factor is added to the output table. The confidence factor measures how confident the model is, based on the training data it was built from, that it predicts the correct value for a record that falls into a particular leaf node.

        Example: If a leaf node contains 10 training observations, 9 with the outcome Buy and 1 with the outcome Do Not Buy, the node predicts Buy with a confidence factor of .9. That is, the model is 90% sure of predicting the right value for a record that falls into that leaf node.

        If the Include validation table option was selected when the decision tree model was built, additional information is provided in the scored table, the results, or both, depending on the scoring method selected. If Score is selected, a recalculated confidence factor based on the original validation table is included in the scored output table. If Evaluate is selected, a confusion matrix based on the selected table to score is added to the results. If Evaluate and Score is selected, a confusion matrix based on the selected table to score is added to the results, and a recalculated confidence factor based on the selected table to score is included in the scored output table.

      • Targeted Confidence (Binary Outcome Only) — For models built with a predicted variable that has only 2 outcomes, a targeted confidence value can be added to the output table. In the example above, the leaf node held 9 Buy outcomes and 1 Do Not Buy outcome. If the targeted value is set to Buy, the targeted confidence is .9. If the targeted value is instead set to Do Not Buy, any record falling into this leaf of the tree gets a targeted confidence of .1, or 10%.
        If the Include validation table option was selected when the decision tree model was built, additional information is provided in a manner similar to that for the Include Confidence Factor option described above.
        • Targeted Value — The value for the binary targeted confidence.

          Include Confidence Factor and Targeted Confidence are mutually exclusive; only one of the two may be selected.
      • Create Profiling Tables — If this option is selected, additional tables are created to profile the leaf nodes in the tree and to link scored rows to their corresponding leaf nodes. To do this, a node ID field is added to the scored output table and two additional tables are built to describe the leaf nodes. One table contains the confidence factor or targeted confidence (if requested) and prediction information (named by appending “_1” to the scored output table name), and the other contains the rules corresponding to each leaf node (named by appending “_2” to the scored output table name).
        The Create Profiling Tables option is ignored if the Evaluate scoring method is selected or if the output option Generate the SQL for this analysis but do not execute it is selected. It is also ignored if the analysis is being refreshed by a Refresh analysis that requests the creation of a stored procedure.
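
The arithmetic behind the confidence factor and targeted confidence examples above, and the naming rule for the profiling tables, can be sketched in a few lines of Python. This is only an illustration of the worked example (9 Buy and 1 Do Not Buy in one leaf node). It is not the SQL that Teradata Warehouse Miner generates, and the function names and the `my_scores` table name are hypothetical.

```python
from collections import Counter

def leaf_confidence(outcomes):
    """Return (predicted value, confidence factor) for one leaf node.

    Assumption: the prediction is the most frequent training outcome in the
    leaf, and the confidence factor is that outcome's share of the leaf.
    """
    counts = Counter(outcomes)
    predicted, hits = counts.most_common(1)[0]
    return predicted, hits / len(outcomes)

def targeted_confidence(outcomes, targeted_value):
    """Return the share of leaf observations equal to the targeted value."""
    return Counter(outcomes)[targeted_value] / len(outcomes)

def profiling_table_names(score_table):
    """Return the two profiling table names derived from the score table name."""
    return f"{score_table}_1", f"{score_table}_2"

# Leaf node from the example: 9 Buy observations and 1 Do Not Buy observation.
leaf = ["Buy"] * 9 + ["Do Not Buy"]

print(leaf_confidence(leaf))                    # ('Buy', 0.9)
print(targeted_confidence(leaf, "Buy"))         # 0.9
print(targeted_confidence(leaf, "Do Not Buy"))  # 0.1
print(profiling_table_names("my_scores"))       # ('my_scores_1', 'my_scores_2')
```

When the targeted value matches the leaf's predicted value, the targeted confidence equals the confidence factor. For the other outcome of a binary model it is the complement.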