5.4.5 - Decision Trees and NULL Values - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.5
Published: February 2018
Language: English (United States)
Last Update: 2018-05-04

During tree building, NULL values are handled by listwise deletion. This means that if a row has a NULL value in any variable, independent or dependent, that row is removed from the model building process.
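As an illustration only, listwise deletion can be sketched in a few lines of Python. This is not how Teradata Warehouse Miner implements it; the function name `listwise_delete`, the column names, and the sample rows are all hypothetical, and a NULL is represented here by `None`.

```python
def listwise_delete(rows, variables):
    """Keep only the rows that have no NULL (None) in any listed variable."""
    return [
        row for row in rows
        if all(row.get(name) is not None for name in variables)
    ]


# Hypothetical sample data: two independent variables and one dependent variable.
rows = [
    {"age": 34, "income": 52000, "churn": "no"},
    {"age": None, "income": 61000, "churn": "yes"},  # NULL independent variable
    {"age": 58, "income": None, "churn": "no"},      # NULL independent variable
    {"age": 45, "income": 48000, "churn": None},     # NULL dependent variable
]

# Only the first row survives, so only it would be used to build the tree.
training_rows = listwise_delete(rows, ["age", "income", "churn"])
```

Note that a single NULL in any one of the listed variables is enough to exclude the whole row, whether the variable is independent or dependent.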

NULL values in scoring, however, are handled differently. Unlike tree building, which uses listwise deletion, scoring can handle rows that have NULL values in some of the independent variables. A row goes unscored only if it reaches a decision node whose test variable is NULL for that row. For instance, if the first split in a tree is “age < 50,” only rows with a non-NULL value for age pass further down the tree. Such a row could still have a NULL value in the income variable. Because the first decision is on age, that NULL has no impact at this split, and the row continues down the branches until it either reaches a leaf or arrives at another decision node that tests a variable in which it has a NULL value.
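The scoring behavior described above can be sketched as a simple tree walk. This is a minimal Python illustration under stated assumptions, not the product's scoring code: the nested-dictionary tree layout, the function name `score_row`, the income threshold, and the leaf labels are all hypothetical. Only the “age < 50” split comes from the text.

```python
def score_row(node, row):
    """Walk the tree; return the leaf label, or None if the row cannot be scored."""
    while "leaf" not in node:
        value = row.get(node["variable"])
        if value is None:
            # The variable tested at this decision node is NULL for this row,
            # so the row cannot continue and is left unscored.
            return None
        # Rows that satisfy "variable < threshold" go left, the rest go right.
        node = node["left"] if value < node["threshold"] else node["right"]
    return node["leaf"]


# Hypothetical tree: the first split is "age < 50"; income is tested only on the right.
tree = {
    "variable": "age", "threshold": 50,
    "left": {"leaf": "low risk"},
    "right": {
        "variable": "income", "threshold": 40000,
        "left": {"leaf": "high risk"},
        "right": {"leaf": "low risk"},
    },
}

young_null_income = score_row(tree, {"age": 30, "income": None})  # income never tested
older_null_income = score_row(tree, {"age": 60, "income": None})  # income tested, NULL
null_age = score_row(tree, {"age": None, "income": 50000})        # fails at first split
```

In this sketch a NULL income does not prevent scoring for the 30-year-old, because that row reaches a leaf without ever visiting the income node, whereas the 60-year-old is routed to the income node and goes unscored.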