What does a decision tree model look like? It starts with a root node, which is associated with all of the data in the training set used to build the tree. Each node in the tree is either a decision node or a leaf node; a leaf node has no further nodes connected below it. A decision node represents a split in the data based on the values of a single input (predictor) variable. A leaf node represents a subset of the data that is assigned a particular value of the predicted variable (i.e., the resulting class). A measure of accuracy is also associated with each leaf node, typically the proportion of training records at that leaf that actually belong to the assigned class.
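The structure described above can be sketched as a small data type. This is only an illustration: the class names `Leaf` and `Decision`, the field names, and the choice of a binary split on a numeric threshold are assumptions made for the example, not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    predicted_class: str  # class assigned to every record that reaches this leaf
    accuracy: float       # assumed here: share of training records at the leaf with that class
    n_records: int        # number of training records that ended up at this leaf

@dataclass
class Decision:
    variable: str         # the single predictor variable tested at this node
    threshold: float      # assumed binary split: value <= threshold goes left
    left: "Node"
    right: "Node"

Node = Union[Leaf, Decision]

def count_leaves(node: Node) -> int:
    """Count leaf nodes by walking the tree from the given node."""
    if isinstance(node, Leaf):
        return 1
    return count_leaves(node.left) + count_leaves(node.right)

# Hypothetical tree: the root is a decision node covering all 100 training records.
root = Decision(
    "age", 30.0,
    Leaf("no", 0.90, 40),
    Decision("income", 50000.0, Leaf("no", 0.75, 20), Leaf("yes", 0.80, 40)),
)
```

Here `count_leaves(root)` walks the tree from the root down to every leaf, and the three `n_records` values add up to the size of the made-up training set.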
The first issue in building a tree is how the data should be split at each decision node. The second is when to stop splitting a node and make it a leaf. The third is what class should be assigned to each leaf node. In practice, researchers have found that it is usually best to let a tree grow as large as it needs to and then prune it back at the end to reduce its complexity and increase its interpretability.
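As a rough illustration of these three decisions, the sketch below picks a split using Gini impurity, declines to split when no threshold reduces impurity, and assigns a leaf the majority class. Gini impurity and majority voting are assumptions chosen for the example; the text does not name a specific criterion, and other measures are common. The function names and the sample data are hypothetical, and pruning is not shown.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure set, larger for more mixed sets."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best binary split on one variable.

    The threshold is None when no split lowers the impurity, which serves
    as a simple stopping rule: the node becomes a leaf.
    """
    best = (None, gini(labels))
    n = len(labels)
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

def majority_class(labels):
    """Class assigned to a leaf: the most frequent label among its records."""
    return Counter(labels).most_common(1)[0][0]

# Made-up training data for a single predictor variable.
ages = [22, 25, 31, 38, 45, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(ages, bought)
```

With this data the split at `age <= 31` separates the two classes completely, so it is the one selected.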
Once a decision tree model is built, it can be used to score, or classify, new data. If the new data includes the actual values of the predicted variable, the predictions can be compared with them to measure the effectiveness of the model. Typically, though, scoring is performed in order to create a new table containing key fields and the predicted value or class identifier.
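Scoring can be sketched in a few lines. The example below assumes a tree stored as nested dictionaries with binary threshold splits; the tree, the field names (`id`, `age`, `income`, `bought`), and the records are all invented for illustration.

```python
# Hypothetical tree as nested dicts; a leaf holds only a "class" entry.
tree = {
    "variable": "age", "threshold": 30,
    "left": {"class": "no"},
    "right": {
        "variable": "income", "threshold": 50000,
        "left": {"class": "no"},
        "right": {"class": "yes"},
    },
}

def classify(node, record):
    """Follow the splits from the root until a leaf is reached."""
    while "class" not in node:
        branch = "left" if record[node["variable"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["class"]

# New data that happens to include the actual outcome ("bought").
new_data = [
    {"id": 101, "age": 24, "income": 62000, "bought": "no"},
    {"id": 102, "age": 41, "income": 38000, "bought": "no"},
    {"id": 103, "age": 47, "income": 71000, "bought": "yes"},
    {"id": 104, "age": 35, "income": 90000, "bought": "no"},
]

# The scored table: key field plus predicted class.
scored = [{"id": r["id"], "predicted": classify(tree, r)} for r in new_data]

# Because the actual outcome is present, effectiveness can be measured too.
correct = sum(s["predicted"] == r["bought"] for s, r in zip(scored, new_data))
accuracy = correct / len(new_data)
```

The `scored` list corresponds to the new table of key fields and predicted classes, while `accuracy` is only available because these records carry the actual outcome.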