
Teradata® Package for Python Function Reference

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2021-11-19
Product Category: Teradata Vantage

DecisionTreeEvaluator

Functions
DecisionTreeEvaluator(data, model, index_columns=None, response_column=None, accumulate=None)
    DESCRIPTION:
        The function creates a confusion matrix as an XML output string,
        displaying counts of predicted versus actual values of the dependent
        variable of the decision tree model. The matrix also contains counts of
        correct and incorrect predictions. The function also generates two
        profile DataFrames containing details about the decisions made during
        the prediction.

    PARAMETERS:
        data:
            Required Argument.
            Specifies the input data containing the columns to analyse,
            representing the dependent and independent variables in the
            analysis.
            Types: teradataml DataFrame

        model:
            Required Argument.
            Specifies the teradataml DataFrame generated by the VALIB
            DecisionTree() function, containing the decision tree model in
            PMML format that is used to predict the data.
            Types: teradataml DataFrame

        index_columns:
            Optional Argument.
            Specifies one or more different columns for the primary index of
            the result output DataFrame. By default, the primary index columns
            of the result output DataFrame are the primary index columns of
            the input DataFrame "data". In addition, the columns specified in
            this argument need to form a unique key for the result output
            DataFrame. Otherwise, there is more than one score for a given
            observation.
            Types: str OR list of Strings (str)

        response_column:
            Optional Argument.
            Specifies the name of the predicted value column. If this argument
            is not specified, the name of the dependent column in the "data"
            DataFrame is used.
            Types: str

        accumulate:
            Optional Argument.
            Specifies one or more columns from the "data" DataFrame that can
            be passed to the result output DataFrame.
            Types: str OR list of Strings (str)

    RETURNS:
        An instance of DecisionTreeEvaluator.
        Output teradataml DataFrames can be accessed using attribute
        references, such as DecisionTreeEvalObj.<attribute_name>.
        Output teradataml DataFrame attribute names are:
            1. result
            2. profile_result_1
            3. profile_result_2

    RAISES:
        TeradataMlException, TypeError, ValueError

    EXAMPLES:
        # Notes:
        #   1. To execute Vantage Analytic Library functions,
        #       a. import "valib" object from teradataml.
        #       b. set 'configure.val_install_location' to the database name where Vantage
        #          analytic library functions are installed.
        #   2. Datasets used in these examples can be loaded using Vantage Analytic Library
        #      installer.

        # Import valib object from teradataml to execute this function.
        from teradataml import valib

        # Set the 'configure.val_install_location' variable.
        from teradataml import configure
        configure.val_install_location = "SYSLIB"

        # Create the required teradataml DataFrame.
        from teradataml import DataFrame
        df = DataFrame("customer_analysis")
        print(df)

        # Run DecisionTree() on columns "age", "income" and "nbr_children", with dependent
        # variable "gender".
        dt_obj = valib.DecisionTree(data=df,
                                    columns=["age", "income", "nbr_children"],
                                    response_column="gender",
                                    algorithm="gainratio",
                                    binning=False,
                                    max_depth=5,
                                    num_splits=2,
                                    pruning="gainratio")

        # Evaluate the decision tree model generated above.
        obj = valib.DecisionTreeEvaluator(data=df,
                                          model=dt_obj.result,
                                          accumulate=["city_name", "state_code"])

        # Print the results.
        print(obj.result)
        print(obj.profile_result_1)
        print(obj.profile_result_2)
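
The description says the confusion matrix holds counts of predicted versus actual values, along with counts of correct and incorrect predictions. The following is a minimal sketch in plain Python of what those counts mean. It is not how DecisionTreeEvaluator computes or formats its XML output; the helper name confusion_counts and the label lists are hypothetical and made up for illustration.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Tally (actual, predicted) pairs plus correct/incorrect totals.

    Hypothetical helper for illustration only; not part of teradataml.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Each key is an (actual, predicted) pair; each value is how often it occurs.
    matrix = Counter(zip(actual, predicted))
    # A prediction is correct when the predicted label equals the actual one.
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    incorrect = len(actual) - correct
    return matrix, correct, incorrect


# Made-up labels standing in for a dependent column such as "gender".
actual = ["F", "M", "F", "M", "F"]
predicted = ["F", "M", "M", "M", "F"]

matrix, correct, incorrect = confusion_counts(actual, predicted)
for (a, p), n in sorted(matrix.items()):
    print(f"actual={a} predicted={p} count={n}")
print("correct:", correct, "incorrect:", incorrect)
```

The diagonal pairs (actual equal to predicted) are the correct predictions; every other pair is a misclassification. In this sketch the Counter returns 0 for any pair that never occurred.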