| |
Methods defined here:
- __init__(self, data=None, model_id_column=None, probability_column=None, observation_column=None, positive_class=None, num_thresholds=50, data_sequence_column=None)
- DESCRIPTION:
A receiver operating characteristic (ROC) curve shows the performance
of a binary classification model as its discrimination threshold
varies. For a range of thresholds, the curve plots the true positive
rate against the false positive rate.
Note:
This function is available only when teradataml is connected to
Vantage 1.1 or later versions.
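The curve described above can be illustrated with a minimal pure-Python sketch. This is not the Vantage implementation: the helper name `roc_points`, the sample data, and the convention that a row counts as predicted positive when its probability is greater than or equal to the threshold are all assumptions made for illustration.

```python
def roc_points(probabilities, observations, positive_class, thresholds):
    """Return one (threshold, fpr, tpr) tuple per threshold.

    Hypothetical helper: a row is treated as predicted positive when
    its probability is >= the threshold (an assumed convention).
    """
    positives = sum(1 for o in observations if o == positive_class)
    negatives = len(observations) - positives
    points = []
    for t in thresholds:
        # Count predicted positives that are truly positive / truly negative.
        tp = sum(1 for p, o in zip(probabilities, observations)
                 if p >= t and o == positive_class)
        fp = sum(1 for p, o in zip(probabilities, observations)
                 if p >= t and o != positive_class)
        tpr = tp / positives if positives else 0.0
        fpr = fp / negatives if negatives else 0.0
        points.append((t, fpr, tpr))
    return points


# Made-up prediction-actual pairs, for illustration only.
probs = [0.9, 0.8, 0.35, 0.1]
obs = ['1', '0', '1', '0']
points = roc_points(probs, obs, '1', [0.0, 0.5, 1.0])
```

Lowering the threshold moves the point toward (1, 1); raising it moves the point toward (0, 0).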
PARAMETERS:
data:
Required Argument.
Specifies a teradataml DataFrame that contains the
prediction-actual pairs for a binary classifier.
model_id_column:
Optional Argument.
Specifies the input teradataml DataFrame column that
contains the model or partition identifiers for the ROC curves.
Use this argument only when the input teradataml DataFrame contains
information for more than one model. The function creates a separate
ROC curve for each model identifier in this column. Each model must
include exactly two classes in observation_column.
Types: str
probability_column:
Required Argument.
Specifies the input teradataml DataFrame column that
contains the predictions.
Types: str
observation_column:
Required Argument.
Specifies the input teradataml DataFrame column that
contains the actual classes.
Types: str
positive_class:
Required Argument.
Specifies the label of the positive class.
Types: str
num_thresholds:
Optional Argument.
Specifies the number of thresholds for the function to use. The
value must be an integer in the range [1, 10000]. The
function distributes the thresholds uniformly between 0 and 1.
Default Value: 50
Types: int
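As a rough illustration of a uniform threshold grid, the sketch below spaces num_thresholds values evenly over [0, 1]. Whether the function includes both endpoints, and what it does when num_thresholds is 1, is not stated here; both choices in the sketch are assumptions, and `uniform_thresholds` is a hypothetical helper, not part of teradataml.

```python
def uniform_thresholds(num_thresholds=50):
    """Return num_thresholds evenly spaced values in [0, 1].

    Assumptions (not confirmed by the documentation): both endpoints
    are included, and a single threshold is placed at 0.0.
    """
    if not 1 <= num_thresholds <= 10000:
        raise ValueError("num_thresholds must be in the range [1, 10000]")
    if num_thresholds == 1:
        return [0.0]
    # Dividing by (n - 1) makes the last value exactly 1.0.
    return [i / (num_thresholds - 1) for i in range(num_thresholds)]


grid = uniform_thresholds(5)
```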
data_sequence_column:
Optional Argument.
Specifies the list of column(s) that uniquely identifies each row of
the input argument "data". The argument is used to ensure
deterministic results for functions which produce results that vary
from run to run.
Types: str OR list of Strings (str)
RETURNS:
Instance of ROC.
Output teradataml DataFrames can be accessed using attribute
references, such as ROCObj.<attribute_name>.
Output teradataml DataFrame attribute names are:
1. roc_output
2. output
Note:
1. The function returns the AUC and Gini values in the output teradataml DataFrame.
2. The function returns the ROC values (thresholds, false positive rates, and
true positive rates) in the roc_output teradataml DataFrame.
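To show how the two outputs relate, the sketch below derives an AUC and a Gini value from a list of (false positive rate, true positive rate) points. It assumes trapezoidal integration, which may differ from what the function does internally; the Gini value uses the standard relation Gini = 2 * AUC - 1. The helper name `auc_and_gini` and the sample points are hypothetical.

```python
def auc_and_gini(fpr_tpr_points):
    """Return (auc, gini) for an iterable of (fpr, tpr) points.

    Assumption: the area is approximated with the trapezoidal rule,
    after adding the curve endpoints (0, 0) and (1, 1).
    """
    pts = sorted(set(fpr_tpr_points) | {(0.0, 0.0), (1.0, 1.0)})
    auc = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Area of the trapezoid between two neighbouring points.
        auc += (x1 - x0) * (y0 + y1) / 2.0
    return auc, 2.0 * auc - 1.0


# Made-up ROC points, for illustration only.
auc, gini = auc_and_gini([(0.0, 0.5), (0.5, 1.0)])
```

A classifier no better than chance lies on the diagonal, giving an AUC of 0.5 and a Gini value of 0.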
RAISES:
TeradataMlException
EXAMPLES:
# Load the data to run the example.
load_example_data("ROC", "roc_input")
# Create teradataml DataFrame.
roc_input = DataFrame.from_table("roc_input")
# Example: Run the ROC function with 100 thresholds, one curve per model_id.
# It returns the result DataFrames roc_output and output.
roc_out1 = ROC(data=roc_input,
               probability_column='probability',
               observation_column='observation',
               model_id_column='model_id',
               positive_class='1',
               num_thresholds=100
               )
# Print the result DataFrames.
print(roc_out1.roc_output)
print(roc_out1.output)
- __repr__(self)
- Returns the string representation for a ROC class instance.
- get_build_time(self)
- Returns the build time of the algorithm in seconds.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- get_prediction_type(self)
- Returns the prediction type of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- get_target_column(self)
- Returns the target column of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
- show_query(self)
- Returns the underlying SQL query.
When the model object is created using retrieve_model(), None is returned.
|