Methods defined here:
- __init__(self, object=None, obs_column=None, predict_column=None, object_sequence_column=None, object_order_column=None)
- DESCRIPTION:
The TextClassifierEvaluator function computes the precision, recall,
and F-measure of the predictions produced by the TextClassifier
function, comparing each predicted category with the expected category.
PARAMETERS:
object:
Required Argument.
Specifies the teradataml DataFrame that contains the output of
TextClassifier, or an instance of TextClassifier.
object_order_column:
Optional Argument.
Specifies the Order By column(s) for "object".
Values for this argument can be provided as a list if multiple columns
are used for ordering.
Types: str OR list of Strings (str)
obs_column:
Required Argument.
Specifies the name of the input teradataml DataFrame column that
contains the expected (correct) category.
Types: str
predict_column:
Required Argument.
Specifies the name of the input teradataml DataFrame column that
contains the predicted category.
Types: str
object_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of the input
argument "object". This argument is used to ensure deterministic
results for functions whose results can otherwise vary from run to
run.
Types: str OR list of Strings (str)
RETURNS:
Instance of TextClassifierEvaluator.
Output teradataml DataFrames can be accessed using attribute
references, such as TextClassifierEvaluatorObj.<attribute_name>.
Output teradataml DataFrame attribute name is:
result
RAISES:
TeradataMlException
EXAMPLES:
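# The examples assume an active Vantage connection (for example, one
# created with create_context) and that the teradataml names used
# below have already been imported.
# Load example data.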
load_example_data("textclassifiertrainer", "texttrainer_input")
load_example_data("textclassifier", "textclassifier_input")
# Create teradataml DataFrame objects.
# The input table "texttrainer_input" contains text of the training
# documents and the category of the training documents.
texttrainer_input = DataFrame.from_table("texttrainer_input")
# The input table "textclassifier_input" contains the text to be
# classified.
textclassifier_input = DataFrame.from_table("textclassifier_input")
# The model file "knn.bin" generated by the TextClassifierTrainer function
# is used by TextClassifier to classify the input text.
textclassifiertrainer_out = TextClassifierTrainer(
    data=texttrainer_input,
    text_column='content',
    category_column='category',
    classifier_type='knn',
    model_file='knn.bin',
    nlp_parameters=['useStem:true', 'stopwordsFile:stopwords.txt'],
    classifier_parameters='compress:0.9',
    feature_selection='DF:[0.1:0.99]',
    data_sequence_column='id'
)
textclassifier_out = TextClassifier(
    newdata=textclassifier_input,
    model_file="knn.bin",
    text_column="content",
    accumulate=["id", "category"],
    newdata_order_column="id"
)
# Example 1 - TextClassifierEvaluator uses the output of TextClassifier.
textclassifierevaluator_out1 = TextClassifierEvaluator(
    object=textclassifier_out,
    obs_column='category',
    predict_column='out_category',
    object_sequence_column='id',
    object_order_column='id'
)
# Print the result teradataml DataFrame
print(textclassifierevaluator_out1)
# Example 2 - Alternatively, persist and use the output table of TextClassifier.
copy_to_sql(textclassifier_out.result, "textclassifier_output")
textclassifier_output = DataFrame.from_table("textclassifier_output")
textclassifierevaluator_out2 = TextClassifierEvaluator(
    object=textclassifier_output,
    obs_column='category',
    predict_column='out_category',
    object_sequence_column='id',
    object_order_column='id'
)
# Print the results.
print(textclassifierevaluator_out2.result)
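The precision, recall, and F-measure named in the DESCRIPTION can be illustrated without a database connection. The following is a minimal sketch in plain Python, assuming the standard per-category definitions of these metrics. It is not the implementation used by TextClassifierEvaluator, and the helper name per_category_metrics and the sample labels are hypothetical.

```python
from collections import Counter


def per_category_metrics(observed, predicted):
    """Return {category: (precision, recall, f_measure)} for paired labels.

    Hypothetical helper for illustration only; it uses the standard
    definitions and does not call teradataml.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    obs_count = Counter(observed)    # rows whose expected category is c
    pred_count = Counter(predicted)  # rows whose predicted category is c
    true_pos = Counter(o for o, p in zip(observed, predicted) if o == p)

    metrics = {}
    for category in sorted(set(observed) | set(predicted)):
        tp = true_pos[category]
        # Precision: share of rows predicted as this category that are correct.
        precision = tp / pred_count[category] if pred_count[category] else 0.0
        # Recall: share of rows expected in this category that were found.
        recall = tp / obs_count[category] if obs_count[category] else 0.0
        # F-measure: harmonic mean of precision and recall.
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        metrics[category] = (precision, recall, f_measure)
    return metrics


# Made-up labels standing in for the obs_column and predict_column values.
observed = ["sports", "sports", "politics", "politics", "tech"]
predicted = ["sports", "politics", "politics", "politics", "tech"]

metrics = per_category_metrics(observed, predicted)
for category, (p, r, f) in metrics.items():
    print(f"{category}: precision={p:.2f} recall={r:.2f} f_measure={f:.2f}")
```

Here "observed" plays the role of the column named by obs_column and "predicted" the column named by predict_column. Whether TextClassifierEvaluator reports these values per category or in aggregate is not stated in this help text, so compare against the actual "result" DataFrame before relying on the layout.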
- __repr__(self)
- Returns the string representation of a TextClassifierEvaluator class instance.