Teradata Package for Python Function Reference | 17.10 - TextClassifierEvaluator

Teradata® Package for Python Function Reference

Product
Teradata Package for Python
Release Number
17.10
Published
April 2022
Language
English (United States)
Last Update
2022-08-19
lifecycle
previous
Product Category
Teradata Vantage

 
teradataml.analytics.mle.TextClassifierEvaluator = class TextClassifierEvaluator(builtins.object)
     Methods defined here:
__init__(self, object=None, obs_column=None, predict_column=None, object_sequence_column=None, object_order_column=None)
DESCRIPTION:
    The TextClassifierEvaluator function evaluates the precision, recall,
    and F-measure of the trained model output by the function
    TextClassifier.
 
    Note:
        The TextClassifierEvaluator function is deprecated in Vantage 1.3 and later versions.
        Although it is still available, Teradata recommends using the FMeasure function instead.
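
To make the three metrics concrete, the following is a minimal, standalone sketch of how per-category precision, recall, and F-measure can be computed from a list of expected categories and a list of predicted categories. It is not the Vantage implementation: the helper name `per_category_metrics`, the sample labels, and the choice of the balanced F-measure (F1) are assumptions made for illustration.

```python
from collections import Counter


def per_category_metrics(observed, predicted):
    """Return {category: (precision, recall, f_measure)}.

    Hypothetical helper for illustration only; it is not part of teradataml.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    true_pos = Counter()    # rows where the prediction matches the expected category
    pred_count = Counter()  # rows predicted as each category
    obs_count = Counter()   # rows whose expected category is each category

    for obs, pred in zip(observed, predicted):
        obs_count[obs] += 1
        pred_count[pred] += 1
        if obs == pred:
            true_pos[obs] += 1

    metrics = {}
    for category in sorted(set(obs_count) | set(pred_count)):
        tp = true_pos[category]
        precision = tp / pred_count[category] if pred_count[category] else 0.0
        recall = tp / obs_count[category] if obs_count[category] else 0.0
        total = precision + recall
        # Assumption: balanced F-measure (F1), the harmonic mean of the two.
        f_measure = 2 * precision * recall / total if total else 0.0
        metrics[category] = (precision, recall, f_measure)
    return metrics


# Made-up sample data standing in for an expected-category column and a
# predicted-category column.
observed = ["sports", "sports", "politics", "politics", "politics"]
predicted = ["sports", "politics", "politics", "politics", "sports"]

metrics = per_category_metrics(observed, predicted)
for category, (p, r, f) in metrics.items():
    print(f"{category}: precision={p:.3f} recall={r:.3f} f_measure={f:.3f}")
```

Precision is the share of rows predicted as a category that really belong to it, recall is the share of rows belonging to a category that were predicted as such, and the F-measure combines the two into a single score.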
 
PARAMETERS:
    object:
        Required Argument.
        Specifies the teradataml DataFrame containing the model data from
        TextClassifier, or an instance of TextClassifier.
 
    object_order_column:
        Optional Argument.
        Specifies the Order By columns for "object".
        Values for this argument can be provided as a list if multiple
        columns are used for ordering.
        Types: str OR list of Strings (str)
 
    obs_column:
        Required Argument.
        Specifies the name of the input teradataml DataFrame column that
        contains the expected (correct) category.
        Types: str
 
    predict_column:
        Required Argument.
        Specifies the name of the input teradataml DataFrame column that
        contains the predicted category.
        Types: str
 
    object_sequence_column:
        Optional Argument.
        Specifies the column(s) that uniquely identify each row of
        the input argument "object". The argument is used to ensure
        deterministic results for functions which produce results that vary
        from run to run.
        Types: str OR list of Strings (str)
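
As a loose illustration of why a uniquely identifying column helps, the standalone sketch below sorts the same made-up rows, supplied in two different arrival orders, first by a non-unique column alone and then with a unique `id` as a tie-breaker. Only the second ordering is the same for both inputs. This is plain Python, not teradataml code, and it does not describe how Vantage processes the argument internally; the rows and the helper name `ordered_ids` are hypothetical.

```python
# Made-up rows: (id, category). "id" is unique; "category" is not.
run_a = [(3, "sports"), (1, "sports"), (2, "politics")]
run_b = [(1, "sports"), (2, "politics"), (3, "sports")]  # same rows, other order


def ordered_ids(rows, key):
    """Return the ids of the rows after a stable sort on the given key."""
    return [row[0] for row in sorted(rows, key=key)]


# Ordering by the non-unique column alone: tied rows keep their arrival
# order, so the two runs disagree.
by_category_a = ordered_ids(run_a, key=lambda row: row[1])
by_category_b = ordered_ids(run_b, key=lambda row: row[1])

# Adding the unique id as a tie-breaker gives one well-defined order.
by_category_id_a = ordered_ids(run_a, key=lambda row: (row[1], row[0]))
by_category_id_b = ordered_ids(run_b, key=lambda row: (row[1], row[0]))

print(by_category_a, by_category_b)
print(by_category_id_a, by_category_id_b)
```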
 
RETURNS:
    Instance of TextClassifierEvaluator.
    Output teradataml DataFrames can be accessed using attribute
    references, such as TextClassifierEvaluatorObj.<attribute_name>.
    Output teradataml DataFrame attribute name is:
        result
 
 
RAISES:
    TeradataMlException
 
 
EXAMPLES:
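    # Import the objects used below. An active Vantage connection
    # (for example, one made with create_context()) is assumed.
    from teradataml import DataFrame, copy_to_sql, load_example_data
    from teradataml.analytics.mle import (TextClassifier, TextClassifierEvaluator,
                                          TextClassifierTrainer)

    # Load example data.
    load_example_data("textclassifiertrainer", "texttrainer_input")
    load_example_data("textclassifier", "textclassifier_input")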
 
    # Create teradataml DataFrame objects.
    # The input table "texttrainer_input" contains text of the training
    # documents and the category of the training documents.
    texttrainer_input = DataFrame.from_table("texttrainer_input")
 
    # The input table "textclassifier_input" contains the text to be
    # classified.
    textclassifier_input = DataFrame.from_table("textclassifier_input")
 
    # The model file "knn.bin" generated by TextClassifierTrainer function
    # is used by TextClassifier to classify the input text.
    textclassifiertrainer_out = TextClassifierTrainer(data=texttrainer_input,
                                                      text_column='content',
                                                      category_column='category',
                                                      classifier_type='knn',
                                                      model_file='knn.bin',
                                                      nlp_parameters=['useStem:true','stopwordsFile:stopwords.txt'],
                                                      classifier_parameters='compress:0.9',
                                                      feature_selection='DF:[0.1:0.99]',
                                                      data_sequence_column='id'
                                                      )
 
    textclassifier_out = TextClassifier(newdata=textclassifier_input,
                                        model_file="knn.bin",
                                        text_column="content",
                                        accumulate=["id", "category"],
                                        newdata_order_column="id"
                                        )
 
    # Example 1 - TextClassifierEvaluator uses the output of TextClassifier.
    textclassifierevaluator_out1 = TextClassifierEvaluator(
        object=textclassifier_out,
        obs_column='category',
        predict_column='out_category',
        object_sequence_column='id',
        object_order_column='id'
    )
 
    # Print the result teradataml DataFrame.
    print(textclassifierevaluator_out1.result)
 
 
    # Example 2 - Alternatively, persist and use the output table of TextClassifier.
    copy_to_sql(textclassifier_out.result, "textclassifier_output")
    textclassifier_output = DataFrame.from_table("textclassifier_output")
 
    textclassifierevaluator_out2 = TextClassifierEvaluator(
        object=textclassifier_output,
        obs_column='category',
        predict_column='out_category',
        object_sequence_column='id',
        object_order_column='id'
    )
 
    # Print the results.
    print(textclassifierevaluator_out2.result)
__repr__(self)
Returns the string representation for a TextClassifierEvaluator class instance.
get_build_time(self)
Returns the build time of the algorithm, in seconds.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
get_prediction_type(self)
Returns the prediction type of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
get_target_column(self)
Returns the target column of the algorithm.
When the model object is created using retrieve_model(), the value returned is
the one saved in the Model Catalog.
show_query(self)
Returns the underlying SQL query.
When the model object is created using retrieve_model(), None is returned.