Teradata Package for Python Function Reference | 17.10 - SentimentExtractor - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Teradata® Package for Python Function Reference

Product

Teradata Package for Python

Release Number

17.10

Published

April 2022

Language

English (United States)

Last Update

2022-08-19

Product Category

Teradata Vantage

teradataml.analytics.mle.SentimentExtractor = class SentimentExtractor(builtins.object)

Methods defined here:

__init__(self, object=None, newdata=None, dict_data=None, text_column=None, language='en', level='DOCUMENT', high_priority='NONE', filter='ALL', accumulate=None, newdata_sequence_column=None, dict_data_sequence_column=None, newdata_order_column=None, dict_data_order_column=None):
    DESCRIPTION:
        The SentimentExtractor function extracts the sentiment (positive,
        negative, or neutral) of each input document or sentence, using
        either a classification model output by the function
        SentimentTrainer or a dictionary model.

    PARAMETERS:
        object:
            Optional Argument.
            Specifies the model type and file. The default model type is
            dictionary. If you omit this argument or specify dictionary
            without dictionary file, then you must specify a dictionary
            teradataml DataFrame with the name dict_data. If you specify
            both dict and dictionary file, then whenever their words
            conflict, dict has higher priority. The dictionary file must
            be a text file in which each line contains only a sentiment
            word, a space, and the opinion score of the sentiment word.
            If you specify classification:model_file, model_file must be
            the name of a model file generated and installed on the
            database by the function SentimentTrainer.
            Note: Before running the function, add the location of
                  dictionary file or model_file to the user/session
                  default search path.
            Types: str

        newdata:
            Required Argument.
            Specifies the teradataml DataFrame defining the input text.

        newdata_order_column:
            Optional Argument.
            Specifies Order By columns for newdata.
            Values to this argument can be provided as list, if multiple
            columns are used for ordering.
            Types: str OR list of Strings (str)

        dict_data:
            Optional Argument.
            Specifies the teradataml DataFrame defining the dictionary.

        dict_data_order_column:
            Optional Argument.
            Specifies Order By columns for dict_data.
            Values to this argument can be provided as list, if multiple
            columns are used for ordering.
            Types: str OR list of Strings (str)

        text_column:
            Required Argument.
            Specifies the name of the input column that contains text
            from which to extract sentiments.
            Types: str

        language:
            Optional Argument.
            Specifies the language of the input text:
                - en (English)
                - zh_CN (Simplified Chinese)
                - zh_TW (Traditional Chinese)
            Default Value: "en"
            Permitted Values: en, zh_CN, zh_TW
            Types: str

        level:
            Optional Argument.
            Specifies the level of analysis: whether to analyze each
            document or each sentence.
            Default Value: "DOCUMENT"
            Permitted Values: DOCUMENT, SENTENCE
            Types: str

        high_priority:
            Optional Argument.
            Specifies the highest priority when returning results:
                - NEGATIVE_RECALL: Give highest priority to negative
                  results, including those with lower confidence
                  sentiment classifications (maximizes the number of
                  negative results returned).
                - NEGATIVE_PRECISION: Give highest priority to negative
                  results with high-confidence sentiment classifications.
                - POSITIVE_RECALL: Give highest priority to positive
                  results, including those with lower confidence
                  sentiment classifications (maximizes the number of
                  positive results returned).
                - POSITIVE_PRECISION: Give highest priority to positive
                  results with high-confidence sentiment classifications.
                - NONE: Give all results the same priority.
            Default Value: "NONE"
            Permitted Values: NEGATIVE_RECALL, NEGATIVE_PRECISION,
                              POSITIVE_RECALL, POSITIVE_PRECISION, NONE
            Types: str

        filter:
            Optional Argument.
            Specifies the kind of results to return:
                - POSITIVE: Return only results with positive sentiments.
                - NEGATIVE: Return only results with negative sentiments.
                - ALL: Return all results.
            Default Value: "ALL"
            Permitted Values: POSITIVE, NEGATIVE, ALL
            Types: str

        accumulate:
            Optional Argument.
            Specifies the names of the input columns to copy to the
            output teradataml DataFrame.
            Types: str OR list of Strings (str)

        newdata_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "newdata". The argument is used to
            ensure deterministic results for functions which produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

        dict_data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "dict_data". The argument is used
            to ensure deterministic results for functions which produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

    RETURNS:
        Instance of SentimentExtractor.
        Output teradataml DataFrames can be accessed using attribute
        references, such as SentimentExtractorObj.<attribute_name>.
        Output teradataml DataFrame attribute name is:
            result

    RAISES:
        TeradataMlException

    EXAMPLES:
        # Load example data.
        load_example_data("sentimenttrainer", "sentiment_train")
        load_example_data("sentimentextractor", ["sentiment_extract_input", "sentiment_word"])

        # Create teradataml DataFrame objects.
        sentiment_train = DataFrame.from_table("sentiment_train")
        sentiment_extract_input = DataFrame.from_table("sentiment_extract_input")
        sentiment_word = DataFrame.from_table("sentiment_word")

        # Example 1 - This example uses the dictionary model file to analyze each document.
        SentimentExtractor_out1 = SentimentExtractor(object = "dictionary",
                                                     newdata = sentiment_extract_input,
                                                     text_column = "review",
                                                     level = "document",
                                                     accumulate = ["id","product"]
                                                     )

        # Print the results
        print(SentimentExtractor_out1)

        # Example 2 - This example uses the dictionary model file to analyze each sentence.
        SentimentExtractor_out2 = SentimentExtractor(object = "dictionary",
                                                     newdata = sentiment_extract_input,
                                                     text_column = "review",
                                                     level = "sentence",
                                                     accumulate = ["id","product"]
                                                     )

        # Print the results
        print(SentimentExtractor_out2)

        # Example 3 - This example uses a maximum entropy classification model file.
        SentimentExtractor_out3 = SentimentExtractor(object = "classification:default_sentiment_classification_model.bin",
                                                     newdata = sentiment_extract_input,
                                                     text_column = "review",
                                                     level = "document",
                                                     accumulate = ["id"]
                                                     )

        # Print the results
        print(SentimentExtractor_out3)

        # Example 4 - This example uses a model file output by the SentimentTrainer function.
        SentimentTrainer_out = SentimentTrainer(data = sentiment_train,
                                                text_column = "review",
                                                sentiment_column = "category",
                                                model_file = "sentimentmodel1.bin"
                                                )

        SentimentExtractor_out4 = SentimentExtractor(object = "classification:sentimentmodel1.bin",
                                                     newdata = sentiment_extract_input,
                                                     text_column = "review",
                                                     level = "document",
                                                     accumulate = ["id"]
                                                     )

        # Print the results
        print(SentimentExtractor_out4)

        # Example 5 - This example uses a dictionary instead of a model file.
        SentimentExtractor_out5 = SentimentExtractor(dict_data = sentiment_word,
                                                     newdata = sentiment_extract_input,
                                                     text_column = "review",
                                                     level = "document",
                                                     accumulate = ["id", "product"]
                                                     )

        # Print the results
        print(SentimentExtractor_out5)
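
The rules for the "object" argument can be easier to follow as code. The sketch below is plain Python and does not call teradataml; all three helper names are hypothetical. It assumes the `type:file` form seen in `classification:model_file` also applies to a dictionary file, that the model type is matched case-insensitively, and that opinion scores are numeric. None of these are confirmed by this reference.

```python
from typing import NamedTuple, Optional, Tuple


class ModelSpec(NamedTuple):
    model_type: str            # "dictionary" or "classification"
    file_name: Optional[str]   # None when no file is named


def parse_object_spec(spec: Optional[str]) -> ModelSpec:
    """Split an `object` value such as "classification:model.bin"."""
    if spec is None:
        # Per the docstring, the default model type is dictionary.
        return ModelSpec("dictionary", None)
    model_type, _, file_name = spec.partition(":")
    # Assumption: the model type is compared case-insensitively.
    model_type = model_type.strip().lower()
    if model_type not in ("dictionary", "classification"):
        raise ValueError(f"unknown model type: {model_type!r}")
    return ModelSpec(model_type, file_name.strip() or None)


def requires_dict_data(spec: Optional[str]) -> bool:
    """True when the docstring says a dict_data DataFrame must be given."""
    parsed = parse_object_spec(spec)
    return parsed.model_type == "dictionary" and parsed.file_name is None


def parse_dictionary_line(line: str) -> Tuple[str, float]:
    """Parse one dictionary-file line: a word, a space, then its score."""
    word, _, score = line.strip().rpartition(" ")
    if not word:
        raise ValueError(f"malformed dictionary line: {line!r}")
    # Assumption: the opinion score is numeric; float accepts ints too.
    return word, float(score)


# "excellent 2" is a made-up line used only to show the expected layout.
print(parse_object_spec("classification:sentimentmodel1.bin"))
print(requires_dict_data("dictionary"))
print(parse_dictionary_line("excellent 2"))
```

A check like `requires_dict_data` could run before calling SentimentExtractor, to catch a missing dict_data argument early instead of waiting for a TeradataMlException.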

__repr__(self): Returns the string representation for a SentimentExtractor class instance.

get_build_time(self): Returns the build time of the algorithm in seconds. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Returns the prediction type of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Returns the target column of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Returns the underlying SQL query. When the model object is created using retrieve_model(), None is returned.
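
The accessor methods listed above can be gathered in one call. The sketch below is a hypothetical helper, not part of teradataml; it assumes only an object that exposes the four methods documented on this page. The stub class and its return values are placeholders so the snippet runs without a database connection.

```python
from typing import Any, Dict, Optional


def describe_model(model: Any) -> Dict[str, Any]:
    """Collect the documented metadata accessors into one dict."""
    return {
        "build_time_seconds": model.get_build_time(),
        "prediction_type": model.get_prediction_type(),
        "target_column": model.get_target_column(),
        # Per the reference, show_query() returns None for models
        # created with retrieve_model().
        "sql": model.show_query(),
    }


class StubModel:
    """Placeholder standing in for a SentimentExtractor instance."""

    def get_build_time(self) -> int:
        return 12  # placeholder value, not a measured build time

    def get_prediction_type(self) -> str:
        return "PLACEHOLDER"  # not a real teradataml prediction type

    def get_target_column(self) -> Optional[str]:
        return None

    def show_query(self) -> Optional[str]:
        return None  # mimics a model loaded via retrieve_model()


info = describe_model(StubModel())
print(info["sql"] is None)
```

With a live connection, passing an actual SentimentExtractor instance in place of `StubModel()` should work the same way, provided the instance exposes these four methods as documented.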