Teradata Package for Python Function Reference | 17.10 - POSTagger

Teradata® Package for Python Function Reference

Product: Teradata Package for Python

Release Number: 17.10

Published: April 2022

Language: English (United States)

Last Update: 2022-08-19

Product Category: Teradata Vantage

teradataml.analytics.mle.POSTagger = class POSTagger(builtins.object)

Methods defined here:

__init__(self, data=None, text_column=None, language='en', accumulate=None, data_sequence_column=None, data_order_column=None):

    DESCRIPTION:
        The POSTagger function generates part-of-speech (POS) tags for the
        words in the input text. POS tagging is the first step in the
        syntactic analysis of a language, and an important preprocessing
        step in many natural language processing applications.

    PARAMETERS:
        data:
            Required Argument.
            Specifies the input teradataml DataFrame that contains the
            input texts to tag.

        data_order_column:
            Optional Argument.
            Specifies Order By columns for data.
            Values for this argument can be provided as a list if multiple
            columns are used for ordering.
            Types: str OR list of Strings (str)

        text_column:
            Required Argument.
            Specifies the name of the input column that contains the text
            to be tagged.
            Types: str

        language:
            Optional Argument.
            Specifies the language of the input text.
            Default Value: en
            Permitted Values: en (English), zh_CN (Simplified Chinese)
            Types: str

        accumulate:
            Optional Argument.
            Specifies the names of the input teradataml DataFrame columns
            to copy to the output teradataml DataFrame.
            Note:
                If you intend to use the POSTagger output teradataml
                DataFrame as input to the function "TextChunker", then
                this argument must specify the input teradataml DataFrame
                columns that comprise the partition key.
            Types: str OR list of Strings (str)

        data_sequence_column:
            Optional Argument.
            Specifies the column(s) that uniquely identify each row of the
            input argument "data". The argument is used to ensure
            deterministic results for functions that produce results that
            vary from run to run.
            Types: str OR list of Strings (str)

    RETURNS:
        Instance of POSTagger.
        Output teradataml DataFrames can be accessed using attribute
        references, such as POSTaggerObj.<attribute_name>.
        Output teradataml DataFrame attribute name is:
            result

    RAISES:
        TeradataMlException

    EXAMPLES:
        # Assumes a connection to Vantage has already been established
        # with create_context().
        from teradataml import DataFrame, load_example_data
        from teradataml.analytics.mle import POSTagger

        # Load the data to run the example.
        load_example_data("postagger", "paragraphs_input")

        # Create the input teradataml DataFrame.
        paragraphs_input = DataFrame.from_table("paragraphs_input")

        # Example 1 - Applying POSTagger using default language 'en'.
        pos_tagger_out = POSTagger(data=paragraphs_input,
                                   text_column='paratext',
                                   accumulate='paraid')

        # Print the result DataFrame.
        print(pos_tagger_out.result)
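The argument rules above (a str for text_column, one of two permitted values for language, a str or list of str for accumulate) can be checked on the client before a call is sent to Vantage. The following is a minimal sketch under stated assumptions: postagger_kwargs and PERMITTED_LANGUAGES are hypothetical names introduced here for illustration and are not part of teradataml. The helper only mirrors the documented rules and returns a keyword dictionary.

```python
# Hypothetical helper, NOT part of teradataml. It mirrors the argument rules
# documented for POSTagger so mistakes surface before a query is submitted.
PERMITTED_LANGUAGES = ("en", "zh_CN")  # values listed under "language"


def postagger_kwargs(text_column, language="en", accumulate=None):
    """Build keyword arguments for POSTagger after basic client-side checks."""
    if not isinstance(text_column, str) or not text_column:
        raise ValueError("text_column must be a non-empty str")
    if language not in PERMITTED_LANGUAGES:
        raise ValueError(
            f"language must be one of {PERMITTED_LANGUAGES}, got {language!r}"
        )
    kwargs = {"text_column": text_column, "language": language}
    if accumulate is not None:
        # The reference allows a str or a list of str; normalise to a list.
        cols = [accumulate] if isinstance(accumulate, str) else list(accumulate)
        if not all(isinstance(c, str) for c in cols):
            raise ValueError("accumulate must be a str or a list of str")
        kwargs["accumulate"] = cols
    return kwargs


kwargs = postagger_kwargs("paratext", accumulate="paraid")
print(kwargs)
```

With a live connection, the dictionary would be passed as POSTagger(data=paragraphs_input, **kwargs). Normalising accumulate to a list is a design choice of this sketch; the reference accepts either form.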

__repr__(self): Returns the string representation for a POSTagger class instance.

get_build_time(self): Returns the build time of the algorithm, in seconds. If the model object was created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Returns the prediction type of the algorithm. If the model object was created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Returns the target column of the algorithm. If the model object was created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Returns the underlying SQL query. If the model object was created using retrieve_model(), None is returned.
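The accessor methods listed above can be gathered into a single summary for logging or inspection. This is a minimal sketch, assuming a hypothetical helper named summarize; the class _StubTagger is a stand-in with placeholder return values so that the snippet runs without a Vantage connection, and it does not represent real POSTagger output. In practice a POSTagger instance would be passed instead.

```python
def summarize(model):
    """Collect the documented accessor values from a function object."""
    return {
        "build_time_s": model.get_build_time(),
        "prediction_type": model.get_prediction_type(),
        "target_column": model.get_target_column(),
        # Documented to be None when the object came from retrieve_model().
        "query": model.show_query(),
    }


class _StubTagger:
    """Stand-in for illustration only; return values are placeholders."""

    def get_build_time(self):
        return 1.5

    def get_prediction_type(self):
        return None

    def get_target_column(self):
        return None

    def show_query(self):
        return None


summary = summarize(_StubTagger())
for name, value in summary.items():
    print(f"{name}: {value}")
```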