Description
The POSTagger function creates part-of-speech (POS) tags for the words in the input text. POS tagging is the first step in the syntactic analysis of a language, and an important preprocessing step in many natural language processing applications.
Usage
td_pos_tagger_mle(
  data = NULL,
  text.column = NULL,
  language = "en",
  accumulate = NULL,
  data.sequence.column = NULL,
  data.order.column = NULL
)
Arguments
data | Required Argument.
data.order.column | Optional Argument.
text.column | Required Argument.
language | Optional Argument.
accumulate | Optional Argument.
data.sequence.column | Optional Argument.
Value
The function returns an object of class "td_pos_tagger_mle", which is a
named list containing a Teradata tbl object.
The named list member can be referenced directly with the "$" operator
using the name: result.
Examples
# Attach the packages used below (assumes a Teradata context has already been created).
library(tdplyr)
library(dplyr)

# Get the current context/connection.
con <- td_get_context()$connection

# Load example data.
loadExampleData("pos_tagger_example", "paragraphs_input")

# Create remote tibble objects.
paragraphs_input <- tbl(con, "paragraphs_input")

# Example 1 - Applying POSTagger using default language 'en'.
pos_tagger_out <- td_pos_tagger_mle(data = paragraphs_input,
                                    text.column = "paratext",
                                    language = "en",
                                    accumulate = "paraid")

# Example 2 - This example uses the output of SentenceExtractor as input.
td_sentence_extractor_out <- td_sentence_extractor_mle(data = paragraphs_input,
                                                       text.column = "paratext",
                                                       accumulate = c("paraid", "paratopic"))

pos_tagger_out <- td_pos_tagger_mle(data = td_sentence_extractor_out$result,
                                    text.column = "sentence",
                                    accumulate = c("sentence", "sentence_sn"))