Description
The TextChunker function divides text into phrases and assigns each phrase a tag that identifies its type.
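To make the idea of chunking concrete, here is a minimal base R sketch that groups consecutive part-of-speech-tagged words into phrases and labels each phrase. It is an illustration only and not the algorithm TextChunker uses: the sample data, the `phrase_type()` helper, the tag-to-phrase mapping, and the `chunk` and `chunk_tag` column names are all assumptions made for this example.

```r
# Toy POS-tagged input (hypothetical data, Penn Treebank-style tags).
tagged <- data.frame(
  word    = c("The", "quick", "fox", "jumped", "over", "the", "fence"),
  pos_tag = c("DT", "JJ", "NN", "VBD", "IN", "DT", "NN"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a POS tag to a coarse phrase type.
# This mapping is a deliberate simplification for illustration.
phrase_type <- function(tag) {
  if (tag %in% c("DT", "JJ", "NN", "NNS", "NNP", "PRP")) {
    "NP"
  } else if (startsWith(tag, "VB")) {
    "VP"
  } else if (tag == "IN") {
    "PP"
  } else {
    "O"
  }
}

types <- vapply(tagged$pos_tag, phrase_type, character(1), USE.NAMES = FALSE)

# Consecutive words that share a phrase type form one chunk.
runs <- rle(types)
chunk_id <- rep(seq_along(runs$lengths), runs$lengths)

chunks <- data.frame(
  chunk     = as.vector(tapply(tagged$word, chunk_id, paste, collapse = " ")),
  chunk_tag = runs$values,
  stringsAsFactors = FALSE
)
print(chunks)
```

The sketch needs both a word column and a POS-tag column in word order, which matches the role of the word.column, pos.column, and data.order.column arguments described below.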
Usage
td_text_chunker_mle (
    data = NULL,
    word.column = NULL,
    pos.column = NULL,
    data.sequence.column = NULL,
    data.partition.column = NULL,
    data.order.column = NULL
)
Arguments
data
    Required Argument.
data.partition.column
    Required Argument.
data.order.column
    Required Argument.
word.column
    Required Argument.
pos.column
    Required Argument.
data.sequence.column
    Optional Argument.
Value
The function returns an object of class "td_text_chunker_mle", which is a
named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result.
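The following base R sketch shows how a named list member is referenced with the "$" operator. Because a real return value requires a Teradata connection, `mock_out` is a hypothetical stand-in built from a local data frame; its `chunk` and `chunk_tag` columns are assumptions for illustration and are not taken from the function's documented output.

```r
# Hypothetical stand-in for the object returned by td_text_chunker_mle().
# A real object holds a "tbl_teradata" in `result`; a local data frame
# with assumed column names is used here instead.
mock_out <- structure(
  list(
    result = data.frame(
      chunk     = c("The quick fox", "jumped"),
      chunk_tag = c("NP", "VP"),
      stringsAsFactors = FALSE
    )
  ),
  class = "td_text_chunker_mle"
)

# List the member names, then pull out the output table with "$".
member_names <- names(mock_out)
chunked <- mock_out$result
print(member_names)
print(chunked)
```

With a live connection, the same pattern would apply to the objects created in the Examples section, for example text_chunker_out1$result.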
Examples
# Get the current context/connection
con <- td_get_context()$connection
# Load example data.
loadExampleData("text_chunker_example", "posttagger_output")
# Create object(s) of class "tbl_teradata".
posttagger_output <- tbl(con, "posttagger_output")
# Example 1 - This example uses the persisted output of the td_pos_tagger_mle()
# function as input.
text_chunker_out1 <- td_text_chunker_mle(data = posttagger_output,
                                         data.partition.column = 'paraid',
                                         data.order.column = c('paraid', 'word_sn'),
                                         word.column = 'word',
                                         pos.column = 'pos_tag',
                                         data.sequence.column = 'paraid')
# Load example data.
loadExampleData("pos_tagger_example", "paragraphs_input")
# Create remote tibble objects.
paragraphs_input <- tbl(con, "paragraphs_input")
# Example 2 - This example uses the output of the td_pos_tagger_mle() function as input.
# That output is in turn generated from the output of the td_sentence_extractor_mle()
# function.
td_sentence_extractor_out <- td_sentence_extractor_mle(data = paragraphs_input,
                                                       text.column = "paratext",
                                                       accumulate = "paraid")
sentenceextractor_out <- td_sentence_extractor_out$result
pos_tagger_out <- td_pos_tagger_mle(data = sentenceextractor_out,
                                    text.column = 'sentence',
                                    accumulate = 'sentence_sn')
text_chunker_out2 <- td_text_chunker_mle(data = pos_tagger_out$result,
                                         data.partition.column = 'word_sn',
                                         data.order.column = 'word_sn',
                                         word.column = 'word',
                                         pos.column = 'pos_tag',
                                         data.sequence.column = 'word_sn')