Description
The NamedEntityFinder function evaluates the input, identifies tokens based
on the specified model, and outputs the tokens with detailed information.
The function does not identify sentences; it only tokenizes.
Note: Token identification is not case-sensitive.
Usage
td_namedentity_finder_mle (
  newdata = NULL,
  configure.table.data = NULL,
  text.column = NULL,
  model = NULL,
  show.entity.context = 0,
  entity.column = "entity",
  accumulate = NULL,
  newdata.sequence.column = NULL,
  configure.table.data.sequence.column = NULL,
  newdata.order.column = NULL,
  configure.table.data.order.column = NULL
)
Arguments
newdata | Required Argument.
newdata.order.column | Optional Argument.
configure.table.data | Optional Argument.
configure.table.data.order.column | Optional Argument.
text.column | Required Argument.
model | Optional if "configure.table.data" is specified; required otherwise, in which case "all" cannot be specified. If both "configure.table.data" and this argument are specified, the function uses only models from the "configure.table.data" tbl_teradata. If 'model_type' is "reg exp", specify a 'regular_expression' that describes the 'entity_type'; otherwise, specify 'model_file' (the name of the model file).
show.entity.context | Optional Argument.
entity.column | Optional Argument.
accumulate | Optional Argument.
newdata.sequence.column | Optional Argument.
configure.table.data.sequence.column | Optional Argument.
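The "model" description refers to an 'entity_type', a 'model_type', and either a 'model_file' or a 'regular_expression', and Example 2 passes them as one colon-separated string. The sketch below assumes that three-part layout and shows one way to assemble such a string. The helper make_model_spec() and its parameter names are hypothetical and are not part of tdplyr.

```r
# Hypothetical helper (not part of tdplyr): joins the three parts of a model
# item with ":" to form a string for the "model" argument.
# Assumption: the layout is entity_type:model_type:model_file, as in Example 2.
make_model_spec <- function(entity.type, model.type, model.ref) {
  parts <- c(entity.type, model.type, model.ref)
  # Reject missing or empty parts so a malformed string is never built.
  if (length(parts) != 3L || anyNA(parts) || any(!nzchar(parts))) {
    stop("entity.type, model.type and model.ref must each be one non-empty string")
  }
  paste(parts, collapse = ":")
}

model_spec <- make_model_spec("LOCATION", "max entropy", "location.sports")
print(model_spec)
```

With these inputs the helper produces the same string that Example 2 passes as "model".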
Value
The function returns an object of class "td_namedentity_finder_mle", which is
a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: output.
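As a rough illustration of the access pattern described above, the sketch below builds a stand-in object and reads its "output" member with "$". The object mock_result and its column values are placeholders invented for illustration. A real call returns a "tbl_teradata" in "output", not a data.frame.

```r
# Stand-in for a real result: a named list with an "output" member and the
# class attribute described above. A plain data.frame with placeholder
# columns is used so the snippet runs without a Teradata connection.
mock_result <- structure(
  list(output = data.frame(id = 1L, entity = "placeholder")),
  class = "td_namedentity_finder_mle"
)

# Direct access with the "$" operator using the member name.
out <- mock_result$output
print(names(mock_result))
print(class(out))
```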
Examples
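# Load the packages used below: tdplyr for the td_* functions,
# dplyr for tbl(), filter() and %>%.
library(tdplyr)
library(dplyr)
# Get the current context/connection
con <- td_get_context()$connection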
# Load example data.
loadExampleData("namedentityfinder_example", "assortedtext_input", "namefind_configure")
# Create object(s) of class "tbl_teradata".
assortedtext_input <- tbl(con, "assortedtext_input")
namefind_configure <- tbl(con, "namefind_configure")
# Example 1: Find entities using a configuration table containing model items.
td_namedentity_finder_out <- td_namedentity_finder_mle(newdata = assortedtext_input,
    configure.table.data = namefind_configure,
    text.column = "content",
    model = "all",
    accumulate = c("id", "source")
)
# Example 2: Use a custom trained model to find the entities.
# Load example data.
loadExampleData("namedentityfindertrainer_example", "nermem_sports_train")
# Create object(s) of class "tbl_teradata".
nermem_sports_train <- tbl(con, "nermem_sports_train")
# Train a named entity finder model on entity type: "LOCATION".
# The trained model is stored in a binary file: "location.sports".
td_neft_out <- td_namedentity_finder_trainer_mle(data = nermem_sports_train,
    text.column = "content",
    entity.type = "LOCATION",
    model.file = "location.sports"
)
# Select a subset of the training dataset to use as "newdata" in the
# td_namedentity_finder_mle() function.
nermem_sports_test <- nermem_sports_train %>% filter(id < 20L)
# Use the model file: location.sports as the input model.
td_namedentity_finder_out1 <- td_namedentity_finder_mle(newdata = nermem_sports_test,
    text.column = "content",
    model = "LOCATION:max entropy:location.sports"
)