Teradata Scoring SDK Text Tokenizer (ML Engine) - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.10, 1.1
Published: October 2019
Language: English (United States)
Last Update: 2019-12-31
dita:id: B700-4003
Lifecycle: previous
Product Category: Teradata Vantage™

This is the Teradata Scoring SDK version of the function TextTokenizer (ML Engine).

To run queries that include non-Latin characters, you must set SESSION CHARSET to UTF-8. For more information, see Basic Teradata® Query Reference, B035-2414.

Model Format

Syntax Element     Description
ModelType          ttoken, text tokenizer
ModelTable         Database table
ModelTag           DICT
InstalledFile      Dictionary or CRF model file
InstalledFileTag   DICT, CRF

Request Definition

Same as TextTokenizer Input.

Parameters

Parameter                   Supported   Comments
TextColumn                  Yes
InputLanguage or Language   Yes
OutputDelimiter             Yes
OutputByWord                Yes
Accumulate                  Yes
ON input_table              No          Provided using Request.
ON dict                     No          Populated in AML file.
ModelFile                   No          Populated in AML file.
UserDictionaryFile          No          Populated in AML file.
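
As a rough illustration of how the supported OutputByWord and OutputDelimiter parameters could shape the result, the following Python sketch may help. It is not the Scoring SDK API: whitespace splitting stands in for the real dictionary- or CRF-based tokenizer, and the function name format_tokens, the output column names token and sn, and the "/" default delimiter are all assumptions made for this example only.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def format_tokens(
    text: str,
    output_by_word: bool = False,
    output_delimiter: str = "/",
) -> List[Row]:
    """Hypothetical helper: mimics the two output layouts of a text tokenizer.

    Assumptions (not taken from the SDK):
    - whitespace splitting replaces the real tokenizer;
    - "token" and "sn" are illustrative column names;
    - "/" is an illustrative default delimiter.
    """
    tokens = text.split()  # stand-in for the real tokenization step

    if output_by_word:
        # One output row per token, numbered from 1 in order of appearance.
        return [{"token": tok, "sn": i} for i, tok in enumerate(tokens, start=1)]

    # One output row per input text, tokens joined by the delimiter.
    return [{"token": output_delimiter.join(tokens)}]


if __name__ == "__main__":
    print(format_tokens("text mining tools"))
    print(format_tokens("text mining tools", output_by_word=True))
```

In this sketch, leaving output_by_word off yields a single delimited row per input text, while turning it on yields one row per token; consult the TextTokenizer (ML Engine) reference for the actual defaults and output schema.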