This is the Teradata Scoring SDK version of the function TextTokenizer (ML Engine).
To run queries that include non-Latin characters, you must set SESSION CHARSET to UTF-8. For more information, see Basic Teradata® Query Reference, B035-2414.
Model Format
Syntax Element | Description |
---|---|
ModelType | ttoken, text tokenizer |
ModelTable | Database table |
ModelTag | DICT |
InstalledFile | Dictionary or CRF model file |
InstalledFileTag | DICT, CRF |
Request Definition
The request definition is the same as TextTokenizer Input.
Parameters
Parameter | Supported | Comments |
---|---|---|
TextColumn | Yes | |
InputLanguage or Language | Yes | |
OutputDelimiter | Yes | |
OutputByWord | Yes | |
Accumulate | Yes | |
ON input_table | No | Provided through the Request. |
ON dict | No | Populated in the AML file. |
ModelFile | No | Populated in the AML file. |
UserDictionaryFile | No | Populated in the AML file. |
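
The table above can be read as a simple rule: some ML Engine arguments are still passed as parameters, while the rest are supplied by the Request or the AML file. The following is a minimal sketch of a pre-flight check built on that rule. It is not part of the Teradata Scoring SDK; the function name `check_parameters`, the dictionary layout, and the sample values (`review_text`, `en`, `crf_model.bin`) are assumptions made for illustration only.

```python
# Hypothetical pre-flight check; NOT part of the Teradata Scoring SDK.
# It only encodes the Supported / Comments columns of the parameter table.

# Parameters the table marks as supported (compared case-insensitively).
SUPPORTED = {
    "textcolumn", "inputlanguage", "language",
    "outputdelimiter", "outputbyword", "accumulate",
}

# Unsupported ML Engine arguments, mapped to where the table says
# their values come from instead.
SUPPLIED_ELSEWHERE = {
    "on input_table": "provided through the Request",
    "on dict": "populated in the AML file",
    "modelfile": "populated in the AML file",
    "userdictionaryfile": "populated in the AML file",
}


def check_parameters(params):
    """Split params into (accepted, rejected).

    accepted maps each supported name to its value;
    rejected maps each other name to a short reason.
    """
    accepted, rejected = {}, {}
    for name, value in params.items():
        key = name.strip().lower()
        if key in SUPPORTED:
            accepted[name] = value
        elif key in SUPPLIED_ELSEWHERE:
            rejected[name] = SUPPLIED_ELSEWHERE[key]
        else:
            rejected[name] = "not listed in the parameter table"
    return accepted, rejected


if __name__ == "__main__":
    # Sample values are illustrative assumptions, not taken from the source.
    accepted, rejected = check_parameters({
        "TextColumn": "review_text",
        "Language": "en",
        "ModelFile": "crf_model.bin",
    })
    print("accepted:", accepted)
    print("rejected:", rejected)
```

A check like this could run before a scoring request is built, so that arguments carried over from an ML Engine query are flagged rather than silently ignored.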