1.1 - 8.10 - Teradata Scoring SDK Text Tokenizer (ML Engine) - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Published: October 2019
Content Type: Programming Reference
Language: English (United States)

This is the Teradata Scoring SDK version of the function TextTokenizer (ML Engine).

To run queries that include non-Latin characters, you must set SESSION CHARSET to UTF-8. For more information, see Basic Teradata® Query Reference, B035-2414.
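As a rough illustration of why a Latin-only character set is not enough for such queries, the following Python sketch shows that non-Latin text cannot be encoded in a single-byte Latin character set but round-trips through UTF-8. It is independent of Teradata and of any client tool; the sample string and the helper name `encodable` are arbitrary assumptions made for this example.

```python
# Illustrative only: the sample text is arbitrary and not taken from the reference.
sample = "文本分词"  # non-Latin (Chinese) characters


def encodable(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# Compare a single-byte Latin character set with UTF-8.
print(encodable(sample, "latin-1"))
print(encodable(sample, "utf-8"))

# UTF-8 round-trips the text without loss.
assert sample.encode("utf-8").decode("utf-8") == sample
```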

Model Format

Syntax Element     Description
ModelType          ttoken, text tokenizer
ModelTable         Database table
ModelTag           DICT
InstalledFile      Dictionary or CRF model file
InstalledFileTag   DICT, CRF

Request Definition

The request definition is the same as TextTokenizer Input.


Parameter                   Supported   Comments
TextColumn                  Yes
InputLanguage or Language   Yes
OutputDelimiter             Yes
OutputByWord                Yes
Accumulate                  Yes
ON input_table              No          Provided using Request.
ON dict                     No          Populated in AML file.
ModelFile                   No          Populated in AML file.
UserDictionaryFile          No          Populated in AML file.
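The parameter table can be read as a simple allow-list. The following is a minimal Python sketch, not part of the Scoring SDK: the names `SUPPORTED`, `NOT_SUPPORTED`, and `check_parameters` are hypothetical, and the parameter values in the usage lines are purely illustrative. It only shows how a caller might separate the parameters marked as supported from those the table says are provided using the Request or populated in the AML file.

```python
# Hypothetical helper; names and structure are illustrative assumptions,
# not the Teradata Scoring SDK API.

# Parameters the table marks as supported ("Yes").
SUPPORTED = {
    "TextColumn",
    "InputLanguage",
    "Language",  # listed in the table as an alternative to InputLanguage
    "OutputDelimiter",
    "OutputByWord",
    "Accumulate",
}

# Parameters the table marks as not supported ("No"), with the table's comment.
NOT_SUPPORTED = {
    "ON input_table": "Provided using Request.",
    "ON dict": "Populated in AML file.",
    "ModelFile": "Populated in AML file.",
    "UserDictionaryFile": "Populated in AML file.",
}


def check_parameters(params):
    """Split parameter names into accepted, rejected (with reason), and unknown."""
    accepted, rejected, unknown = [], {}, []
    for name in params:
        if name in SUPPORTED:
            accepted.append(name)
        elif name in NOT_SUPPORTED:
            rejected[name] = NOT_SUPPORTED[name]
        else:
            unknown.append(name)
    return accepted, rejected, unknown


# Usage with illustrative (made-up) parameter values.
accepted, rejected, unknown = check_parameters(
    {"TextColumn": "txt", "OutputByWord": "true", "ModelFile": "model.bin"}
)
print(accepted)
print(rejected)
print(unknown)
```

Keeping the table's comments alongside the rejected names lets a caller report where each unsupported parameter is expected to come from instead.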