Teradata Scoring SDK Text Tokenizer (ML Engine)

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

This is the Teradata Scoring SDK version of the function TextTokenizer (ML Engine).

To run queries that include non-Latin characters, you must set SESSION CHARSET to UTF-8. For more information, see Basic Teradata® Query Reference, B035-2414.

Model Format

Syntax Element     Description
ModelType          ttoken, text tokenizer
ModelTable         Database table
ModelTag           DICT
InstalledFile      Dictionary or CRF model file
InstalledFileTag   DICT, CRF

Request Definition

Same as TextTokenizer Input.

Parameters

Parameter                   Supported   Comments
TextColumn                  Yes
InputLanguage or Language   Yes
OutputDelimiter             Yes
OutputByWord                Yes
Accumulate                  Yes
ON input_table              No          Provided using Request.
ON dict                     No          Populated in AML file.
ModelFile                   No          Populated in AML file.
UserDictionaryFile          No          Populated in AML file.
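
The parameter table above can be read as a simple lookup. The following Python sketch is illustrative only: `check_parameters` and the two lookup structures are hypothetical names, not part of the Teradata Scoring SDK, and they encode nothing beyond the rows of the table. It assumes a caller wants to sort a list of TextTokenizer argument names into those the SDK version accepts, those it takes from the request or the AML file instead, and those the table does not list.

```python
# Hypothetical helper; NOT part of the Teradata Scoring SDK.
# Both structures mirror the Parameters table and nothing more.

# Parameters the Scoring SDK version of TextTokenizer supports.
SUPPORTED = {
    "TextColumn",
    "InputLanguage",
    "Language",
    "OutputDelimiter",
    "OutputByWord",
    "Accumulate",
}

# Clauses and parameters that are not passed directly, with the
# comment the table gives for each.
NOT_SUPPORTED = {
    "ON input_table": "Provided using Request.",
    "ON dict": "Populated in AML file.",
    "ModelFile": "Populated in AML file.",
    "UserDictionaryFile": "Populated in AML file.",
}


def check_parameters(names):
    """Split names into (accepted, rejected, unknown).

    accepted: names listed as supported, in input order.
    rejected: dict of unsupported names mapped to the table comment.
    unknown:  names that do not appear in the table at all.
    """
    accepted = [n for n in names if n in SUPPORTED]
    rejected = {n: NOT_SUPPORTED[n] for n in names if n in NOT_SUPPORTED}
    unknown = [
        n for n in names if n not in SUPPORTED and n not in NOT_SUPPORTED
    ]
    return accepted, rejected, unknown


if __name__ == "__main__":
    ok, moved, other = check_parameters(
        ["TextColumn", "OutputByWord", "ModelFile"]
    )
    print("accepted:", ok)
    print("supplied elsewhere:", moved)
    print("not in table:", other)
```

A check like this is only a convenience for catching arguments carried over from an ML Engine query; the table itself remains the reference for what the SDK version accepts.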