Input - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17
The TextTokenizer function accepts these input tables:
  • Input table
  • Dictionary table (optional)
TextTokenizer Input Table Schema

Column Name        Data Type  Description
text_column        VARCHAR    Text to tokenize.
accumulate_column  Any        Column to copy to the output table.

TextTokenizer Dictionary Table Schema

Column Name  Data Type  Description
entry        VARCHAR    Dictionary entry.

The following table describes the format of both the dictionary table (dict) and the user dictionary file (specified by the UserDictionaryFile argument).

TextTokenizer Dictionary Table and User Dictionary File Format

Language             Format
Chinese and English  One dictionary word on each line.
Japanese             A dictionary entry consists of the following comma-separated fields:

  • word: The original word.
  • tokenized_word: The tokenized form of the word.
  • reading: The reading of word in Katakana.
  • pos: The part of speech of the word.

For example:

成田空港,成田空港,ナリタクウコウ,カスタム名詞
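
The four-field Japanese entry layout described above can be checked before a dictionary is loaded. The following is a minimal Python sketch, not part of Aster Analytics: the names `JapaneseEntry`, `parse_japanese_entry`, and `format_japanese_entry` are hypothetical, and it assumes that no field contains a comma and that surrounding whitespace can be trimmed.

```python
from typing import NamedTuple


class JapaneseEntry(NamedTuple):
    """One Japanese dictionary entry (hypothetical helper type)."""
    word: str            # the original word
    tokenized_word: str  # the tokenized form of the word
    reading: str         # the reading of the word in Katakana
    pos: str             # the part of speech of the word


def parse_japanese_entry(line: str) -> JapaneseEntry:
    """Split one entry line into its four fields.

    Assumption: fields never contain commas, so a plain split is enough.
    """
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != 4 or not all(fields):
        raise ValueError(
            f"expected 4 non-empty comma-separated fields, got: {line!r}"
        )
    return JapaneseEntry(*fields)


def format_japanese_entry(entry: JapaneseEntry) -> str:
    """Join the four fields back into one comma-separated entry line."""
    return ",".join(entry)


# The example entry from the table above.
entry = parse_japanese_entry("成田空港,成田空港,ナリタクウコウ,カスタム名詞")
print(entry.reading)
```

A malformed line, such as one with only three fields, raises `ValueError` instead of producing a partial entry, so problems surface before the entries reach the dictionary table or user dictionary file.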