TextChunker Example 1: POSTagger Output as Input - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22
dita:id: B700-4003
lifecycle: previous
Product Category: Teradata Vantage™

Input

  • Input table: pos_tmp, created by running the POSTagger function on the table cities
cities
paraid  paratext
1       I live in Los Angeles.
2       New York is a great city.
3       Chicago is a lot of fun, but the winters are very cold and windy.
4       Philadelphia and Boston have many historical sites.

This statement creates pos_tmp:

CREATE MULTISET TABLE pos_tmp AS (
  SELECT * FROM POSTagger (
    ON cities
    USING
    Accumulate ('paraid')
    TextColumn ('paratext')
  ) AS dt1
) WITH DATA;
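
Before chunking, it can help to look at what pos_tmp contains. The query below is a minimal sketch, not part of the original example. It assumes pos_tmp exposes the columns word_sn, word, and pos_tag (the names that the TextChunker call in this example refers to) together with the accumulated paraid column:

```sql
-- Inspect the POSTagger output, one word per row, in sentence order.
-- Assumption: the column names word_sn, word, and pos_tag are taken from
-- the TextChunker call in this example; paraid comes from Accumulate.
SELECT paraid, word_sn, word, pos_tag
FROM pos_tmp
ORDER BY paraid, word_sn;
```

Each row should pair one word with its part-of-speech tag, which is the input that WordColumn and POSColumn point TextChunker at.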

SQL Call

SELECT * FROM TextChunker (
  ON pos_tmp PARTITION BY paraid ORDER BY paraid, word_sn
  USING
  WordColumn ('word')
  POSColumn ('pos_tag')
) AS dt;

Output

partition_key  chunk_sn  chunk                    chunk_tag
1              1         I                        NP
1              2         live                     VP
1              3         in                       PP
1              4         Los Angeles              NP
1              5         .                        O
3              1         Chicago                  NP
3              2         is                       VP
3              3         a lot                    NP
3              4         of                       PP
3              5         fun                      NP
3              6         ,                        O
3              7         but                      O
3              8         the winters              NP
3              9         are                      VP
3              10        very cold and windy      NP
3              11        .                        O
2              1         New York                 NP
2              2         is                       VP
2              3         a great city             NP
2              4         .                        O
4              1         Philadelphia and Boston  NP
4              2         have                     VP
4              3         many historical sites    NP
4              4         .                        O
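
The rows above are not sorted by partition_key (partition 3 is listed before partition 2). The call does not request any ordering of the result set, so the order in which partitions come back should not be relied on. The following is a sketch, not part of the original example, of the same call with an ORDER BY on the function's output, plus a variant that keeps only the noun-phrase chunks. It uses only the output columns shown above:

```sql
-- Same call as in this example, with the result sorted by
-- paragraph (partition_key) and chunk sequence number.
SELECT * FROM TextChunker (
  ON pos_tmp PARTITION BY paraid ORDER BY paraid, word_sn
  USING
  WordColumn ('word')
  POSColumn ('pos_tag')
) AS dt
ORDER BY partition_key, chunk_sn;

-- Variant: return only the chunks tagged as noun phrases (NP).
SELECT partition_key, chunk_sn, chunk FROM TextChunker (
  ON pos_tmp PARTITION BY paraid ORDER BY paraid, word_sn
  USING
  WordColumn ('word')
  POSColumn ('pos_tag')
) AS dt
WHERE chunk_tag = 'NP'
ORDER BY partition_key, chunk_sn;
```

The ORDER BY inside the ON clause controls the order in which words are fed to the function within each partition; the outer ORDER BY only affects how the returned rows are listed.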