HMMDecoder Example 3: Part-of-Speech Tagging

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22

In this example, the hidden states of the HMM are the parts of speech A (adjective) and N (noun), and the observations are the words of each phrase.

Input

Set of phrases whose parts of speech are unknown, phrases
Trained table of initial states, initial
Trained table of state transitions, state_transition
Trained table of emissions, emission

phrases
model	phrase_id	word
1	1	clown
1	1	crazy
1	1	killer
1	1	problem
1	2	nice
1	2	weather

initial
model	tag	probability
1	A	0.25
1	N	0.75

state_transition
model	from_tag	to_tag	probability
1	A	A	0
1	A	N	1
1	N	A	0.5
1	N	N	0.5

emission
model	tag	word	probability
1	A	clown	0
1	N	clown	0.4
1	A	crazy	1
1	N	crazy	0
1	A	killer	0
1	N	killer	0.3
1	A	problem	0
1	N	problem	0.3
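
Each of the three model tables holds probability distributions: the initial probabilities for a model, the transition probabilities out of each from_tag, and the emission probabilities for each tag should each sum to 1. The following is a minimal sketch in plain Python (not part of Teradata Vantage) that checks this for the rows shown above. The helper name group_sums and the in-memory row lists are hypothetical and exist only for illustration.

```python
import math
from collections import defaultdict

# Rows copied from the initial, state_transition, and emission tables above.
initial_rows = [(1, "A", 0.25), (1, "N", 0.75)]
transition_rows = [
    (1, "A", "A", 0.0), (1, "A", "N", 1.0),
    (1, "N", "A", 0.5), (1, "N", "N", 0.5),
]
emission_rows = [
    (1, "A", "clown", 0.0), (1, "N", "clown", 0.4),
    (1, "A", "crazy", 1.0), (1, "N", "crazy", 0.0),
    (1, "A", "killer", 0.0), (1, "N", "killer", 0.3),
    (1, "A", "problem", 0.0), (1, "N", "problem", 0.3),
]


def group_sums(rows, key_len):
    """Sum the probability (last column) per key made of the first key_len columns."""
    sums = defaultdict(float)
    for row in rows:
        sums[row[:key_len]] += row[-1]
    return dict(sums)


initial_sums = group_sums(initial_rows, 1)        # keyed by (model,)
transition_sums = group_sums(transition_rows, 2)  # keyed by (model, from_tag)
emission_sums = group_sums(emission_rows, 2)      # keyed by (model, tag)

for name, sums in [
    ("initial", initial_sums),
    ("state_transition", transition_sums),
    ("emission", emission_sums),
]:
    for key, total in sums.items():
        # Compare with a tolerance because the sums are floating-point values.
        assert math.isclose(total, 1.0), (name, key, total)
```

A check of this kind is useful before calling the decoder, because a row that is missing or mistyped in a trained table would otherwise only show up as an unexpected tag in the output.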

SQL Call

SELECT * FROM HMMDecoder (
  ON initial AS InitStateProb PARTITION BY model ORDER BY model, tag
  ON state_transition AS TransProb PARTITION BY model
  ON emission AS EmissionProb PARTITION BY model
  ON phrases AS observation PARTITION BY model
    ORDER BY model, phrase_id ASC
  USING
  InitStateModelColumn ('model')
  InitStateColumn ('tag')
  InitStateProbColumn ('probability')
  TransAttributeColumn ('model')
  TransFromStateColumn ('from_tag')
  TransToStateColumn ('to_tag')
  TransProbColumn ('probability')
  EmitModelColumn ('model')
  EmitStateColumn ('tag')
  EmitObsColumn ('word')
  EmitProbColumn ('probability')
  ModelColumn ('model')
  SeqColumn ('phrase_id')
  ObsColumn ('word')
) AS dt ORDER BY 1, 2, 3;

Output

model	phrase_id	word	tag
1	1	clown	N
1	1	crazy	A
1	1	killer	N
1	1	problem	N
1	2	nice	A
1	2	weather	A
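
The tags for phrase 1 can be reproduced by hand from the three model tables. The following is a minimal sketch in plain Python, assuming the decoder returns the single most probable state sequence (standard Viterbi decoding) and that the words of a phrase are observed in the order listed in the input table. It does not describe how HMMDecoder is implemented. The function name viterbi and the dictionary layout are hypothetical. Only phrase 1 is decoded, because the emission table shown lists no probabilities for the words of phrase 2.

```python
# Model tables copied from the example (model 1 only).
initial = {"A": 0.25, "N": 0.75}
transition = {
    ("A", "A"): 0.0, ("A", "N"): 1.0,
    ("N", "A"): 0.5, ("N", "N"): 0.5,
}
emission = {
    ("A", "clown"): 0.0, ("N", "clown"): 0.4,
    ("A", "crazy"): 1.0, ("N", "crazy"): 0.0,
    ("A", "killer"): 0.0, ("N", "killer"): 0.3,
    ("A", "problem"): 0.0, ("N", "problem"): 0.3,
}


def viterbi(words, states=("A", "N")):
    """Return the most probable tag sequence for a non-empty list of words."""
    # best[s]: probability of the best path so far that ends in state s.
    best = {s: initial[s] * emission[(s, words[0])] for s in states}
    backpointers = []
    for word in words[1:]:
        step, pointers = {}, {}
        for s in states:
            # Pick the predecessor state that maximizes the path probability.
            prev = max(states, key=lambda p: best[p] * transition[(p, s)])
            pointers[s] = prev
            step[s] = best[prev] * transition[(prev, s)] * emission[(s, word)]
        best = step
        backpointers.append(pointers)
    # Start from the best final state and follow the backpointers to the first word.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


phrase_1 = ["clown", "crazy", "killer", "problem"]
for word, tag in zip(phrase_1, viterbi(phrase_1)):
    print(word, tag)
```

For phrase 1 the sketch yields N, A, N, N, matching the first four output rows. Every other tag sequence has probability 0 under these tables, since "crazy" can only be emitted by A and the other three words only by N.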