HMMDecoder Example 3: Part-of-Speech Tagging - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™

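In this example, the hidden states of the HMM are parts of speech: A (adjective) and N (noun). The observations are the words of each phrase. HMMDecoder tags each word by finding the most probable sequence of hidden states for the phrase.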

Input

  • Set of phrases whose parts of speech are unknown, phrases
  • Trained table of initial states, initial
  • Trained table of state transitions, state_transition
  • Trained table of emissions, emission

Set of Phrases

model  phrase_id  word
1      1          clown
1      1          crazy
1      1          killer
1      1          problem
1      2          nice
1      2          weather

initial

model  tag  probability
1      A    0.25
1      N    0.75

state_transition

model  from_tag  to_tag  probability
1      A         A       0
1      A         N       1
1      N         A       0.5
1      N         N       0.5

emission

model  tag  word     probability
1      A    clown    0
1      N    clown    0.4
1      A    crazy    1
1      N    crazy    0
1      A    killer   0
1      N    killer   0.3
1      A    problem  0
1      N    problem  0.3
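
Before decoding, it can help to confirm that the trained tables form valid probability distributions. The following Python sketch is not part of the Teradata function; it copies the values from the tables above into dictionaries (the variable and function names are illustrative assumptions) and checks that the initial probabilities, the outgoing transitions of each state, and the emissions of each state sum to 1. It also lists any phrase words that have no row in the emission table.

```python
import math
from collections import defaultdict

# Values copied from the initial, state_transition, and emission tables above.
initial = {"A": 0.25, "N": 0.75}
transition = {("A", "A"): 0.0, ("A", "N"): 1.0,
              ("N", "A"): 0.5, ("N", "N"): 0.5}
emission = {
    ("A", "clown"): 0.0, ("N", "clown"): 0.4,
    ("A", "crazy"): 1.0, ("N", "crazy"): 0.0,
    ("A", "killer"): 0.0, ("N", "killer"): 0.3,
    ("A", "problem"): 0.0, ("N", "problem"): 0.3,
}
# Words from the Set of Phrases table, in the order listed.
phrase_words = ["clown", "crazy", "killer", "problem", "nice", "weather"]


def sums_by_state(table):
    """Sum probabilities grouped by the first key element (the source state)."""
    sums = defaultdict(float)
    for (state, _), probability in table.items():
        sums[state] += probability
    return dict(sums)


initial_total = sum(initial.values())
transition_sums = sums_by_state(transition)
emission_sums = sums_by_state(emission)

# Phrase words that never appear in the emission table.
known_words = {word for _, word in emission}
unseen_words = [w for w in phrase_words if w not in known_words]

print("initial total:", initial_total)
print("transition sums:", transition_sums)
print("emission sums:", emission_sums)
print("words without emission rows:", unseen_words)
```

With the values shown, every distribution sums to 1, and the words nice and weather (phrase 2) have no emission rows.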

SQL Call

SELECT * FROM HMMDecoder (
  ON initial AS InitStateProb PARTITION BY model ORDER BY model, tag
  ON state_transition AS TransProb PARTITION BY model
  ON emission AS EmissionProb PARTITION BY model
  ON phrases AS observation PARTITION BY model
    ORDER BY model, phrase_id ASC
  USING
  InitStateModelColumn ('model')
  InitStateColumn ('tag')
  InitStateProbColumn ('probability')
  TransAttributeColumn ('model')
  TransFromStateColumn ('from_tag')
  TransToStateColumn ('to_tag')
  TransProbColumn ('probability')
  EmitModelColumn ('model')
  EmitStateColumn ('tag')
  EmitObsColumn ('word')
  EmitProbColumn ('probability')
  ModelColumn ('model')
  SeqColumn ('phrase_id')
  ObsColumn ('word')
) AS dt ORDER BY 1, 2, 3;

Output

model  phrase_id  word     tag
1      1          clown    N
1      1          crazy    A
1      1          killer   N
1      1          problem  N
1      2          nice     A
1      2          weather  A
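
The tags for phrase 1 can be reproduced by hand with the Viterbi algorithm, the standard method for finding the most probable hidden-state sequence of an HMM. The Python sketch below is an illustration only, not the Teradata implementation: the function name viterbi and the dictionary layout are assumptions, and the words of phrase 1 are assumed to occur in the order listed in the input table. Phrase 2 is left out because the emission table in this example has no rows for nice or weather, so the sketch has no probabilities to score it with.

```python
# Model values copied from the initial, state_transition, and emission tables.
states = ["A", "N"]
initial = {"A": 0.25, "N": 0.75}
transition = {("A", "A"): 0.0, ("A", "N"): 1.0,
              ("N", "A"): 0.5, ("N", "N"): 0.5}
emission = {
    ("A", "clown"): 0.0, ("N", "clown"): 0.4,
    ("A", "crazy"): 1.0, ("N", "crazy"): 0.0,
    ("A", "killer"): 0.0, ("N", "killer"): 0.3,
    ("A", "problem"): 0.0, ("N", "problem"): 0.3,
}


def viterbi(words, states, initial, transition, emission):
    """Return (best_path, probability) for one observed word sequence."""
    # best[s]: probability of the most likely path that ends in state s
    # after the words seen so far. Missing emission rows count as 0.
    best = {s: initial[s] * emission.get((s, words[0]), 0.0) for s in states}
    backpointers = []
    for word in words[1:]:
        step, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            prev = max(states, key=lambda r: best[r] * transition[(r, s)])
            step[s] = (best[prev] * transition[(prev, s)]
                       * emission.get((s, word), 0.0))
            pointers[s] = prev
        best = step
        backpointers.append(pointers)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


phrase_1 = ["clown", "crazy", "killer", "problem"]
path, probability = viterbi(phrase_1, states, initial, transition, emission)
print(path)         # ['N', 'A', 'N', 'N']
print(probability)  # approximately 0.00675
```

The path N, A, N, N matches the tags in the output table for phrase 1. Its probability is the product 0.75 × 0.4 × 0.5 × 1 × 1 × 0.3 × 0.5 × 0.3, which is approximately 0.00675.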