1.1 - 8.10 - TextClassifierTrainer Example - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

Input

  • InputTable: texttrainer_input
  • Stop words file: stopwords.txt, which is preinstalled on ML Engine

texttrainer_input

 id | content | category
 -- | ------- | --------
 1 | Tennis star Roger Federer was born on August 8, 1981, in Basel, Switzerland, to Swiss father Robert Federer and South African mother Lynette Du Rand | sports
 2 | Federer took an interest in sports at an early age, playing tennis and soccer at the age of 8. | sports
 3 | At age 14, Federer became the national junior champion in Switzerland | sports
 4 | Federer won the Wimbledon boys singles and doubles titles in 1998, and turned professional later that year. | sports
 5 | In 2003, following a successful season on grass, Federer became the first Swiss man to win a Grand Slam title when he emerged victorious at Wimbledon. | sports
 6 | A natural disaster is a major adverse event resulting from natural processes of the Earth. Examples include floods, volcanic eruptions, earthquakes, tsunamis, and other geologic processes. | natural disaster
 7 | In a vulnerable area, however, such as San Francisco in 1906, an earthquake can have disastrous consequences and leave lasting damage, requiring years to repair. | natural disaster
 8 | An earthquake is the result of a sudden release of energy in the Earth crust that creates seismic waves. | natural disaster
 9 | Volcanoes can cause widespread destruction and consequent disaster in several ways. | natural disaster
 10 | A flood is an overflow of water that submerges land | natural disaster
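
If texttrainer_input does not already exist on your system, it can be created by hand. The following is a minimal sketch rather than an official setup script. The column types and lengths (INTEGER, VARCHAR(500), VARCHAR(30)) are assumptions chosen to fit the sample rows, because the source lists only the column names. Only one row per category is shown; the remaining rows follow the same pattern.

```sql
-- Assumed schema: the source gives only column names, so the types
-- and lengths below are illustrative choices that fit the sample rows.
CREATE TABLE texttrainer_input (
  id       INTEGER,
  content  VARCHAR(500),
  category VARCHAR(30)
);

-- One row per category, copied from the sample data above.
INSERT INTO texttrainer_input VALUES (3, 'At age 14, Federer became the national junior champion in Switzerland', 'sports');
INSERT INTO texttrainer_input VALUES (10, 'A flood is an overflow of water that submerges land', 'natural disaster');

-- Sanity check before training: number of rows in each category.
SELECT category, COUNT(*) AS row_count
FROM texttrainer_input
GROUP BY category
ORDER BY category;
```

With all ten sample rows loaded, the check query should report five rows for each of the two categories, sports and natural disaster.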
stopwords.txt
a
an
in
is
to
into
was
the
and
this
with
they
but
will

SQL Call

SELECT * FROM TextClassifierTrainer (
  ON texttrainer_input AS InputTable
  USING
  TextColumn ('content')
  CategoryColumn ('category')
  OutputModelFile ('knn.bin')
  ModelType ('knn')
  KNNModelParameters ('compress:0.9')
  NLPParameters ('useStem:true', 'stopwordsFile: stopwords.txt')
  FeatureSelectionLimits ('DF:[0.1:0.99]')
) AS dt;

Output

 train_result                 
 ---------------------------- 
 Model generated.            
 Training time(s): 0.066     
 File name: knn.bin          
 File size(KB): 1            
 Model successfully installed

The model file, knn.bin, is in binary format.
