7.00.02 - TF_IDF - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)
Last Update: 2018-04-17

The TF_IDF function can do either of the following:

  • Take any document set and output the inverse document frequency (IDF) and term frequency-inverse document frequency (TF-IDF) scores for each term.
  • Use the output of a previous run of the TF_IDF function on a training document set to predict the TF-IDF scores of an input (test) document set.
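
The two modes above can be illustrated with a minimal Python sketch. It is not the Aster implementation: the weighting it uses (term frequency as count divided by document length, IDF as the natural log of document count over document frequency) is a common textbook choice assumed here for illustration, and the helper names `term_frequencies`, `fit_idf`, and `tf_idf` are hypothetical. Dropping test terms that never appeared in the training set is likewise an assumption of this sketch.

```python
import math
from collections import Counter

def term_frequencies(tokens):
    """Return {term: count / document length} for one tokenized document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}

def fit_idf(docs):
    """Return {term: ln(N / df)} over a set of tokenized documents.

    Assumed formula: N is the number of documents and df is the number
    of documents that contain the term.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs.values() for term in set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}

def tf_idf(docs, idf):
    """Score each (document, term) pair; terms with no IDF value are skipped."""
    return {
        doc_id: {
            term: tf * idf[term]
            for term, tf in term_frequencies(tokens).items()
            if term in idf
        }
        for doc_id, tokens in docs.items()
    }

# Made-up toy documents, already tokenized.
train = {
    "d1": ["apple", "banana", "apple"],
    "d2": ["banana", "cherry"],
}
test = {"d3": ["apple", "cherry", "durian"]}

# Mode 1: derive IDF and TF-IDF scores from a document set.
idf = fit_idf(train)
train_scores = tf_idf(train, idf)

# Mode 2: reuse the training IDF values to score a test document set.
test_scores = tf_idf(test, idf)
```

In this sketch, "banana" appears in every training document, so its IDF (and therefore its TF-IDF score) is zero, while "durian" is absent from the training set and receives no score in the test output.
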
You can use the TF-IDF scores as input for many document clustering and classification algorithms, including:
  • Cosine-similarity
  • Latent Dirichlet allocation
  • K-means clustering
  • K-nearest neighbors
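
As one example of such downstream use, the following sketch computes the cosine similarity of two documents represented as sparse TF-IDF vectors. The `cosine_similarity` helper and the sample scores are hypothetical and do not come from an Aster function; returning 0.0 for an all-zero vector is a convention assumed for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as {term: score} dicts."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(score * v[term] for term, score in u.items() if term in v)
    norm_u = math.sqrt(sum(s * s for s in u.values()))
    norm_v = math.sqrt(sum(s * s for s in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention: an all-zero vector matches nothing
    return dot / (norm_u * norm_v)

# Made-up TF-IDF scores for two documents.
doc_a = {"apple": 0.46, "banana": 0.0}
doc_b = {"apple": 0.23, "cherry": 0.23}

similarity = cosine_similarity(doc_a, doc_b)
```

Because the vectors are stored as dictionaries, terms missing from a document are treated as having a score of zero, which matches the sparse nature of TF-IDF output.
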

You can use the TF-IDF scores derived from a training document set to generate a model with a classification function (for example, SparseSVMTrainer), and then pass the TF-IDF scores predicted for a test document set to the corresponding classification prediction function (for example, SparseSVMPredictor).