Optional Syntax Elements for TD_WordEmbeddings - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Last edition: 2024-12-11
SecondaryColumn
Name of the input table column that contains the second text value to compare. Applies only to the token2token-similarity and doc2doc-similarity operations.
Accumulate
List of input table columns to copy to the output. Not applicable to the token-embedding operation.
Operation
Operation to be performed on the data. Options are:
  • token-embedding: Emits a vector for each token in the column.
  • doc-embedding: Vectorizes each token in the document and combines the token vectors into a single document vector.
  • token2token-similarity: Computes the similarity between two tokens and returns it as a numeric value.
  • doc2doc-similarity: Computes the similarity between two documents and returns it as a numeric value.
Default value: token-embedding
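
The four operations can be illustrated with a minimal Python sketch. This is not Teradata's implementation: the toy model, the function names, the use of an element-wise mean to combine token vectors, and the use of cosine similarity are all assumptions made for illustration, since this page does not state how vectors are combined or which similarity measure is used.

```python
import math

# Hypothetical toy model mapping token -> vector. The values are made up;
# a real model table would hold pretrained embeddings.
TOY_MODEL = {
    "good": [1.0, 0.0],
    "great": [0.8, 0.6],
    "movie": [0.0, 1.0],
}


def token_embedding(text, model=TOY_MODEL):
    """Return one (token, vector) pair per token found in the model."""
    return [(tok, model[tok]) for tok in text.split() if tok in model]


def doc_embedding(text, model=TOY_MODEL):
    """Combine token vectors into one vector (assumption: element-wise mean)."""
    vectors = [vec for _, vec in token_embedding(text, model)]
    if not vectors:
        return None  # no known tokens in the document
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity (assumed measure; not confirmed by the source)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def token2token_similarity(tok1, tok2, model=TOY_MODEL):
    """Similarity between two tokens that both exist in the model."""
    return cosine(model[tok1], model[tok2])


def doc2doc_similarity(doc1, doc2, model=TOY_MODEL):
    """Similarity between two documents via their combined vectors."""
    v1, v2 = doc_embedding(doc1, model), doc_embedding(doc2, model)
    if v1 is None or v2 is None:
        return None
    return cosine(v1, v2)
```

In this sketch, the first argument of each similarity function plays the role of the primary text column and the second plays the role of SecondaryColumn.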
RemoveStopWords
Whether to remove all stop words from the input table text before performing any operation. Applies to all operations except token2token-similarity. Default value: False
ConvertToLowerCase
Whether to convert the input table text to lowercase before performing any operation. Default value: True
StemTokens
Whether to convert each word in the input table text to its root word, for example going to go. Default value: False
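
The effect of the three preprocessing options can be sketched in Python as follows. The stop-word list, the suffix-stripping stemmer, and the order in which the steps are applied are hypothetical choices for illustration only; this page does not specify the stop-word list or stemming algorithm the function uses. The keyword defaults mirror the defaults documented above.

```python
# Hypothetical stop-word list; the list used by the function is not given here.
STOP_WORDS = {"the", "is", "a", "to"}


def naive_stem(token):
    """Toy stemmer: strip one common suffix, keeping at least two characters."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 2:
            return token[: -len(suffix)]
    return token


def preprocess(text, convert_to_lower_case=True,
               remove_stop_words=False, stem_tokens=False):
    """Apply the optional preprocessing steps to whitespace-split tokens."""
    tokens = text.split()
    if convert_to_lower_case:
        tokens = [t.lower() for t in tokens]
    if remove_stop_words:
        # Compare in lowercase so removal works even if case is preserved.
        tokens = [t for t in tokens if t.lower() not in STOP_WORDS]
    if stem_tokens:
        tokens = [naive_stem(t) for t in tokens]
    return tokens
```

For example, `preprocess("She is Going", remove_stop_words=True, stem_tokens=True)` lowercases the text, drops "is", and reduces "going" to "go".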