Accumulated Columns Impact on Function Performance

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Last edition: 2024-12-11
Consider the following points when a function places the accumulated columns first in its output:
  • The data type of the first column cannot be BLOB or CLOB.
  • The first column becomes the Primary Index by default, so choose it carefully.
  • The Primary Index column (by default, the first column) affects data distribution and performance. Choose a column with many unique values as the Primary Index; a column with few unique values distributes rows unevenly. If the first column in the output is an accumulated column and it is not a good Primary Index, select the Primary Index column explicitly for better performance.
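The effect of unique values on distribution can be illustrated with a toy model. The sketch below is not Teradata's row-hash algorithm: it assumes rows are assigned to one of a fixed number of buckets by hashing the index column, and the names `NUM_AMPS`, `bucket`, and `skew_ratio` are hypothetical, introduced only for this illustration. It compares a column where every value is unique against a column with only two distinct values.

```python
import hashlib
from collections import Counter

NUM_AMPS = 8  # hypothetical number of units that rows are spread across


def bucket(value, num_amps=NUM_AMPS):
    """Map a value to a bucket with a stable hash (a stand-in, not the real row hash)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def skew_ratio(values, num_amps=NUM_AMPS):
    """Rows in the fullest bucket divided by the ideal rows per bucket.

    A ratio of 1.0 means a perfectly even spread; larger means more skew.
    """
    counts = Counter(bucket(v, num_amps) for v in values)
    return max(counts.values()) / (len(values) / num_amps)


rows = 10_000
unique_ids = range(rows)                               # every value is distinct
flags = ["Y" if i % 2 else "N" for i in range(rows)]   # only two distinct values

print(f"unique id column: {skew_ratio(unique_ids):.2f}")
print(f"two-value column: {skew_ratio(flags):.2f}")
```

In this model the unique column yields a ratio close to 1, while the two-value column can occupy at most two of the eight buckets, so its ratio is at least 4. That is the kind of imbalance an explicitly chosen Primary Index is meant to avoid.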
The following functions add the accumulated columns at the end of the output:
  • NaiveBayesPredict
  • DecisionTreePredict
  • Pack
  • Unpack
  • TD_KMeansPredict
  • TD_Silhouette
The following functions add the accumulated columns at the beginning of the output:
  • GLMPredict
  • DecisionForestPredict
  • StringSimilarity
  • TD_Ngramsplitter
  • TD_GetRowsWithMissingValues
  • TD_GetRowsWithoutMissingValues
  • TD_ConvertTo
  • TD_qqnorm
  • TD_TextParser
  • TD_NumApply
  • TD_StrApply
  • TD_RoundColumns
  • TD_BincodeTransform
  • TD_NonLinearCombineTransform
  • TD_OrdinalEncodingTransform
  • TD_PolynomialFeaturesTransform
  • TD_RowNormalizeTransform
  • TD_ScaleTransform
  • TD_RandomProjectionTransform
  • TD_SentimentExtractor