Accumulated Columns Impact on Function Performance

Database Analytic Functions

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-04-06
Product Category
Teradata Vantage™
Consider the following points when a function places the accumulated columns first in its output:
  • The data type of the first column cannot be BLOB or CLOB.
  • By default, the first column becomes the Primary Index, so choose it carefully.
  • The Primary Index column (by default, the first column) determines how the data is distributed and therefore affects performance. Choose a column with many unique values as the Primary Index; a column with few unique values is a poor choice. If the first column in the output is an accumulated column and it is not an optimal Primary Index, specify the Primary Index column explicitly for better performance.
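The last point can be illustrated with a minimal sketch. It assumes the function output is stored with CREATE TABLE ... AS, and it uses TD_ScaleTransform only as one example of a function that places accumulated columns first. The table names (scale_input, scale_fit, scaled_output) and column names (region, customer_id) are hypothetical, as is the assumption that region has few unique values and customer_id has many.

```sql
-- Hypothetical tables and columns, for illustration only.
-- Assumption: region has few unique values, customer_id has many.
CREATE TABLE scaled_output AS (
    SELECT * FROM TD_ScaleTransform (
        ON scale_input AS InputTable
        ON scale_fit AS FitTable DIMENSION
        USING
        Accumulate ('region', 'customer_id')
    ) AS dt
) WITH DATA
PRIMARY INDEX (customer_id);  -- explicit choice instead of the default first column

-- Optional check: row count per AMP for the chosen Primary Index column.
-- Large differences between AMPs suggest a skewed distribution.
SELECT HASHAMP(HASHBUCKET(HASHROW(customer_id))) AS amp_no,
       COUNT(*) AS row_count
FROM scaled_output
GROUP BY 1
ORDER BY 2 DESC;
```

Without the PRIMARY INDEX clause, the first output column (here assumed to be the accumulated column region) would become the Primary Index by default. Naming the column explicitly avoids relying on the output column order.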
The following functions add the accumulated columns at the end of the output:
  • NaiveBayesPredict
  • DecisionTreePredict
  • Pack
  • Unpack
  • TD_KMeansPredict
  • TD_Silhouette
The following functions add the accumulated columns at the beginning of the output:
  • GLMPredict
  • DecisionForestPredict
  • StringSimilarity
  • NgramSplitter
  • TD_GetRowsWithMissingValues
  • TD_GetRowsWithoutMissingValues
  • TD_ConvertTo
  • TD_qqnorm
  • TD_TextParser
  • TD_NumApply
  • TD_StrApply
  • TD_RoundColumns
  • TD_BincodeTransform
  • TD_NonLinearCombineTransform
  • TD_OrdinalEncodingTransform
  • TD_PolynomialFeaturesTransform
  • TD_RowNormalizeTransform
  • TD_ScaleTransform
  • TD_RandomProjectionTransform
  • TD_SentimentExtractor