Changes and Additions | Teradata Vantage 2.4/17.20 - Changes and Additions - Analytics Database

Database Analytic Functions

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Analytics Database
Release Number
17.20
Published
June 2022
Language
English (United States)
Last Update
2024-10-04
Product Category
Teradata Vantage™
Date Release Description
September 2024 17.20.30 New Features:
  • TD_NERExtractor

    Perform Named Entity Recognition (NER) on input text according to user-defined dictionary words or regular expression (regex) patterns. See TD_NERExtractor.

August 2024 17.20.29 New Features:
  • TD_TextMorph

    Generate morphs (standard form or dictionary form) of the given tokens in the input dataset using a lemmatization algorithm based on an English dictionary. See TD_TextMorph.

Enhancements:
  • NGramSplitter: Added support for regular expressions (regex). See TD_Ngramsplitter.
  • TD_TextParser: Added the ability to count the occurrences of each token or stem (TokenFrequency) and to obtain a comma-separated list of positions for each token occurrence (ListPositions). See TD_TextParser.
July 2024 17.20.03.28 New Features:
  • TD_CFilter

    Calculate several statistical measures of how likely each pair of items is to be purchased together. See TD_CFilter.

  • TD_NaiveBayes

    Predicts the outcome of future observations based on their input variable values. You can use this function for solving classification problems. See TD_NaiveBayes.

  • TD_NaiveBayesPredict

    Uses the model generated by the TD_NaiveBayes function to predict the outcomes for a test set of data. See TD_NaiveBayesPredict.
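
To illustrate the kind of pair statistics mentioned for TD_CFilter above, here is a minimal Python sketch. It assumes three common measures (support, confidence, and lift); this page does not say which measures TD_CFilter actually computes, and the function name `pair_measures` and the basket layout are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_measures(baskets):
    """Support, confidence, and lift for each item pair across baskets.

    Illustrative only: these are textbook measures, not TD_CFilter's output.
    """
    n = len(baskets)
    # Number of baskets containing each item (duplicates in a basket ignored).
    item_counts = Counter(item for b in baskets for item in set(b))
    # Number of baskets containing each unordered pair of items.
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    result = {}
    for (a, b), both in pair_counts.items():
        support = both / n                      # P(a and b)
        confidence = both / item_counts[a]      # P(b | a)
        lift = support / ((item_counts[a] / n) * (item_counts[b] / n))
        result[(a, b)] = {
            "support": support,
            "confidence": confidence,
            "lift": lift,
        }
    return result
```

A lift above 1 suggests the two items appear together more often than independence would predict.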

May 2024 17.20.03.26

New Features:

  • TD_SHAP

    Compute the contribution of each feature in a prediction as the average marginal contribution of the feature value across all possible coalitions. See TD_SHAP.
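
The "average marginal contribution across all possible coalitions" is the Shapley value. The sketch below computes it exactly by enumerating coalitions, which is only feasible for a handful of features. It is a conceptual illustration, not TD_SHAP's implementation; the names `shapley_values` and `value` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function.

    value: callable that takes a frozenset of feature indices and
           returns the model output when only those features are "present".
    """
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight given to every coalition of this size.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of feature i when added to s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi
```

The values sum to the difference between the prediction with all features present and the prediction with none, which is a quick sanity check on any implementation.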

Updated Information:

Content for the following function was reorganized and updated:

TD_DecisionForest. Best practice when ModelType is set to Classification. See TD_DecisionForest.

March 2024 17.20.03.24

Updated Information:

  • TD_XGBoostPredict. Removed Detailed argument from syntax and tree_num from output table schema.
February 2024 17.20.03.23
Updated Information:
January 2024 17.20.03.22 Enhancements:
  • TD_ANOVA, TD_FTest, and TD_ZTest functions support alternate input table format (group-value), in addition to the already supported column-wise format.

    See TD_ANOVA, TD_FTest, and TD_ZTest.

  • TD_ScaleFit and TD_ScaleTransform functions accept input data in sparse format, in addition to the already supported dense format. This lets you scale an input dataset that is stored in sparse format and scale a greater number of attributes.

    See TD_ScaleFit and TD_ScaleTransform.

Updated Information:
  • Use ASCII client collation when executing analytic functions. If your data has UNICODE characters, set the client character set to UTF8 and the collation to ASCII for the user session. See Recommendations for Using Analytic Functions.
December 2023 17.20.03.21 New Features:
  • TD_TFIDF

    Commonly used to determine how important a word is in relation to a document. See TD_TFIDF.
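
As a rough illustration of how such a score can be computed, the following Python sketch uses one textbook TF-IDF variant (term frequency times the log of inverse document frequency). The exact formula used by TD_TFIDF is not given on this page and may differ; the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Uses tf = count / document length and idf = log(N / document frequency),
    which is an assumed textbook variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores
```

A term that appears in every document scores zero under this variant, while a term concentrated in one document scores highest there.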

November 2023 17.20.03.20 Enhancements:
  • TD_XGBoost

    Performance and usability enhancements, such as support for alternate argument names for NumBoostedTrees, ShrinkageFactor, and IterNum (in line with open source libraries). Additional changes include a new BaseScore argument in the function syntax, and new default values: ShrinkageFactor changed to 0.5 and RegularizationLambda changed to 1.

    See TD_XGBoost.

  • TD_XGBoostPredict

    Performance and usability enhancements, such as support for alternate argument names for NumBoostRounds and IterNum (in line with open source libraries). Additional changes include making the ORDER BY argument optional and increasing the default value of IterNum from 3 to 10.

    See TD_XGBoostPredict.

October 2023
  • 17.20.03.19
  • 17.20.03.18
New Features:
  • TD_Pivoting

    Pivots the data, that is, changes the data from sparse format to dense format. See TD_Pivoting.

  • TD_Unpivoting

    Unpivots the data, that is, changes the data from dense format to sparse format. See TD_Unpivoting.
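
The dense and sparse layouts referred to above can be sketched in a few lines of Python. This is a conceptual illustration only, not the TD_Pivoting or TD_Unpivoting syntax; the function names and the `attribute` and `value` column names are assumptions made for the example.

```python
def unpivot(rows, key, value_columns):
    """Dense -> sparse: one (key, attribute, value) row per non-missing cell."""
    return [
        {key: row[key], "attribute": col, "value": row[col]}
        for row in rows
        for col in value_columns
        if row.get(col) is not None
    ]


def pivot(sparse_rows, key):
    """Sparse -> dense: one row per key with one column per attribute."""
    dense = {}
    for r in sparse_rows:
        # Create the dense row on first sight of a key, then fill the column.
        dense.setdefault(r[key], {key: r[key]})[r["attribute"]] = r["value"]
    return list(dense.values())
```

In this sketch, missing cells are simply absent from the sparse form, so a round trip drops columns whose value was missing for a given key.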

Enhancements:
  • TD_CategoricalSummary

    DistinctValue column supports VARCHAR (CHARACTER SET LATIN or UNICODE) data type in the output table schema. See TD_CategoricalSummary.

  • TD_DecisionForestPredict

    Performance and usability enhancements, such as moving the accumulate columns to the end of the output. See TD_DecisionForestPredict.

  • TD_GLMPredict

    Optional Family argument added. See TD_GLMPredict.

  • TD_KMeans

    TD_KMeans supports the KMeans++ algorithm for initial centroid selection.

    The KMeans++ algorithm chooses initial centroids that are far away from each other, which reduces the possibility of initial centroids being chosen from the same cluster. KMeans++ improves the overall quality of clustering and can also speed up the convergence of the KMeans algorithm.

    See TD_KMeans.

  • TD_KMeansPredict

    TD_KMeansPredict takes a table of cluster centroids output by the TD_KMeans function and an input table. It uses the model to assign the input data points to the cluster centroids.

    See TD_KMeansPredict.

  • TD_ScaleFit and TD_ScaleTransform

    Supports PARTITION BY along with ParameterTable and AttributeTable to scale different input data partitions independently of each other. Added an optional argument IgnoreInvalidLocationScale that gives you an option to ignore errors for invalid values of location and scale parameters.

    See TD_ScaleFit and TD_ScaleTransform.

  • TD_SVMPredict

    Optional ModelType argument added to TD_SVMPredict. The argument specifies the model type used by TD_SVM to train the dataset. See TD_SVMPredict.
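
The KMeans++ seeding described for TD_KMeans above can be sketched as follows. This is the standard KMeans++ procedure written in plain Python for illustration; it is not TD_KMeans code, and the function name `kmeans_pp_init` and its `seed` parameter are hypothetical.

```python
import random


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centroids from points using KMeans++ seeding.

    Assumes points contains at least k distinct coordinate tuples.
    """
    rng = random.Random(seed)
    # The first centroid is chosen uniformly at random.
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Squared distance from each point to its nearest chosen centroid.
        d2 = [
            min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
            for p in points
        ]
        # Sample the next centroid with probability proportional to d2,
        # so far-away points are favored and chosen points have weight 0.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids
```

Because already chosen points have zero weight, the same point is never picked twice, and points in a distant cluster are much more likely to supply the next centroid.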

Updated Information:
September 2023 17.20.03.17 Updated Information:
August 2023 17.20.03.16 Updated Information:
  • TD_GLM and TD_GLMPerSegment have been combined. TD_GLM supports PARTITION BY ANY and PARTITION BY partition_by_column. Use TD_GLM for model training. See TD_GLM.
  • TD_GLMPredict and TD_GLMPredictPerSegment have been combined. TD_GLMPredict supports PARTITION BY ANY and PARTITION BY partition_by_column. Use TD_GLMPredict for model scoring. See TD_GLMPredict.
July 2023 17.20.03.15 Updated Information:
June 2023 17.20.03.14 Updated Information:

Enhancements:

May 2023 17.20.03.13 Updated Information:
  • Information about the size of a query allowed for analytic functions is added in Usage Notes. See Size of the Query.
  • TD_XGBoost. The default value for the CoverageFactor argument is 1.0.
  • TD_WordEmbeddings. The default value for the Operation argument is token-embedding, and the default value for the StemTokens argument is False.
  • TD_GLM, TD_SVM, TD_OneClassSVM. The value for the BatchSize and IterNumNoChange arguments is a non-negative integer.
  • TD_KMeansPredict. Reorganized and updated content.
April 2023 17.20.03.12 Updated Information:
March 2023 17.20.03.11 New Features:
  • TD_TargetEncodingFit. Generates hyperparameters for use by TD_TargetEncodingTransform.
  • TD_TargetEncodingTransform. Uses the hyperparameters generated by TD_TargetEncodingFit to encode categorical values.
  • TD_GLMPerSegment. Trains a whole data set by partitioning, and creates a single model for each partition.
  • TD_GLMPredictPerSegment. Predicts target value (regression) and class label (classification) for test data using a corresponding GLM model trained using TD_GLMPerSegment.

Updated Information:

  • TD_XGBoost. For Classification, the SELECT statement must have a deterministic output.
  • TD_ANOVA. One-Way ANOVA is supported. Syntax elements and Output Table schema are updated.
  • [Legacy - Removed] SVMSparsePredict. Updated SQL Call in function example.
February 2023 17.20.03.10 Information about non-deterministic behavior of function output is added in Usage Notes. See Non-Deterministic Behavior.
January 2023 17.20.03.09 New Features:
  • TD_DecisionForestPredict. Uses the model output by TD_DecisionForest function to analyze the input data and make predictions.
  • TD_TrainTestSplit. Simulates how a model performs on new data.
  • TD_XGBoost. Performs classification and regression analysis on data sets and generates a model for TD_XGBoostPredict to run the predictive algorithm.
  • TD_XGBoostPredict. Runs the predictive algorithm based on the model generated by TD_XGBoost.

Updated Information:

December 2022 17.20.03.08 New Features:

Enhancements:

  • nPath supports CLOB output.
November 2022 17.20.03.07 New Features:

Enhancements:

  • TD_OrdinalEncodingFit: Multiple column support added. Changed syntax, CategoryTable and Output Table schema, and examples.
  • TD_OrdinalEncodingTransform: Multiple column support added. Changed FitTable and Output Table schema, and examples.
  • TD_OneHotEncodingFit: Multiple column support added. Changed syntax, Output Table schema for dense input, and examples. Added CategoryTable schema for dense input.
  • TD_OneHotEncodingTransform: Multiple column support added. Changed FitTable and Output Table schema, and examples.
  • TD_Histogram: Multiple column support added. Changed syntax, MinMaxTable and Output Table schema, and examples.
Oct 2022 17.20.03.06 TD_KNN
June 2022 17.20.03.01 New Features: