Statistical Analysis Functions | Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™
Function Description
Approximate Cardinality (ML Engine) Computes the approximate global distinct count of the values in one or more columns, scanning the table only once. Counts all children for a specified parent.
Approximate Percentile (ML Engine) Computes approximate percentiles for one or more columns, with specified accuracy.
ConfusionMatrix (ML Engine) Shows how often a classification algorithm correctly classifies items.
Correlation (ML Engine) Computes the global correlation between any pair of table columns.
CrossValidation (ML Engine) Validates a GLM model by assessing how the results of a statistical analysis generalize to an independent data set.
CrossValidation2 (ML Engine) Validates a GLML1L2 model by assessing how the results of a statistical analysis generalize to an independent data set.
Distribution Matching (ML Engine) Uses hypothesis testing to find the best matching distribution for data.
FMeasure (ML Engine) Calculates the accuracy of a test, typically the output of a classifier, as the F-measure, which combines precision and recall.
Histogram (ML Engine) Calculates the frequency distribution of a data set using sophisticated binning techniques that can automatically calculate the bin width and number of bins. The function maps each input row to one bin and returns the frequency (row count) and proportion (percentage of rows) of each bin.
LikelihoodRatioTest (ML Engine) Performs the likelihood ratio test for two GLM models.
Percentiles (ML Engine) Finds percentiles on a per-group basis.
Receiver Operating Characteristic (ROC) (ML Engine) Takes a set of prediction-actual pairs for a binary classifier and calculates the TPR, FPR, AUC, and Gini coefficient for a range of thresholds.
VectorDistance (ML Engine) Measures the distance between sparse vectors (for example, TF-IDF vectors) in a pairwise manner.
UnivariateStatistics (ML Engine) Calculates descriptive statistics for a set of target columns.
Principal Component Analysis (PCA) Functions (ML Engine) Common unsupervised learning technique useful for both exploratory data analysis and dimensionality reduction.
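
The quantities named in the Receiver Operating Characteristic (ROC) entry above can be illustrated with a short Python sketch. This is not the ML Engine implementation or its SQL syntax. The function names (roc_points, auc_trapezoid, gini) are hypothetical, and two conventions are assumed here: a score greater than or equal to the threshold counts as a positive prediction, and the Gini coefficient is taken as 2 * AUC - 1.

```python
from typing import List, Sequence, Tuple


def roc_points(
    scores: Sequence[float],
    labels: Sequence[int],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, TPR, FPR) for each threshold.

    Assumption: labels are 1 (positive) or 0 (negative), both classes are
    present, and a score >= threshold is predicted positive.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        # True positives and false positives at this threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, tp / positives, fp / negatives))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float, float]]) -> float:
    """Approximate the area under the ROC curve with the trapezoid rule."""
    # Build (FPR, TPR) pairs, add the curve end points, and sort by FPR.
    curve = sorted(
        {(fpr, tpr) for _, tpr, fpr in points} | {(0.0, 0.0), (1.0, 1.0)}
    )
    area = 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def gini(auc: float) -> float:
    """Gini coefficient under the assumed relation Gini = 2 * AUC - 1."""
    return 2 * auc - 1


if __name__ == "__main__":
    # Made-up prediction-actual pairs, for illustration only.
    example_scores = [0.1, 0.4, 0.35, 0.8]
    example_labels = [0, 0, 1, 1]
    example_thresholds = [0.8, 0.4, 0.35, 0.1]
    pts = roc_points(example_scores, example_labels, example_thresholds)
    for threshold, tpr, fpr in pts:
        print(f"threshold={threshold} TPR={tpr} FPR={fpr}")
    area = auc_trapezoid(pts)
    print(f"AUC={area} Gini={gini(area)}")
```

A finer threshold grid traces the curve more closely. The trapezoid rule only approximates the area when thresholds are sparse.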