TD_ClassificationEvaluator Function

Database Analytic Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database
Release Number: 17.20
Published: June 2022
Last Edition: 2025-04-01
Product Category: Teradata Vantage™

The TD_ClassificationEvaluator function computes metrics for evaluating and comparing multiple classification models, summarizing how close predictions are to their expected values. It takes the actual and predicted values of the dependent variable and calculates the specified metrics.

Classification problems commonly use a confusion matrix to visualize the performance of a classifier. In the confusion matrix, predicted labels are represented across the row axis and actual labels across the column axis. Each cell holds the number of observations in the test data that have that combination of predicted and actual label.
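As an illustration of the layout described above, the following is a minimal Python sketch that counts (predicted, actual) pairs into a nested dictionary, with predicted labels as rows and actual labels as columns. The helper name `confusion_matrix` and the sample labels are hypothetical; this is not the function's internal implementation.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Return matrix[predicted_label][actual_label] -> count.

    Rows are predicted labels and columns are actual labels,
    matching the orientation described in the text.
    """
    counts = Counter(zip(predicted, actual))
    return {p: {a: counts[(p, a)] for a in labels} for p in labels}

# Hypothetical test data for illustration only.
actual    = ["cat", "cat", "dog", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat", "bird"]
labels = ["bird", "cat", "dog"]

matrix = confusion_matrix(actual, predicted, labels)
for p in labels:
    # One printed row per predicted label, columns in `labels` order.
    print(p, [matrix[p][a] for a in labels])
```

The diagonal cells (same predicted and actual label) are the correct predictions; every off-diagonal cell is a misclassification.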

The function also works for multiclass scenarios. In every case, the primary output table contains class-level metrics, whereas the secondary output table contains metrics that apply across classes.

Apart from accuracy, the secondary output table returns micro, macro, and weighted averaged metrics of precision, recall, and F1 score values.
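To make the distinction between class-level and cross-class metrics concrete, here is a minimal Python sketch using the standard textbook definitions. The name `evaluate`, the dictionary keys, the sample data, and the convention of returning 0.0 when a denominator is zero are all assumptions for illustration; the sketch does not reproduce the function's output table layout.

```python
def evaluate(actual, predicted, labels):
    """Return (per_class, overall) metric dictionaries for single-label data."""
    pairs = list(zip(actual, predicted))
    per_class = {}
    tp_sum = fp_sum = fn_sum = 0
    for label in labels:
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        # Assumed convention: a metric with a zero denominator is 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = {"precision": precision, "recall": recall,
                            "f1": f1, "support": tp + fn}
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn

    overall = {"accuracy": sum(1 for a, p in pairs if a == p) / len(pairs)}

    # Micro average: pool the counts over all classes, then compute once.
    micro_p = tp_sum / (tp_sum + fp_sum) if tp_sum + fp_sum else 0.0
    micro_r = tp_sum / (tp_sum + fn_sum) if tp_sum + fn_sum else 0.0
    overall["micro_precision"] = micro_p
    overall["micro_recall"] = micro_r
    overall["micro_f1"] = (2 * micro_p * micro_r / (micro_p + micro_r)
                           if micro_p + micro_r else 0.0)

    supports = [per_class[l]["support"] for l in labels]
    for name in ("precision", "recall", "f1"):
        values = [per_class[l][name] for l in labels]
        # Macro average: unweighted mean of the class-level values.
        overall["macro_" + name] = sum(values) / len(labels)
        # Weighted average: mean weighted by each class's actual count.
        overall["weighted_" + name] = (
            sum(v * s for v, s in zip(values, supports)) / sum(supports))
    return per_class, overall

# Hypothetical test data for illustration only.
actual    = ["cat", "cat", "dog", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat", "bird"]
per_class, overall = evaluate(actual, predicted, ["bird", "cat", "dog"])
```

Macro averaging treats every class equally, so small classes influence it as much as large ones, while weighted averaging scales each class by its number of actual observations.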

Classification is a type of machine learning task where the goal is to predict a categorical variable, or class label, from a set of input features. The algorithm learns to classify new observations by training on a labeled dataset, where the class labels are already known. One of the most common classification algorithms is logistic regression, which models the probability of an event by expressing the log odds of the event as a linear combination of one or more independent variables.
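The log-odds relationship in logistic regression can be sketched in a few lines of Python. The function name `predict_probability` and the weights, intercept, and feature values below are hypothetical and chosen only to illustrate the formula.

```python
import math

def predict_probability(features, weights, intercept):
    """Logistic regression: the log odds are a linear combination of the inputs."""
    log_odds = intercept + sum(w * x for w, x in zip(weights, features))
    # Invert log(p / (1 - p)) = log_odds to recover the probability p.
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients and inputs: 0.5 * 2.0 + 1.0 * (-1.0) + 0.0 = 0.
p = predict_probability([2.0, -1.0], weights=[0.5, 1.0], intercept=0.0)
```

With log odds of zero the event and its complement are equally likely, so `p` is 0.5; a predicted class is then typically obtained by comparing `p` with a chosen threshold.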

In the classification process, a model is trained on a dataset consisting of input variables and their corresponding categorical labels. The model estimates the probability of an event occurring for a given set of input variables. Classification algorithms include decision trees, logistic regression, naïve Bayes, and others.

You can use the following metrics to evaluate the performance of a classification model:

  • Accuracy
  • Precision
  • Recall
  • F1 score
  • ROC curve
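Of the metrics listed above, the ROC curve is the only one built from predicted scores rather than predicted labels. A minimal Python sketch, assuming a binary problem with actual labels coded as 0 and 1 and at least one observation of each, is shown here. The helper name `roc_points` and the sample scores are hypothetical.

```python
def roc_points(actual, scores):
    """Return (false positive rate, true positive rate) pairs for a ROC curve.

    actual: 0/1 labels; scores: predicted probability of the positive class.
    Each distinct score is used as a threshold, from highest to lowest.
    """
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]  # Threshold above every score: nothing is predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical labels and scores for illustration only.
roc_actual = [1, 1, 0, 1, 0, 0]
roc_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(roc_actual, roc_scores)
```

Plotting the second element of each pair against the first gives the curve; a classifier that ranks every positive above every negative reaches the top-left corner (0.0, 1.0).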

One crucial aspect of classification is selecting the appropriate variables. Too many variables can lead to overfitting, where the model performs well on the training data but poorly on the test data. On the other hand, too few variables can lead to underfitting, where the model fails to capture the underlying patterns in the data.