The following input table has seven observations (n = 7) and three classes (k = 3). The model correctly predicted observations 1, 2, 6, and 7 (number_of_correct_predictions = 4).
id | observed_class | predicted_class |
---|---|---|
1 | red | red |
2 | red | red |
3 | red | blue |
4 | red | green |
5 | blue | red |
6 | blue | blue |
7 | green | green |
Kappa (in StatTable)
Kappa = ( observed_accuracy - random_accuracy) / (1 - random_accuracy)
- observed_accuracy = number_of_correct_predictions / n
- random_accuracy = ( Σ over the k classes of (number_of_observations_of_class * number_of_predictions_of_class) ) / (n * n)
- observed_accuracy = 4/7 = 0.5714
- random_accuracy = ( (4*3) + (2*2) + (1*2) ) / (7*7) = 18/49 = 0.3673
- Kappa = ( (4/7) - (18/49) ) / ( 1 - (18/49) ) = 0.3226
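The Kappa calculation can be reproduced with a short Python sketch. The observed and predicted lists are transcribed from the input table above; the helper names (obs_counts, pred_counts) are illustrative only:

```python
from collections import Counter

observed  = ["red", "red", "red", "red", "blue", "blue", "green"]
predicted = ["red", "red", "blue", "green", "red", "blue", "green"]
n = len(observed)

# Number of correct predictions (observed class matches predicted class)
correct = sum(o == p for o, p in zip(observed, predicted))

obs_counts  = Counter(observed)   # per-class observed counts
pred_counts = Counter(predicted)  # per-class predicted counts

observed_accuracy = correct / n
# Chance agreement: sum over classes of (observed count * predicted count) / n^2
random_accuracy = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)

kappa = (observed_accuracy - random_accuracy) / (1 - random_accuracy)
print(round(kappa, 4))  # 0.3226
```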
Null Error Rate (in StatTable)
The Null Error Rate is the fraction of observations that would be correctly predicted by predicting the most common observed class for all observations.
Formula: Null Error Rate = 1 - ( max (observations_of_class_1, ..., observations_of_class_k) / n )
Example: In the preceding input table, red is observed four times, blue twice, and green once; therefore:
Null Error Rate = 1 - ( max (4,2,1) / 7 ) = 1 - (4/7) = 0.4286
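The same result follows from a minimal Python sketch (the observed list is transcribed from the input table above):

```python
from collections import Counter

observed = ["red", "red", "red", "red", "blue", "blue", "green"]
n = len(observed)

# Count of the most common observed class (red appears 4 times)
most_common_count = max(Counter(observed).values())
null_error_rate = 1 - most_common_count / n
print(round(null_error_rate, 4))  # 0.4286
```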
Values in AccuracyTable Formulas
Term | Definition |
---|---|
ccorr | Number of correct predictions of class c. |
cobs | Number of observations of class c. |
cpred | Number of predictions of class c. |
Sensitivity (in AccuracyTable)
Formula for sensitivity of class c:
sensitivity (c) = ccorr / cobs
Example: sensitivity (red) = 2/4 = 0.5
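A minimal Python sketch of this calculation on the example table (the lists are transcribed from the input table above; the function name is illustrative):

```python
observed  = ["red", "red", "red", "red", "blue", "blue", "green"]
predicted = ["red", "red", "blue", "green", "red", "blue", "green"]

def sensitivity(c):
    # ccorr: correct predictions of class c; cobs: observations of class c
    ccorr = sum(o == p == c for o, p in zip(observed, predicted))
    cobs = observed.count(c)
    return ccorr / cobs

print(sensitivity("red"))  # 0.5
```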
Specificity (in AccuracyTable)
Formula for specificity of class c:
specificity (c) = ( n + ccorr - cobs - cpred ) / ( n - cobs )
Example: specificity (red) = (7 + 2 - 4 - 3) / (7 - 4) = 0.6667
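The specificity formula can be sketched the same way (lists transcribed from the input table above; function name illustrative):

```python
observed  = ["red", "red", "red", "red", "blue", "blue", "green"]
predicted = ["red", "red", "blue", "green", "red", "blue", "green"]
n = len(observed)

def specificity(c):
    ccorr = sum(o == p == c for o, p in zip(observed, predicted))  # correct predictions of c
    cobs = observed.count(c)    # observations of c
    cpred = predicted.count(c)  # predictions of c
    # (n + ccorr - cobs - cpred) counts true negatives; (n - cobs) counts actual negatives
    return (n + ccorr - cobs - cpred) / (n - cobs)

print(round(specificity("red"), 4))  # 0.6667
```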
Prevalence (in AccuracyTable)
If you specify Prevalence, the function uses the prevalence specified for each class; otherwise, the function calculates the prevalence of class c with this formula:
prevalence (c) = cobs / n
Example: prevalence (red) = 4/7 = 0.5714
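As a quick sketch (observed list transcribed from the input table above; function name illustrative):

```python
observed = ["red", "red", "red", "red", "blue", "blue", "green"]
n = len(observed)

def prevalence(c):
    # cobs / n: fraction of observations belonging to class c
    return observed.count(c) / n

print(round(prevalence("red"), 4))  # 0.5714
```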
Pos Pred Value (in AccuracyTable)
Formula for Pos Pred Value (positive predictive value, or PPV) of class c:
PPV (c) =
( sensitivity (c) * prevalence (c) ) /
( ( sensitivity (c) * prevalence (c) ) + (1 - specificity (c) ) * (1 - prevalence (c) ) )
Example: PPV (red) = (0.5 * 0.5714) / ( (0.5 * 0.5714) + (1 - 0.6667) * (1 - 0.5714) ) = 0.6667
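Plugging in the per-class values derived earlier for red (sensitivity 0.5, specificity 2/3, prevalence 4/7), a minimal Python sketch:

```python
# Values for class red from the preceding examples
sens, spec, prev = 0.5, 2 / 3, 4 / 7  # sensitivity, specificity, prevalence

# PPV: probability that an observation predicted as red is actually red
ppv = (sens * prev) / ((sens * prev) + (1 - spec) * (1 - prev))
print(round(ppv, 4))  # 0.6667
```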
Neg Pred Value (in AccuracyTable)
Formula for Neg Pred Value (negative predictive value, or NPV) of class c:
NPV (c) =
( specificity (c) * (1 - prevalence (c) ) ) /
( specificity (c) * (1 - prevalence (c) ) + (1 - sensitivity (c) ) * prevalence (c) )
Example: NPV (red) = ( ( 0.6667 * (1 - 0.5714) ) ) / ( 0.6667 * (1 - 0.5714) + (1 - 0.5) * (0.5714) ) = 0.5
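The same numbers in a minimal Python sketch, matching the worked example for red:

```python
# Values for class red from the preceding examples
sens, spec, prev = 0.5, 2 / 3, 4 / 7  # sensitivity, specificity, prevalence

# NPV: probability that an observation predicted as not-red is actually not red
npv = (spec * (1 - prev)) / (spec * (1 - prev) + (1 - sens) * prev)
print(round(npv, 4))  # 0.5
```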
Detection Rate (in AccuracyTable)
Formula for Detection Rate of class c:
Detection Rate (c) = ccorr / n
Example: Detection Rate (red) = 2/7 = 0.2857
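As a sketch (lists transcribed from the input table above; function name illustrative):

```python
observed  = ["red", "red", "red", "red", "blue", "blue", "green"]
predicted = ["red", "red", "blue", "green", "red", "blue", "green"]
n = len(observed)

def detection_rate(c):
    # ccorr / n: fraction of all observations correctly predicted as class c
    ccorr = sum(o == p == c for o, p in zip(observed, predicted))
    return ccorr / n

print(round(detection_rate("red"), 4))  # 0.2857
```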
Detection Prevalence (in AccuracyTable)
Formula for Detection Prevalence of class c:
Detection Prevalence (c) = cpred / n
Example: Detection Prevalence (red) = 3/7 = 0.4286
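As a sketch (predicted list transcribed from the input table above; function name illustrative):

```python
predicted = ["red", "red", "blue", "green", "red", "blue", "green"]
n = len(predicted)

def detection_prevalence(c):
    # cpred / n: fraction of all observations predicted as class c
    return predicted.count(c) / n

print(round(detection_prevalence("red"), 4))  # 0.4286
```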
Balanced Accuracy (in AccuracyTable)
Formula for Balanced Accuracy of class c:
Balanced Accuracy (c) = ( sensitivity (c) + specificity (c) ) / 2
Example: Balanced Accuracy (red) = (0.5 + 0.6667) / 2 = 0.5833
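Using the sensitivity and specificity values derived earlier for red, a minimal sketch:

```python
# Values for class red from the preceding examples
sens, spec = 0.5, 2 / 3  # sensitivity and specificity

# Balanced accuracy: the unweighted mean of sensitivity and specificity
balanced_accuracy = (sens + spec) / 2
print(round(balanced_accuracy, 4))  # 0.5833
```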