# 1.1 - 8.10 - ConfusionMatrix Calculated Quantities - Teradata Vantage

## Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

The following input table has seven observations (n = 7) and three classes (k = 3). The model correctly predicted observations 1, 2, 6, and 7 (number_of_correct_predictions = 4).

| id | observed_class | predicted_class |
|----|----------------|-----------------|
| 1  | red            | red             |
| 2  | red            | red             |
| 3  | red            | blue            |
| 4  | red            | green           |
| 5  | blue           | red             |
| 6  | blue           | blue            |
| 7  | green          | green           |
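The counts used throughout this section can be reproduced with a short sketch. This is illustrative Python, not the Vantage function itself; the variable names are hypothetical.

```python
from collections import Counter

# (observed_class, predicted_class) pairs from the example input table
rows = [
    ("red", "red"), ("red", "red"), ("red", "blue"), ("red", "green"),
    ("blue", "red"), ("blue", "blue"), ("green", "green"),
]

n = len(rows)                                     # 7 observations
correct = sum(obs == pred for obs, pred in rows)  # 4 correct predictions
obs_counts = Counter(obs for obs, _ in rows)      # red: 4, blue: 2, green: 1
pred_counts = Counter(pred for _, pred in rows)   # red: 3, blue: 2, green: 2
```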

## Kappa (in StatTable)

To calculate Kappa, the function uses these formulas:
• Kappa = ( observed_accuracy - random_accuracy ) / ( 1 - random_accuracy )
• observed_accuracy = number_of_correct_predictions / n
• random_accuracy = ( Σ_k ( k_obs * k_pred ) ) / ( n * n ), where k_obs and k_pred are the number of observations and the number of predictions of class k, respectively
For the preceding input table:
• observed_accuracy = 4/7 = 0.5714
• random_accuracy = ( (4*3) + (2*2) + (1*2) ) / (7*7) = 18/49 = 0.3673
• Kappa = ( (4/7) - (18/49) ) / ( 1 - (18/49) ) = 0.3226
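The Kappa calculation above can be sketched as follows, using the example table. This is an illustrative reimplementation of the documented formulas, not the Vantage function; names are hypothetical.

```python
from collections import Counter

# (observed_class, predicted_class) pairs from the example input table
rows = [("red", "red"), ("red", "red"), ("red", "blue"), ("red", "green"),
        ("blue", "red"), ("blue", "blue"), ("green", "green")]

n = len(rows)
correct = sum(obs == pred for obs, pred in rows)
obs_counts = Counter(obs for obs, _ in rows)
pred_counts = Counter(pred for _, pred in rows)

observed_accuracy = correct / n  # 4/7
# chance agreement: sum over classes k of k_obs * k_pred, divided by n^2
random_accuracy = sum(obs_counts[k] * pred_counts[k] for k in obs_counts) / (n * n)  # 18/49
kappa = (observed_accuracy - random_accuracy) / (1 - random_accuracy)
```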

## Null Error Rate (in StatTable)

The Null Error Rate is the fraction of observations that would be incorrectly predicted by always predicting the most common observed class.

Formula: Null Error Rate = 1 - ( count_of_most_common_observed_class / n )

Example: In the preceding input table, red is observed four times, blue twice, and green once; therefore:

Null Error Rate = 1 - ( max (4,2,1) / 7 ) = 1 - (4/7) = 0.4286
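A minimal sketch of this calculation on the example data (illustrative Python, not the Vantage function):

```python
from collections import Counter

# observed_class column from the example input table
observed = ["red", "red", "red", "red", "blue", "blue", "green"]

n = len(observed)
obs_counts = Counter(observed)                      # red: 4, blue: 2, green: 1
null_error_rate = 1 - max(obs_counts.values()) / n  # 1 - 4/7
```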

## Values in AccuracyTable Formulas

The formulas for the sensitivity, specificity, prevalence, detection rate, and detection prevalence of class c use these terms:
| Term   | Definition                               |
|--------|------------------------------------------|
| c_corr | Number of correct predictions of class c. |
| c_obs  | Number of observations of class c.        |
| c_pred | Number of predictions of class c.         |

## Sensitivity (in AccuracyTable)

Formula for sensitivity of class c:

sensitivity (c) = c_corr / c_obs

Example: sensitivity (red) = 2/4 = 0.5

## Specificity (in AccuracyTable)

Formula for specificity of class c:

specificity (c) = ( n + c_corr - c_obs - c_pred ) / ( n - c_obs )

Example: specificity (red) = ( 7 + 2 - 4 - 3 ) / ( 7 - 4 ) = 0.6667
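Both sensitivity and specificity can be sketched from the per-class counts of the example table. This is illustrative Python under the term definitions above, not the Vantage function.

```python
from collections import Counter

# (observed_class, predicted_class) pairs from the example input table
rows = [("red", "red"), ("red", "red"), ("red", "blue"), ("red", "green"),
        ("blue", "red"), ("blue", "blue"), ("green", "green")]

n = len(rows)
obs_counts = Counter(obs for obs, _ in rows)                      # c_obs
pred_counts = Counter(pred for _, pred in rows)                   # c_pred
corr_counts = Counter(obs for obs, pred in rows if obs == pred)   # c_corr

def sensitivity(c):
    # c_corr / c_obs
    return corr_counts[c] / obs_counts[c]

def specificity(c):
    # (n + c_corr - c_obs - c_pred) / (n - c_obs)
    return (n + corr_counts[c] - obs_counts[c] - pred_counts[c]) / (n - obs_counts[c])
```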

## Prevalence (in AccuracyTable)

If you specify Prevalence, the function uses the prevalence specified for each class; otherwise, the function calculates the prevalence of class c with this formula:

prevalence (c) = c_obs / n

Example: prevalence (red) = 4/7 = 0.5714

## Pos Pred Value (in AccuracyTable)

Formula for Pos Pred Value (positive predictive value, or PPV) of class c:

PPV (c) = ( sensitivity (c) * prevalence (c) ) / ( ( sensitivity (c) * prevalence (c) ) + ( 1 - specificity (c) ) * ( 1 - prevalence (c) ) )

Example: PPV (red) = (0.5 * 0.5714) / ( (0.5 * 0.5714) + (1 - 0.6667) * (1 - 0.5714) ) = 0.6667
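The worked example plugs in the sensitivity, specificity, and prevalence of red computed earlier; a sketch (illustrative Python, using exact fractions rather than the rounded values):

```python
# sensitivity, specificity, and prevalence of class "red" from the example table
sens, spec, prev = 0.5, 2 / 3, 4 / 7

# PPV = (sens * prev) / ((sens * prev) + (1 - spec) * (1 - prev))
ppv = (sens * prev) / ((sens * prev) + (1 - spec) * (1 - prev))
```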

## Neg Pred Value (in AccuracyTable)

Formula for Neg Pred Value (negative predictive value, or NPV) of class c:

NPV (c) = ( specificity (c) * ( 1 - prevalence (c) ) ) / ( ( specificity (c) * ( 1 - prevalence (c) ) ) + ( 1 - sensitivity (c) ) * prevalence (c) )

Example: NPV (red) = ( 0.6667 * (1 - 0.5714) ) / ( ( 0.6667 * (1 - 0.5714) ) + ( (1 - 0.5) * 0.5714 ) ) = 0.5
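As with PPV, the NPV example can be sketched with exact fractions for the red class (illustrative Python):

```python
# sensitivity, specificity, and prevalence of class "red" from the example table
sens, spec, prev = 0.5, 2 / 3, 4 / 7

# NPV = (spec * (1 - prev)) / ((spec * (1 - prev)) + (1 - sens) * prev)
npv = (spec * (1 - prev)) / ((spec * (1 - prev)) + (1 - sens) * prev)
```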

## Detection Rate (in AccuracyTable)

Formula for Detection Rate of class c:

Detection Rate (c) = c_corr / n

Example: Detection Rate (red) = 2/7 = 0.2857

## Detection Prevalence (in AccuracyTable)

Formula for Detection Prevalence of class c:

Detection Prevalence (c) = c_pred / n

Example: Detection Prevalence (red) = 3/7 = 0.4286
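Detection Rate and Detection Prevalence differ only in the numerator (correct predictions vs. all predictions of the class). A sketch on the example data (illustrative Python, not the Vantage function):

```python
from collections import Counter

# (observed_class, predicted_class) pairs from the example input table
rows = [("red", "red"), ("red", "red"), ("red", "blue"), ("red", "green"),
        ("blue", "red"), ("blue", "blue"), ("green", "green")]

n = len(rows)
pred_counts = Counter(pred for _, pred in rows)                   # c_pred
corr_counts = Counter(obs for obs, pred in rows if obs == pred)   # c_corr

detection_rate = corr_counts["red"] / n        # c_corr / n = 2/7
detection_prevalence = pred_counts["red"] / n  # c_pred / n = 3/7
```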

## Balanced Accuracy (in AccuracyTable)

Formula for Balanced Accuracy of class c:

Balanced Accuracy (c) = ( sensitivity (c) + specificity (c) ) / 2

Example: Balanced Accuracy (red) = (0.5 + 0.6667) / 2 = 0.5833
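The balanced accuracy example is the mean of the red class's sensitivity and specificity computed earlier; with exact fractions (illustrative Python):

```python
# sensitivity and specificity of class "red" from the example table
sens, spec = 0.5, 2 / 3

balanced_accuracy = (sens + spec) / 2
```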