ConfusionMatrix Example

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Release Date: May 2019
Content Type: Programming Reference
Publication ID: B700-4003-098K
Language: English (United States)

Input

The input table, iris_category_expect_predict, contains 30 rows, each pairing the expected (observed) species of an iris flower with the species predicted by a classifier. The predicted values can come from any classification function, such as SparseSVMPredictor. The raw iris data set has four predictor attributes (sepal_length, sepal_width, petal_length, petal_width) and three species (setosa, versicolor, virginica).

iris_category_expect_predict
id expected_value predicted_value
5 setosa setosa
10 setosa setosa
15 setosa setosa
20 setosa setosa
25 setosa setosa
30 setosa setosa
35 setosa setosa
40 setosa setosa
45 setosa setosa
50 setosa setosa
55 versicolor versicolor
60 versicolor versicolor
65 versicolor versicolor
70 versicolor versicolor
75 versicolor versicolor
80 versicolor versicolor
85 virginica versicolor
90 versicolor versicolor
95 versicolor versicolor
100 versicolor versicolor
105 virginica virginica
110 virginica virginica
115 virginica virginica
120 versicolor virginica
125 virginica virginica
130 versicolor virginica
135 versicolor virginica
140 virginica virginica
145 virginica virginica
150 virginica virginica

SQL Call

SELECT * FROM ConfusionMatrix (
  ON iris_category_expect_predict PARTITION BY 1
  OUT TABLE CountTable (count_output)
  OUT TABLE StatTable (stat_output)
  OUT TABLE AccuracyTable (acc_output)
  USING
  ObservationColumn ('expected_value')
  PredictColumn ('predicted_value')
) AS dt;

Output

message
Success !
The result has been outputted to output tables

This query returns the following table, in which each row is an observed (expected) class and each column is a predicted class:

SELECT * FROM count_output;
count_output
observation  setosa  versicolor  virginica
setosa       10      0           0
versicolor   0       9           3
virginica    0       1           7
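
As a cross-check outside Vantage, the counts above can be reproduced by tallying the (expected_value, predicted_value) pairs from the input table. The following is a minimal Python sketch, not a description of how ConfusionMatrix is implemented; the `pairs` list is a hand-made summary of the 30 input rows, and all variable names are hypothetical.

```python
from collections import Counter

# (expected_value, predicted_value) pairs summarised from the 30 rows of
# iris_category_expect_predict; row order does not affect the counts.
pairs = (
    [("setosa", "setosa")] * 10
    + [("versicolor", "versicolor")] * 9
    + [("versicolor", "virginica")] * 3
    + [("virginica", "versicolor")] * 1
    + [("virginica", "virginica")] * 7
)

classes = ["setosa", "versicolor", "virginica"]
counts = Counter(pairs)

# Rows are observed classes, columns are predicted classes.
matrix = [[counts[(obs, pred)] for pred in classes] for obs in classes]

for obs, row in zip(classes, matrix):
    print(obs, *row)
```

The diagonal holds the correctly classified rows (10 + 9 + 7 = 26 of 30); the off-diagonal cells are the four misclassifications between versicolor and virginica.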

This query returns the following table:

SELECT * FROM stat_output;
stat_output
key                   value
Accuracy              0.8667
95% CI                (0.6928, 0.9624)
Null Error Rate       0.6
P-Value [Acc > NIR]   0
Kappa                 0.8
McNemar Test P-Value  NA
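
The overall statistics can be recomputed from the count table. The sketch below uses only the Python standard library and rests on assumptions that this page does not confirm: that the 95% CI is an exact (Clopper-Pearson) binomial interval for the accuracy, that NIR (no-information rate) is the share of the most common observed class, that Null Error Rate is 1 - NIR, and that the P-value comes from a one-sided exact binomial test of accuracy against NIR. All function and variable names are hypothetical.

```python
from math import comb

# Count matrix from count_output: rows = observed, columns = predicted.
matrix = [[10, 0, 0], [0, 9, 3], [0, 1, 7]]
k = len(matrix)
n = sum(map(sum, matrix))
correct = sum(matrix[i][i] for i in range(k))
row_totals = [sum(row) for row in matrix]
col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]

accuracy = correct / n
nir = max(row_totals) / n          # assumed definition of NIR
null_error_rate = 1 - nir

# Cohen's kappa: agreement beyond what the marginal totals give by chance.
chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2
kappa = (accuracy - chance) / (1 - chance)


def upper_tail(x, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x, n + 1))


def solve_p(x, target):
    """Bisection for the p at which upper_tail(x, p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(x, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Assumed: one-sided exact binomial test of accuracy against NIR.
p_value = upper_tail(correct, nir)

# Assumed: Clopper-Pearson interval at the 95% level.
ci_low = solve_p(correct, 0.025)       # P(X >= correct) = 0.025
ci_high = solve_p(correct + 1, 0.975)  # P(X <= correct) = 0.025

print(round(accuracy, 4), round(null_error_rate, 4), round(kappa, 4))
print(round(ci_low, 4), round(ci_high, 4), p_value)
```

Under these assumptions the accuracy, null error rate, and kappa match the table, the interval agrees closely with the reported one, and the P-value is small enough to display as 0 at four decimal places.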

This query returns the following table:

SELECT * FROM acc_output;
acc_output
measure               virginica  setosa  versicolor
Balanced Accuracy     0.8693     1       0.8472
Detection Prevalence  0.3333     0.3333  0.3333
Detection Rate        0.2333     0.3333  0.3
Neg Pred Value        0.95       1       0.85
Pos Pred Value        0.7        1       0.9
Prevalence            0.2667     0.3333  0.4
Sensitivity           0.875      1       0.75
Specificity           0.8636     1       0.9444
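
The per-class measures follow from treating each class in turn as the positive class (one-vs-rest) and reading true/false positives and negatives off the count table. The sketch below uses the standard definitions of these measures; the function and variable names are hypothetical, and it is not a description of Teradata's implementation.

```python
# Count matrix from count_output: rows = observed, columns = predicted.
classes = ["setosa", "versicolor", "virginica"]
matrix = [[10, 0, 0], [0, 9, 3], [0, 1, 7]]
n = sum(map(sum, matrix))


def class_measures(i):
    """One-vs-rest measures for the class at index i."""
    tp = matrix[i][i]                          # observed i, predicted i
    fn = sum(matrix[i]) - tp                   # observed i, predicted other
    fp = sum(row[i] for row in matrix) - tp    # observed other, predicted i
    tn = n - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "Balanced Accuracy": (sensitivity + specificity) / 2,
        "Detection Prevalence": (tp + fp) / n,
        "Detection Rate": tp / n,
        "Neg Pred Value": tn / (tn + fn),
        "Pos Pred Value": tp / (tp + fp),
        "Prevalence": (tp + fn) / n,
        "Sensitivity": sensitivity,
        "Specificity": specificity,
    }


measures = {name: class_measures(i) for i, name in enumerate(classes)}

for name in classes:
    print(name, {key: round(value, 4) for key, value in measures[name].items()})
```

For example, virginica has 7 true positives, 1 false negative, 3 false positives, and 19 true negatives, giving a sensitivity of 7/8 = 0.875 and a specificity of 19/22 = 0.8636.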