1.1 - 8.10 - ConfusionMatrix Example - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

Input

The input table, iris_category_expect_predict, contains 30 rows, each holding the expected and the predicted species of one iris flower. The predicted values can come from any classification function, such as SVMSparsePredict_MLE. The raw iris data set has four predictor attributes (sepal_length, sepal_width, petal_length, petal_width) and three species (setosa, versicolor, virginica); the input table keeps only the row id, the expected species, and the predicted species.

iris_category_expect_predict
id expected_value predicted_value
5 setosa setosa
10 setosa setosa
15 setosa setosa
20 setosa setosa
25 setosa setosa
30 setosa setosa
35 setosa setosa
40 setosa setosa
45 setosa setosa
50 setosa setosa
55 versicolor versicolor
60 versicolor versicolor
65 versicolor versicolor
70 versicolor versicolor
75 versicolor versicolor
80 versicolor versicolor
85 virginica versicolor
90 versicolor versicolor
95 versicolor versicolor
100 versicolor versicolor
105 virginica virginica
110 virginica virginica
115 virginica virginica
120 versicolor virginica
125 virginica virginica
130 versicolor virginica
135 versicolor virginica
140 virginica virginica
145 virginica virginica
150 virginica virginica
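The 30 rows above can be tallied by (expected, predicted) pair to see what a confusion matrix of this input should contain. The following is a minimal Python sketch for checking the example by hand, not part of Vantage; the names `rows`, `labels`, and `counts` are illustrative, and the pairs are copied from the table in id order.

```python
from collections import Counter

# (expected, predicted) pairs copied from iris_category_expect_predict,
# in id order (5, 10, ..., 150).
rows = (
    [("setosa", "setosa")] * 10            # ids 5-50
    + [("versicolor", "versicolor")] * 6   # ids 55-80
    + [("virginica", "versicolor")]        # id 85
    + [("versicolor", "versicolor")] * 3   # ids 90-100
    + [("virginica", "virginica")] * 3     # ids 105-115
    + [("versicolor", "virginica")]        # id 120
    + [("virginica", "virginica")]         # id 125
    + [("versicolor", "virginica")] * 2    # ids 130-135
    + [("virginica", "virginica")] * 3     # ids 140-150
)

labels = ["setosa", "versicolor", "virginica"]
pair_counts = Counter(rows)

# counts[expected][predicted]: one row per observed species,
# one column per predicted species.
counts = {
    exp: {pred: pair_counts[(exp, pred)] for pred in labels}
    for exp in labels
}

for exp in labels:
    print(exp, [counts[exp][pred] for pred in labels])
```

Each printed row lists the counts for one observed species in the column order setosa, versicolor, virginica. The off-diagonal entries are the misclassifications: three versicolor flowers predicted as virginica and one virginica flower predicted as versicolor.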

SQL Call

SELECT * FROM ConfusionMatrix(
  ON iris_category_expect_predict PARTITION BY 1
  OUT TABLE CountTable(count_output)
  OUT TABLE StatTable(stat_output)
  OUT TABLE AccuracyTable(acc_output)
  USING
  ObservationColumn('expected_value')
  PredictColumn('predicted_value')
) AS dt;

Output

 message                                        
 ---------------------------------------------- 
 Success !                                     
 The result has been outputted to output tables

SELECT * FROM count_output;
 observation setosa versicolor virginica 
 ----------- ------ ---------- --------- 
 versicolor  0      9          3        
 setosa      10     0          0        
 virginica   0      1          7        

SELECT * FROM stat_output;
 key                  value            
 -------------------- ---------------- 
 95% CI               (0.6928, 0.9624)
 P-Value [Acc > NIR]  0               
 Mcnemar Test P-Value NA              
 Accuracy             0.8667          
 Null Error Rate      0.6             
 Kappa                0.8             
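Most of the values in stat_output can be recomputed from the counts in count_output. The sketch below uses standard textbook formulas: overall accuracy, the no-information rate (share of the largest observed class), Cohen's kappa, a one-sided exact binomial test of accuracy against the no-information rate, and an exact (Clopper-Pearson) binomial interval. That ConfusionMatrix uses exactly these methods is an assumption; they are chosen because they reproduce the numbers shown. The McNemar test is not covered, and all helper names are illustrative.

```python
from math import comb

# counts[i][j]: observed species i, predicted species j (from count_output),
# in the order setosa, versicolor, virginica.
counts = [
    [10, 0, 0],  # setosa
    [0, 9, 3],   # versicolor
    [0, 1, 7],   # virginica
]

n = sum(map(sum, counts))                                 # 30 rows in total
correct = sum(counts[i][i] for i in range(len(counts)))   # diagonal: 26
observed = [sum(row) for row in counts]                   # row totals
predicted = [sum(col) for col in zip(*counts)]            # column totals

accuracy = correct / n
nir = max(observed) / n          # no-information rate
null_error_rate = 1 - nir

# Cohen's kappa: agreement beyond what the marginal totals alone would give.
chance = sum(o * p for o, p in zip(observed, predicted)) / n**2
kappa = (accuracy - chance) / (1 - chance)


def binom_tail_ge(k, size, p):
    """P(X >= k) for X ~ Binomial(size, p)."""
    return sum(comb(size, i) * p**i * (1 - p) ** (size - i)
               for i in range(k, size + 1))


def bisect_increasing(f, target, iters=60):
    """Find p in [0, 1] with f(p) = target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# One-sided exact test of H0: accuracy <= NIR (assumed method).
p_value = binom_tail_ge(correct, n, nir)

# Clopper-Pearson 95% interval for the accuracy (assumed method).
alpha = 0.05
ci_low = bisect_increasing(lambda p: binom_tail_ge(correct, n, p), alpha / 2)
ci_high = bisect_increasing(lambda p: binom_tail_ge(correct + 1, n, p),
                            1 - alpha / 2)

print("Accuracy", round(accuracy, 4))
print("Null Error Rate", round(null_error_rate, 4))
print("Kappa", round(kappa, 4))
print("P-Value [Acc > NIR]", round(p_value, 4))
print("95% CI", (ci_low, ci_high))
```

Because 26 of 30 predictions are correct while the largest class covers only 12 of 30 rows, the one-sided p-value is far below 0.0001, which is consistent with the 0 shown in stat_output.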

SELECT * FROM acc_output;
 measure              virginica setosa versicolor 
 -------------------- --------- ------ ---------- 
 Specificity          0.8636    1      0.9444    
 Neg Pred Value       0.95      1      0.85      
 Detection Rate       0.2333    0.3333 0.3       
 Balanced Accuracy    0.8693    1      0.8472    
 Sensitivity          0.875     1      0.75      
 Pos Pred Value       0.7       1      0.9       
 Prevalence           0.2667    0.3333 0.4       
 Detection Prevalence 0.3333    0.3333 0.3333
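Each column of acc_output treats one species as the positive class and the other two as negative. As a rough check, the per-class measures can be recomputed from count_output with the usual one-versus-rest definitions. The sketch below is illustrative Python, not Vantage code; `class_metrics` is a hypothetical helper, and it assumes every class is both observed and predicted at least once (true here), so no denominator is zero.

```python
labels = ["setosa", "versicolor", "virginica"]

# counts[i][j]: observed species i, predicted species j (from count_output).
counts = [
    [10, 0, 0],  # setosa
    [0, 9, 3],   # versicolor
    [0, 1, 7],   # virginica
]
n = sum(map(sum, counts))


def class_metrics(k):
    """One-versus-rest measures for the class at index k."""
    tp = counts[k][k]                          # observed k, predicted k
    fn = sum(counts[k]) - tp                   # observed k, predicted other
    fp = sum(row[k] for row in counts) - tp    # observed other, predicted k
    tn = n - tp - fn - fp                      # neither observed nor predicted k
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Pos Pred Value": tp / (tp + fp),
        "Neg Pred Value": tn / (tn + fn),
        "Prevalence": (tp + fn) / n,
        "Detection Rate": tp / n,
        "Detection Prevalence": (tp + fp) / n,
        "Balanced Accuracy": (sensitivity + specificity) / 2,
    }


metrics = {label: class_metrics(k) for k, label in enumerate(labels)}

for label in labels:
    print(label)
    for name, value in metrics[label].items():
        print(f"  {name:<20} {round(value, 4)}")
```

For virginica, for example, 7 of the 8 observed flowers are predicted correctly (Sensitivity 0.875), but 3 of the 10 virginica predictions are actually versicolor, which gives the Pos Pred Value of 0.7.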
