NaiveBayesPredict Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22

Input

Test data: nb_iris_input_test
Model: nb_iris_model

SQL Call

DROP TABLE nb_iris_predict;

CREATE MULTISET TABLE nb_iris_predict AS (
  SELECT * FROM NaiveBayesPredict@coprocessor (
    ON nb_iris_input_test PARTITION BY ANY
    ON nb_iris_model AS model DIMENSION
    USING
    IDCol ('id')
    NumericInputs ('[1:4]')
    Responses ('virginica', 'setosa', 'versicolor')
  ) AS dt
) WITH DATA;

Output

This query returns the following table:

SELECT * FROM nb_iris_predict ORDER BY 1;

The output contains one prediction for each row of the test data set, along with the log likelihood of each category. The function uses these log likelihood values to make the prediction.

id prediction loglik_virginica loglik_setosa loglik_versicolor
5 setosa -60.9907330174083 0.940424559067427 -38.2319825308929
10 setosa -61.5861966261907 -0.173043897170957 -37.6660830556247
15 setosa -64.7169548001753 -3.55476375390931 -42.613272284101
20 setosa -57.7992844148636 0.531796840642284 -35.7613053354934
25 setosa -55.0939143017897 -3.23703029869347 -32.1179858509341
30 setosa -58.0673073752287 0.109611164911179 -34.9285997859276
35 setosa -58.1980267787658 0.660202577013632 -34.9335988704833
40 setosa -58.3538858459019 0.976840811041703 -35.4425587940391
45 setosa -50.3847602463201 -4.36921429673761 -29.0537478266948
50 setosa -59.4745348026195 1.00257959230347 -36.5026022674224
55 versicolor -5.22108005914589 -270.465431908161 -1.7396367893394
60 versicolor -11.3356467465064 -174.565470791378 -2.31925264962004
65 versicolor -12.6496488706934 -138.435722453706 -2.1898005756116
70 versicolor -15.236843619572 -152.47255627778 -2.3538459106499
75 versicolor -8.34632493685681 -214.383653794905 -1.14727508911532
80 versicolor -18.455946984498 -109.900955754698 -3.72743011721095
85 versicolor -7.00283150694931 -249.656488976769 -2.00455589365379
90 versicolor -12.0279925543069 -177.470336291088 -1.74539749109463
95 versicolor -10.1802450220293 -198.037109900803 -1.10567314638237
100 versicolor -10.1315405651018 -187.294956922171 -1.02885306444447
105 virginica -1.58321671192447 -540.56351949849 -14.859643718252
110 virginica -6.11301966870239 -654.801984259278 -28.8385135092999
115 virginica -3.64635253153959 -456.647579953406 -15.3298808321577
120 versicolor -7.73615017754911 -322.909009762056 -3.53629430321742
125 virginica -1.87627054598219 -509.817023097936 -13.7515396871732
130 virginica -3.36908052149115 -469.802937074554 -9.13832860900173
135 versicolor -5.81482980902253 -403.678170868448 -4.51644862072851
140 virginica -1.48430911768034 -463.610989255182 -12.0238603485835
145 virginica -3.82266629516761 -576.395460020916 -22.6942168473031
150 virginica -2.57004648415525 -366.506113945482 -4.84887216455807
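
As a quick check on how the prediction column relates to the log likelihood columns, the following sketch recomputes the highest-scoring category for each row so it can be compared with prediction. It assumes that the function predicts the category with the largest log likelihood, which is consistent with every row shown above. The alias argmax_category is an illustrative name, not a NaiveBayesPredict output column.

```sql
-- Recompute the highest-scoring category per row for comparison with prediction.
-- argmax_category is an illustrative alias, not part of the function output.
SELECT id,
       prediction,
       CASE
         WHEN loglik_virginica >= loglik_setosa
          AND loglik_virginica >= loglik_versicolor THEN 'virginica'
         WHEN loglik_setosa >= loglik_versicolor THEN 'setosa'
         ELSE 'versicolor'
       END AS argmax_category
FROM nb_iris_predict
ORDER BY id;
```

Any row where prediction and argmax_category differ would indicate that the assumption above does not hold for that row.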

Prediction Accuracy

The following SQL code calculates and displays the prediction accuracy.

DROP TABLE nb_predict_accuracy;

CREATE MULTISET TABLE nb_predict_accuracy AS (
  SELECT nb_iris_input_test.id, species, prediction
  FROM nb_iris_predict, nb_iris_input_test
  WHERE nb_iris_input_test.id = nb_iris_predict.id
) WITH DATA;

SELECT (
  SELECT count(id) FROM nb_predict_accuracy
  WHERE prediction = species
) / (
  SELECT count(id) FROM nb_predict_accuracy
) AS prediction_accuracy;

prediction_accuracy
0.93333333333333333333
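
To see which categories account for the errors behind this figure, the accuracy table can also be broken down by actual and predicted category. The following is a minimal sketch that assumes the nb_predict_accuracy table created above. The alias row_count is illustrative. The second query is an alternative single-pass form of the accuracy calculation, and its explicit CAST is a precaution against integer division rather than something the original query requires.

```sql
-- Count rows for each (actual, predicted) pair.
-- Pairs where species and prediction differ are misclassifications.
SELECT species, prediction, COUNT(*) AS row_count
FROM nb_predict_accuracy
GROUP BY species, prediction
ORDER BY species, prediction;

-- Alternative single-pass accuracy: correct predictions divided by all rows.
-- The CAST guards against integer division.
SELECT CAST(SUM(CASE WHEN prediction = species THEN 1 ELSE 0 END) AS FLOAT)
       / COUNT(*) AS prediction_accuracy
FROM nb_predict_accuracy;
```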