Example: How to Use TD_NaiveBayesPredict - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-04-03
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

Every complete example in this document is available in a zip file that you can download. The zip file includes a SQL script file that creates the input tables for the examples. If you are reading this document on https://docs.teradata.com/, you can download the zip file from the attachment in the left sidebar.

This section shows the input table, SQL query, and output tables of an example using TD_NaiveBayesPredict.

The following table contains a subset of housing dataset.
Only part of the dataset is shown in this example.
DROP TABLE housing_test;
CREATE TABLE housing_test (sn integer, price real, lotsize real, bedrooms  integer, bathrms integer, stories integer, driveway varchar(10), recroom varchar(10), fullbase varchar(10), gashw varchar(10), airco varchar(10), garagepl integer, prefarea varchar(10), homestyle varchar(10)) PRIMARY INDEX(sn);
insert into housing_test values(13,27000,1700,3,1,2,'yes','no','no','no','no',0,'no','Classic');
insert into housing_test values(16,37900,3185,2,1,1,'yes','no','no','no','yes',0,'no','Classic');
insert into housing_test values(25,42000,4960,2,1,1,'yes','no','no','no','no',0,'no','Classic');
insert into housing_test values(38,67000,5170,3,1,4,'yes','no','no','no','yes',0,'no','Eclectic');
insert into housing_test values(53,68000,9166,2,1,1,'yes','no','yes','no','yes',2,'no','Eclectic');
...
SELECT * FROM housing_test ORDER BY 1;

TD_NaiveBayesPredict SQL Call

SELECT * FROM TD_NaiveBayesPredict (
  ON housing_test AS InputTable
  ON housing_train_model AS ModelTable DIMENSION
  USING
  IDColumn ('sn')
  NumericInputs('price','lotsize','bedrooms','bathrms','stories','garagepl')
  CategoricalInputs('driveway','recroom','fullbase','gashw','airco','prefarea')
  Responses ('Classic', 'Eclectic', 'bungalow')
  OutputProb('t')
  Accumulate('homestyle')
) AS dt ORDER BY sn;
TD_NaiveBayesPredict Output Table
sn Prediction Loglik_Classic Loglik_Eclectic Loglik_bungalow Prob_Classic Prob_Eclectic Prob_bungalow homestyle
13 Classic -25.2150421 -30.7597 -44.04708467 0.99610712 0.00389288 .00000001 Classic
16 Classic -24.502771 -29.6444 -42.43191857 0.99418596 0.00581403 .00000002 Classic
25 Classic -22.1674829 -27.7935 -41.38195573 0.99640986 0.00359014 .00000000 Classic
38 Eclectic -40.7803961 -28.2814 -33.56134748 0.00000371 0.99492952 .00506677 Eclectic
53 Eclectic -42.1760628 -28.75 -35.44498472 0.00000147 0.99876298 .00123555 Eclectic
... ... ... ... ... ... ... ... ...

As the data is in dense format, you need to use TD_Unpivoting function to change the data to sparse format.

DROP TABLE housing_test_sparse;
CREATE MULTISET TABLE housing_test_sparse AS( 
SELECT * FROM TD_UNPIVOTING(
ON housing_test AS InputTable
USING
IDColumn('sn')
TargetColumns('price','lotsize','bedrooms','bathrms','stories','garagepl','driveway','recroom','fullbase','gashw','airco','prefarea')
AttributeColName('AttributeName') 
ValueColName('AttributeValue')
Accumulate('homestyle')
) AS dt) WITH data;

TD_NaiveBayesPredict SQL Call

SELECT * FROM TD_NaiveBayesPredict (
  ON housing_test_sparse AS InputTable
  ON housing_train_model_sparse AS ModelTable DIMENSION
  USING
  IDColumn ('sn')
  AttributeNameColumn('AttributeName') 
  AttributeValueColumn('AttributeValue')
  Responses ('Classic', 'Eclectic', 'bungalow')
  OutputProb('t')
  Accumulate('homestyle')
) AS dt ORDER BY sn;
TD_NaiveBayesPredict Output Table
sn Prediction Loglik_Classic Loglik_Eclectic Loglik_bungalow Prob_Classic Prob_Eclectic Prob_bungalow homestyle
13 Classic -25.2150421 -30.7597 -44.04708467 0.99610712 0.00389288 .00000001 Classic
16 Classic -24.502771 -29.6444 -42.43191857 0.99418596 0.00581403 .00000002 Classic
25 Classic -22.1674829 -27.7935 -41.38195573 0.99640986 0.00359014 .00000000 Classic
38 Eclectic -40.7803961 -28.2814 -33.56134748 0.00000371 0.99492952 .00506677 Eclectic
53 Eclectic -42.1760628 -28.75 -35.44498472 0.00000147 0.99876298 .00123555 Eclectic
... ... ... ... ... ... ... ... ...