Using the Naïve Bayes Model with Teradata Python Package

Teradata® Python Package User Guide

Product: Teradata Python Package
Release Number: 16.20
Published: February 2020
Language: English (United States)
Last Update: 2020-02-29
dita:id: B700-4006
Product Category: Teradata Vantage

This section uses the "iris" dataset for illustration. The dataset contains 3 classes of 50 instances each, where each class refers to a type of iris plant.

In this example, you split the "iris" dataset into a training dataset and a test dataset, build a Naïve Bayes Classifier model on the training dataset, and then apply the model to the test dataset to evaluate its performance.

This example assumes that the "iris" dataset is already loaded in the database table "nb_input_iris" and that a connection to the database has already been established.
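
The steps below split the table with the predicates `id MOD 5 <> 0` (training) and `id MOD 5 = 0` (test). As a rough illustration of how that rule partitions the rows, the following sketch applies the same arithmetic to plain Python integers. It assumes, hypothetically, that the "id" column holds the consecutive values 1 through 150; the actual ids in "nb_input_iris" are not shown in this section.

```python
# Hypothetical stand-in for the "id" column: 150 consecutive row ids.
# This is an assumption; the real table may number its rows differently.
ids = range(1, 151)

# Mirrors "WHERE id MOD 5 <> 0" (training rows).
train_ids = [i for i in ids if i % 5 != 0]

# Mirrors "WHERE id MOD 5 = 0" (test rows).
test_ids = [i for i in ids if i % 5 == 0]

print(len(train_ids), len(test_ids))  # 120 30
```

Under that assumption, every fifth row is held out, which gives roughly an 80/20 split with no row appearing in both datasets.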

  1. Import the required modules.
    from teradataml.analytics.mle.NaiveBayes import NaiveBayes
    from teradataml.analytics.sqle.NaiveBayesPredict import NaiveBayesPredict
    from teradataml.dataframe.dataframe import DataFrame
  2. Create a teradataml DataFrame "nb_iris_input_train" for the training dataset from the "nb_input_iris" table. The query keeps the rows whose "id" is not divisible by 5.
    nb_iris_input_train = DataFrame.from_query("SELECT * FROM nb_input_iris WHERE id MOD 5 <> 0")
  3. Train a new Naïve Bayes Classifier model on the training DataFrame "nb_iris_input_train", using the NaiveBayes function from the teradataml package.
    # Train the model: "species" is the response, the four measurements are the predictors
    naivebayes_train = NaiveBayes(formula="species ~ petal_length + sepal_width + petal_width + sepal_length", data=nb_iris_input_train)
    Once the model is created, you can apply the model to the test dataset.
  4. Create a teradataml DataFrame "nb_iris_input_test" for the test dataset from the "nb_input_iris" table. The query keeps the remaining rows, whose "id" is divisible by 5.
    nb_iris_input_test = DataFrame.from_query("SELECT * FROM nb_input_iris WHERE id MOD 5 = 0")
  5. Predict the flower type by applying the Naïve Bayes model to the test DataFrame "nb_iris_input_test", using the NaiveBayesPredict function.
    # Generate predictions from the model produced by the train function
    naivebayes_predict_result = NaiveBayesPredict(newdata=nb_iris_input_test,
                                                  modeldata=naivebayes_train,
                                                  id_col="id",
                                                  responses=["virginica", "setosa", "versicolor"])
  6. Inspect the results.
    naivebayes_predict_result
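
The steps above stop at inspecting the predictions; they do not show how to score them. A minimal sketch of one way to evaluate the model is given below, assuming the actual and predicted species labels for the test rows have already been matched by "id" and brought into two Python lists. The helper names `accuracy` and `confusion_counts` are hypothetical and are not part of teradataml, and the labels shown are made-up sample values, not output from this example.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Return the fraction of rows whose predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot score an empty test set")
    matches = sum(a == p for a, p in zip(actual, predicted))
    return matches / len(actual)


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


# Made-up labels for illustration only; not results from the iris example.
actual = ["setosa", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "versicolor", "versicolor", "virginica"]

print(accuracy(actual, predicted))  # 0.75
print(confusion_counts(actual, predicted)[("virginica", "versicolor")])  # 1
```

The pair counts show which classes are confused with each other, which a single accuracy figure hides. How the labels are extracted from the prediction output depends on its column names, which this section does not list.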