Logistic Regression Scoring - Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: March 2023
Language: English (United States)
Last Update: 2024-01-02
Product Category: Teradata Vantage

Once a logistic regression model has been built, it can be used to “score” new data, that is, to estimate the value of the dependent variable in the model using data for which its value may not be known. Scoring uses the values of the b-coefficients in the logistic regression model and the names of the independent variable columns they correspond to. This information resides in the results tables stored in the database by Analytics Library. Other information needed includes the name of the table in which the data resides, the name of the new table to be created, and primary index information for the new table.

Scoring a logistic regression model requires steps beyond those required in scoring a linear regression model. The result of scoring a logistic regression model is a new table containing the following:
  • The primary index columns
  • The probability that the dependent variable is 1 (representing the response value) rather than 0 (representing the non-response value)
  • Optionally, an estimate of the dependent variable, either 0 or 1, based on a user-specified threshold value
For example, if the threshold value is 0.5, then a value of 1 is estimated if the probability value is greater than or equal to 0.5, and a value of 0 is estimated otherwise. The probability is based on the logistic regression functions given earlier.
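The calculation described above can be sketched in Python. This is a minimal illustration, not the Analytics Library implementation: it assumes the standard logistic function, and the names `score_row`, `coefficients`, and `intercept`, along with the column names and coefficient values, are hypothetical.

```python
import math

def score_row(row, coefficients, intercept, threshold=None):
    """Return (probability, estimate) for one input row.

    row          -- dict mapping column name to numeric value (hypothetical layout)
    coefficients -- dict mapping independent variable column name to b-coefficient
    intercept    -- constant term of the model
    threshold    -- optional cutoff; when None, no 0/1 estimate is produced
    """
    # Linear predictor: constant term plus each b-coefficient times its column value.
    z = intercept + sum(b * row[col] for col, b in coefficients.items())

    # Standard logistic function, written to avoid overflow for large |z|.
    if z >= 0:
        probability = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        probability = e / (1.0 + e)

    # Optional estimate: 1 when the probability meets or exceeds the threshold.
    estimate = None
    if threshold is not None:
        estimate = 1 if probability >= threshold else 0
    return probability, estimate

# Illustrative values only; these are not taken from any real model.
coefficients = {"income": 0.5, "age": -0.25}
row = {"income": 2.0, "age": 4.0}
probability, estimate = score_row(row, coefficients, intercept=0.0, threshold=0.5)
```

Here the linear predictor is 0, so the probability is exactly 0.5 and, because the comparison is "greater than or equal to", the estimate is 1.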

You can achieve different results based on the threshold value applied to the probability. See Model Evaluation to determine what this threshold value should be.
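To illustrate how the threshold changes the outcome, the following sketch applies three thresholds to the same set of probabilities. The function name `estimate_response` and the probability values are made up for illustration and are not output from Analytics Library.

```python
def estimate_response(probabilities, threshold):
    """Estimate 1 when a probability is >= threshold, otherwise 0."""
    return [1 if p >= threshold else 0 for p in probabilities]

# Hypothetical scored probabilities for four rows.
probabilities = [0.15, 0.45, 0.50, 0.80]

low = estimate_response(probabilities, 0.3)      # [0, 1, 1, 1]
default = estimate_response(probabilities, 0.5)  # [0, 0, 1, 1]
high = estimate_response(probabilities, 0.7)     # [0, 0, 0, 1]
```

Lowering the threshold classifies more rows as responses, and raising it classifies fewer, which is why the choice is usually informed by model evaluation.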

Logistic Scoring applies a Logistic Regression model to a dataset containing the same columns as those used in building the model (with the exception that the scoring input table need not include the predicted or dependent variable column unless model evaluation is requested).
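The column requirement can be expressed as a simple pre-check. The sketch below is an assumption about how one might validate an input table before scoring; the function `missing_model_columns` and all column names are hypothetical and do not correspond to an Analytics Library call.

```python
def missing_model_columns(model_columns, input_columns,
                          dependent_column=None, evaluate=False):
    """Return the sorted list of columns the scoring input table lacks.

    The dependent column is required only when model evaluation is requested.
    """
    required = set(model_columns)
    if evaluate and dependent_column is not None:
        required.add(dependent_column)
    return sorted(required - set(input_columns))

# Hypothetical model and input table columns.
model_columns = ["income", "age"]
input_columns = ["cust_id", "income", "age"]

# Scoring only: the dependent column is not needed.
scoring_gaps = missing_model_columns(model_columns, input_columns)

# Scoring with evaluation: the dependent column must be present.
evaluation_gaps = missing_model_columns(
    model_columns, input_columns, dependent_column="responded", evaluate=True
)
```

In this example the table is sufficient for scoring alone, but evaluation would additionally require the `responded` column.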