Logistic Regression Scoring | Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: March 2023
Language: English (United States)
Last Update: 2024-01-02

Product Category: Teradata Vantage

Once a logistic regression model has been built, it can be used to “score” new data, that is, to estimate the value of the dependent variable for data in which that value may not be known. Scoring uses the values of the b-coefficients in the logistic regression model and the names of the independent variable columns they correspond to. This information resides in the results tables that Analytics Library stores in the database. Scoring also requires the name of the table in which the data resides, the name of the new table to be created, and primary index information for the new table.
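As a reminder of how the b-coefficients enter the calculation, the standard form of the logistic model can be written as below. The notation is an assumption made for illustration ($b_0$ for the constant term, $b_i$ for the coefficient of independent variable $x_i$, $n$ independent variables) and may differ from the notation used elsewhere in this guide.

```latex
z = b_0 + \sum_{i=1}^{n} b_i x_i,
\qquad
P(y = 1 \mid x) = \frac{1}{1 + e^{-z}}
```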

Scoring a logistic regression model requires steps beyond those required in scoring a linear regression model. The result of scoring a logistic regression model is a new table containing the following:

Primary index columns
The probability that the dependent variable is 1 (representing the response value) rather than 0 (representing the non-response value)
Optionally, an estimate of the dependent variable, either 0 or 1, based on a user-specified threshold value

For example, if the threshold value is 0.5, a value of 1 is estimated if the probability is greater than or equal to 0.5; otherwise, a value of 0 is estimated. The probability is computed using the logistic regression functions given earlier.
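The calculation described above can be sketched in Python. This is a minimal illustration of the arithmetic only, not the Analytics Library implementation, which runs inside the database. The function name `score_row`, the column names, and the coefficient values are all hypothetical, and the standard logistic function is assumed.

```python
import math

def score_row(row, coefficients, intercept, threshold=None):
    """Return (probability, estimate) for one row of input data.

    Hypothetical helper: `coefficients` maps independent variable
    column names to their b-coefficients.
    """
    # Linear predictor: constant term plus the sum of coefficient * column value.
    z = intercept + sum(b * row[name] for name, b in coefficients.items())
    # Logistic function, written in two branches to avoid overflow in exp().
    if z >= 0:
        probability = 1.0 / (1.0 + math.exp(-z))
    else:
        probability = math.exp(z) / (1.0 + math.exp(z))
    # The 0/1 estimate is optional and produced only when a threshold is given.
    estimate = None if threshold is None else int(probability >= threshold)
    return probability, estimate

# Hypothetical model and input row, chosen so the linear predictor is exactly 0.
coefficients = {"income": 0.8, "age": -0.5}
row = {"income": 1.0, "age": 1.6}
probability, estimate = score_row(row, coefficients, intercept=0.0, threshold=0.5)
print(probability, estimate)
```

In this sketch the probability lands exactly on the threshold, so the estimate is 1, matching the "greater than or equal to" rule. Calling `score_row` without a threshold returns the probability only.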

You can achieve different results based on the threshold value applied to the probability. See Model Evaluation to determine what this threshold value should be.
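To illustrate how the threshold changes the outcome, the following sketch applies several thresholds to the same set of probabilities. The probability values and the helper name `estimate_response` are made up for illustration and do not come from a real model.

```python
def estimate_response(probabilities, threshold):
    """Map each probability to 1 if it meets the threshold, else 0."""
    return [int(p >= threshold) for p in probabilities]

# Hypothetical scored probabilities for five rows.
probabilities = [0.15, 0.40, 0.55, 0.70, 0.90]

for threshold in (0.3, 0.5, 0.7):
    print(threshold, estimate_response(probabilities, threshold))
```

A lower threshold labels more rows as responses (1), while a higher threshold labels fewer. Which trade-off is appropriate depends on the evaluation of the model.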

Logistic Scoring applies a Logistic Regression model to a dataset containing the same columns as those used in building the model. The one exception is the predicted (dependent) variable column, which the scoring input table need not include unless model evaluation is requested.
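The column requirement can be expressed as a simple check. The sketch below is illustrative only: the helper `missing_columns` and the column names are hypothetical, and Analytics Library performs its own validation.

```python
def missing_columns(model_columns, input_columns, dependent_column, evaluate=False):
    """Return the required columns that are absent from the scoring input.

    `model_columns` lists the independent variable columns used to build
    the model. The dependent column is required only when evaluating.
    """
    required = set(model_columns)
    if evaluate:
        required.add(dependent_column)
    else:
        required.discard(dependent_column)
    return sorted(required - set(input_columns))

# Hypothetical model and scoring input table columns.
model_columns = ["income", "age"]
input_columns = ["cust_id", "income", "age"]

print(missing_columns(model_columns, input_columns, "responded"))
print(missing_columns(model_columns, input_columns, "responded", evaluate=True))
```

With these inputs, scoring alone has no missing columns, while requesting evaluation reports the dependent column `responded` as missing.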