Computational Technique - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.4
Published: July 2017
Language: English (United States)
Last Update: 2018-05-03
dita:id: B035-2302
Product Category: Software

Unlike linear regression, logistic regression calculations cannot be based on an SSCP (sums of squares and cross-products) matrix. Teradata Warehouse Miner therefore dynamically generates SQL to perform the calculations required to solve the model, produce model diagnostics, produce success tables, and score new data with a model once it is built. To improve performance with small data sets, however, Teradata Warehouse Miner provides an optional in-memory calculation feature, which is also helpful when one of the stepwise options is used. This feature selects the data into the client system’s memory if the data fits within a user-specified maximum amount of memory. The maximum, in megabytes, is specified on the expert options tab of the analysis input screen and can be adjusted to suit the workstation and network. Setting it to zero disables the feature.
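The in-memory decision described above can be sketched in a few lines of Python. This is an illustration only, not the product's code: the function name `use_in_memory` and the idea of passing in a precomputed size estimate are assumptions, since the guide does not say how Teradata Warehouse Miner measures the size of the data.

```python
def use_in_memory(estimated_data_megabytes, max_memory_megabytes):
    """Decide whether to pull the data into client memory.

    Hypothetical helper: how the real product estimates the data size
    is not documented here, so the estimate is simply an input.
    """
    # A maximum of zero disables the in-memory feature entirely.
    if max_memory_megabytes <= 0:
        return False
    # Otherwise the data is used in memory only if it fits within the limit.
    return estimated_data_megabytes <= max_memory_megabytes
```

For example, `use_in_memory(12.5, 100)` returns `True`, while `use_in_memory(12.5, 0)` returns `False` because the feature is switched off.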

Teradata Warehouse Miner offers two optimization techniques for logistic regression: the default method of iteratively reweighted least squares (RLS), equivalent to the Gauss-Newton technique, and the quasi-Newton method of Broyden-Fletcher-Goldfarb-Shanno (BFGS). The RLS method is considerably faster than the BFGS method unless there are a large number of columns, because the cost of RLS grows roughly as the square of the number of columns. Having a choice of techniques is useful for more than performance reasons, however, since in some cases one technique may have better convergence properties than the other.
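To make the RLS technique concrete, the following is a minimal pure-Python sketch of iteratively reweighted least squares for a binary logistic model. It is not the SQL that Teradata Warehouse Miner generates; the function names `solve_linear` and `fit_logistic_irls`, the zero starting point, and the convergence tolerance are all assumptions chosen for illustration. Note the p-by-p matrix `h` built on every pass, which is where the roughly quadratic growth in the number of columns comes from.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_logistic_irls(rows, labels, max_iter=25, tol=1e-8):
    """Fit coefficients (intercept first) for 0/1 labels.

    rows: list of feature lists without the constant term.
    """
    x = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(x[0])
    beta = [0.0] * p
    for _ in range(max_iter):
        h = [[0.0] * p for _ in range(p)]  # weighted cross-products X'WX
        g = [0.0] * p                      # gradient X'(y - mu)
        for xi, yi in zip(x, labels):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))  # predicted probability
            w = mu * (1.0 - mu)                # observation weight
            for j in range(p):
                g[j] += xi[j] * (yi - mu)
                for k in range(p):
                    h[j][k] += w * xi[j] * xi[k]
        # Solving (X'WX) step = X'(y - mu) is the weighted least squares update.
        step = solve_linear(h, g)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta
```

A call such as `fit_logistic_irls([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]], [0, 0, 1, 0, 1, 1])` returns an intercept and a positive slope. The sketch assumes the classes are not perfectly separable; with separable data the coefficients diverge and the exponential can overflow.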

You may specify the technique yourself or let Teradata Warehouse Miner select it automatically. With the automatic option, the program selects RLS if there are fewer than 35 independent variable columns; otherwise, it selects BFGS.
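The selection rule above amounts to a single threshold test. The sketch below expresses it in Python; the function name `choose_technique`, the `requested` parameter, and the string values are hypothetical and do not correspond to actual product option names.

```python
RLS_COLUMN_LIMIT = 35  # threshold stated in the text above

def choose_technique(num_independent_columns, requested="automatic"):
    """Return "RLS" or "BFGS" (hypothetical helper, not a product API)."""
    # An explicit user choice always wins.
    if requested in ("RLS", "BFGS"):
        return requested
    if requested != "automatic":
        raise ValueError("requested must be 'RLS', 'BFGS', or 'automatic'")
    # Automatic: RLS below the column limit, BFGS at or above it.
    if num_independent_columns < RLS_COLUMN_LIMIT:
        return "RLS"
    return "BFGS"
```

With this sketch, 34 columns yield `"RLS"` and 35 columns yield `"BFGS"`, while `choose_technique(3, requested="BFGS")` honors the explicit choice.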