Linear Regression Scoring applies a Linear Regression model to an input table that contains the same independent variable columns as the model. The result is an output score table that contains, at a minimum, one or more key columns and an estimate of the dependent variable in the model. The user may also choose to perform model evaluation, either separately or in combination with scoring. When model evaluation is requested, the input table must contain a column representing the dependent variable in the model, and a report is produced as a result data set containing the standard error of estimate as well as the minimum, maximum, and average absolute error. When both scoring and evaluation are requested, the output table automatically includes the residual value, calculated as the original value of the dependent variable minus its predicted value. The residual value may also be requested when only scoring is performed.
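The scoring and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the td_analyze implementation: the function names `score_rows` and `evaluate`, the column names, and the sample values are hypothetical, and the standard error of estimate is assumed here to use n - p - 1 degrees of freedom (n rows, p independent variables), which this document does not state.

```python
import math

def score_rows(rows, coefficients, intercept, key="id", dependent=None):
    """Return one scored row per input row: key, estimate, optional residual."""
    scored = []
    for row in rows:
        # Estimate = intercept + sum of coefficient * independent variable value.
        estimate = intercept + sum(coef * row[col] for col, coef in coefficients.items())
        out = {key: row[key], "estimate": estimate}
        if dependent is not None:
            # Residual = original value minus predicted value.
            out["residual"] = row[dependent] - estimate
        scored.append(out)
    return scored

def evaluate(scored, n_predictors):
    """Build an evaluation report from scored rows that carry a residual."""
    abs_errors = [abs(r["residual"]) for r in scored]
    n = len(abs_errors)
    sse = sum(e * e for e in abs_errors)
    dof = n - n_predictors - 1  # assumption: n - p - 1 degrees of freedom
    return {
        "standard_error_of_estimate": math.sqrt(sse / dof) if dof > 0 else None,
        "min_abs_error": min(abs_errors),
        "max_abs_error": max(abs_errors),
        "avg_abs_error": sum(abs_errors) / n,
    }

# Hypothetical model y = 1.0 + 2.0 * x and a four-row input table.
rows = [
    {"id": 1, "x": 1.0, "y": 3.5},
    {"id": 2, "x": 2.0, "y": 4.0},
    {"id": 3, "x": 3.0, "y": 7.0},
    {"id": 4, "x": 4.0, "y": 9.5},
]
scored = score_rows(rows, {"x": 2.0}, 1.0, dependent="y")
report = evaluate(scored, n_predictors=1)
```

Passing `dependent=None` corresponds to scoring without the residual; in that case the input rows do not need the dependent column, and `evaluate` cannot be applied.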
- If one or more group by columns are present in both the input table to be scored and the model input table, each row in the input table to be scored is scored using the model whose group by column values match that row.
- If an error such as “Constant columns detected” occurs for a particular combination of group by column values, the predicted value of the dependent column will be null for any row containing that combination of group by column values. The error message will also be placed in the column name in the model report.
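The group by behavior in the list above can be sketched in Python as follows. This is an illustration only: the function `score_by_group`, the layout of the `models` dictionary, and the sample column names are hypothetical. Treating a group that has no model entry the same way as a group with an error is an assumption of this sketch, not something the text specifies.

```python
def score_by_group(rows, models, group_cols, key="id"):
    """Score each row with the model that matches its group by column values."""
    scored = []
    for row in rows:
        group = tuple(row[c] for c in group_cols)
        model = models.get(group)
        if model is None or model.get("error"):
            # The model for this group failed (for example, "Constant columns
            # detected"), so the predicted value is null (None).
            # Assumption: a group with no model entry is handled the same way.
            estimate = None
        else:
            estimate = model["intercept"] + sum(
                coef * row[col] for col, coef in model["coefficients"].items()
            )
        out = {key: row[key]}
        out.update({c: row[c] for c in group_cols})
        out["estimate"] = estimate
        scored.append(out)
    return scored

# Hypothetical per-group models keyed by the group by column values.
models = {
    ("F",): {"intercept": 1.0, "coefficients": {"x": 2.0}},
    ("M",): {"error": "Constant columns detected"},
}
rows = [
    {"id": 1, "gender": "F", "x": 2.0},
    {"id": 2, "gender": "M", "x": 2.0},
]
scored = score_by_group(rows, models, group_cols=["gender"])
```

In this example the row in group `F` receives an estimate from its model, while the row in group `M` receives a null estimate because its model carries an error.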
To execute the stand-alone version of the linear regression algorithm, or to score a model built by this algorithm, the td_analyze stored procedure must be installed on the Teradata system with the appropriate permissions granted. Refer to In-Database Analytic Function Setup for instructions on how to install td_analyze.