CALL td_analyze (
'logisticscore',
'required_parameter_list [ optional_parameter; [...] ]'
);
- required_parameter_list
database = input_database_name;
tablename = input_table_name;
modeldatabase = model_database_name;
modeltablename = model_table_name;
outputdatabase = output_database_name;
outputtablename = output_table_name;
- optional_parameter
{ estimate = column_name |
gensqlonly = { true | false } |
index = column_name [,...] |
lifttable = { true | false } |
overwrite = { true | false } |
probability = column_name |
retain = column_name [,...] |
samplescoresize = sample_score_size |
scoringmethod = { score | evaluate | scoreandevaluate } |
successtable = { true | false } |
threshold = threshold |
thresholdbegin = threshold_begin |
thresholdend = threshold_end |
thresholdincrement = threshold_increment |
thresholdtable = { true | false }
}
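As an illustration of how the parameter string is assembled, the following sketch scores a table with a previously built model. The database, table, and column names (val_source, customer_analysis, val_results, logistic_model, logistic_scores, cust_id, response_prob) are hypothetical placeholders, and the example assumes the model table was produced by an earlier logistic regression run.

```sql
/* Score only (scoringmethod defaults to score). All object names are hypothetical. */
/* probability names the output column; index and retain assume cust_id is a unique key of the input table. */
CALL td_analyze (
'logisticscore',
'database=val_source;tablename=customer_analysis;modeldatabase=val_results;modeltablename=logistic_model;outputdatabase=val_results;outputtablename=logistic_scores;probability=response_prob;index=cust_id;'
);
```

Each parameter is written as name=value and terminated with a semicolon, following the syntax above. Adding gensqlonly=true to the string would return the generated SQL instead of running it.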
Syntax Elements
- database
- The database containing the table to analyze.
- tablename
- The table containing the columns to analyze, representing the independent variables in the analysis. It must reside in the database indicated by the database parameter.
- modeldatabase
- The database containing the table representing the logistic regression model input to the analysis.
- modeltablename
- The table containing the logistic regression model that is used to score the data. It must reside in the database indicated by the modeldatabase parameter.
- outputdatabase
- The database containing the output table.
- outputtablename
- The output table containing the predicted values of the dependent variable. It must reside in the database indicated by the outputdatabase parameter.
- estimate
- [Optional] The name of the score output table column that contains the estimated value of the dependent variable. If the name is not unique, the function prepends "_tm_" to it.
- You must specify either estimate or probability.
- gensqlonly
- [Optional] True returns the SQL for the function as a result set but does not run it.
- False runs the SQL for the function but does not return it as a result set.
- Default: false
- index
- [Optional] The columns for the primary index of the score output table. These columns must form a unique key for the score output table. Otherwise, a given observation has more than one score.
- These index columns are included both in the Primary Index clause and the select list.
- Default: Primary index columns of the input table
- lifttable
- [Optional] Whether to build a lift table (a table of information required to build a lift chart) and include it in the XML output string of the function.
- Disallowed if scoringmethod is score.
- The name of the lift table is output_table_name with the suffix "_txt".
- This table splits the computed probability values into deciles with counts and percentages to demonstrate what happens when rows of ordered probabilities accumulate.
- Default: true if scoringmethod is evaluate or scoreandevaluate
- probability
- [Optional] The name of the score output table column that contains the probability that the dependent value is equal to the response value. If the name is not unique, the function prepends "_tm_" to it.
- You must specify either estimate or probability.
- retain
- [Optional] One or more input table columns to copy to the output table.
- samplescoresize
- [Optional] If scoringmethod=score or scoringmethod=scoreandevaluate, the number of output table rows to show in a sample of the result set (an integer).
- Default behavior: Function returns no sample.
- scoringmethod
- [Optional] Whether to score (only), evaluate (only), or score and evaluate.
- With the options evaluate and scoreandevaluate, the function outputs a confusion matrix table in XML format. The table includes counts of predicted and actual values of the dependent variable of the logistic regression model and counts of correct and incorrect predictions.
- Default: score
- successtable
- [Optional] Whether to include the Success Table in the function XML output string, showing counts of predicted and actual values of the dependent variable of the logistic regression model.
- The Success Table is similar to the Decision Tree Confusion Matrix, but the Success Table includes only two values of the dependent variable, response and nonresponse.
- Disallowed if scoringmethod is score.
- Default: true
- threshold
- [Optional] The value that determines the estimated value of the dependent variable, as follows:
Probability that Dependent Variable Value = 1 | Estimated Dependent Variable Value
Greater than or equal to threshold | 1
Less than threshold | 0
- Default: 0.5
- thresholdbegin
- [Optional] The beginning threshold value for the Multithreshold Success Table (see thresholdtable).
- Default: 0
- thresholdend
- [Optional] The ending threshold value for the Multithreshold Success Table.
- Default: 0
- thresholdincrement
- [Optional] The difference in threshold values between adjacent rows in the Multithreshold Success Table.
- Default: 0
- thresholdtable
- [Optional] Whether to include the Multithreshold Success Table in the function XML output string.
- Each row of the Multithreshold Success Table is a Prediction Success Table with a different threshold value, determined by thresholdbegin, thresholdend, and thresholdincrement. In this context, the threshold is the value above which the predicted probability indicates a response.
- Disallowed if scoringmethod is score.
- Default: true
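The evaluation options described above can be combined in a single call. The sketch below is a hedged example, not a reference invocation: all object and column names are hypothetical, the threshold values are arbitrary illustrations, and it assumes the input table also contains the actual values of the dependent variable so that predictions can be compared with them.

```sql
/* Score and evaluate in one call. All object names and numeric values are hypothetical. */
/* threshold=0.6: a row is estimated as 1 when its probability is greater than or equal to 0.6. */
/* thresholdbegin, thresholdend, and thresholdincrement set the threshold values used for the rows of the Multithreshold Success Table. */
CALL td_analyze (
'logisticscore',
'database=val_source;tablename=customer_analysis;modeldatabase=val_results;modeltablename=logistic_model;outputdatabase=val_results;outputtablename=logistic_scores_eval;estimate=predicted_response;index=cust_id;scoringmethod=scoreandevaluate;threshold=0.6;thresholdtable=true;thresholdbegin=0.1;thresholdend=0.9;thresholdincrement=0.1;'
);
```

Because scoringmethod is scoreandevaluate, the lifttable, successtable, and thresholdtable options are allowed here; with scoringmethod=score they are disallowed.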