5.4.5 - Linear Regression - INPUT - Data Selection - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.5
Published: February 2018
Language: English (United States)
Last Update: 2018-05-04
  1. On the Linear Regression dialog box, click INPUT.
  2. Click Data Selection.
    Linear Regression > Input > Data Selection

  3. On this screen, select:
    • Select Input Source — Select one of three sources of input: Table, Matrix, or Analysis.

      With the Input Source Table, the user can select from available databases, tables (or views), and columns in the usual manner. In this case, a matrix is built dynamically and discarded when the algorithm completes execution.

      With the Input Source Matrix, the user can select from available matrices created by the Build Matrix function. The advantage is that the matrix selected for input remains available for further analysis after the algorithm completes, for example by selecting a different subset of columns from the same matrix.

      With the Input Source Analysis, the user can select directly from the output of another analysis of qualifying type in the current project. In this case, a matrix is built dynamically and discarded when the algorithm completes execution. The analyses that can be selected from directly are all of the Analytic Data Set (ADS) and Reorganization analyses (except Refresh). In place of Available Databases, the user selects from Available Analyses, and Available Tables then lists all the output tables that will eventually be produced by the selected analysis.

      Since this analysis cannot select from a volatile input table, Available Analyses will contain only those qualifying analyses that create an output table or view.

    • Select Columns From One Table
      • Available Databases (only for Input Source equal to Table) — All the databases which are available for the Linear Regression analysis.
      • Available Matrices (only for Input Source equal to Matrix) — When the Input Source is Matrix, a matrix must first be built with the Build Matrix function before linear regression can be performed. Select the matrix that summarizes the data to be analyzed.
        The matrix must have been built from more rows than the number of selected columns. Otherwise, the Linear Regression analysis produces a singular matrix and fails.
      • Available Analyses (only for Input Source equal to Analysis) — All the analyses that are available for the Linear Regression analysis.
      • Available Tables (only for Input Source equal to Table or Analysis) — All the tables that are available for the Linear Regression analysis.
      • Available Columns — All the columns that are available for the Linear Regression analysis.
      • Selected Columns — Select columns by highlighting them and then either dragging and dropping them into the Selected Columns window or clicking the arrow button to move them there.
        The Selected Columns window is a split window; columns can be inserted as either Dependent or Independent columns. Make sure the correct portion of the window is highlighted before moving columns. The Dependent column is the column whose value is predicted by the linear regression model. The algorithm requires that the Dependent and Independent columns be of numeric type (or contain numbers in character format).
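
The row-count requirement on an input matrix can be illustrated outside the product. The following is a minimal sketch in Python, not Teradata Warehouse Miner code. It assumes the matrix produced by Build Matrix behaves like a cross-products summary (X'X) of the input rows, which this section does not state explicitly. The function names and the sample data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def cross_products(rows):
    """Return X'X for a list of equal-length observation rows."""
    n_cols = len(rows[0])
    return [
        [sum(Fraction(r[i]) * Fraction(r[j]) for r in rows) for j in range(n_cols)]
        for i in range(n_cols)
    ]


def rank(matrix):
    """Compute the rank by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    found = 0  # number of pivot rows located so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[found], m[pivot] = m[pivot], m[found]
        for r in range(found + 1, n_rows):
            factor = m[r][col] / m[found][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[found])]
        found += 1
    return found


def is_singular(rows):
    """True when the cross-products matrix built from rows cannot be inverted."""
    xtx = cross_products(rows)
    return rank(xtx) < len(xtx)


# Hypothetical observations: a constant 1 followed by two numeric columns.
too_few = [[1, 2, 5], [1, 3, 7]]                        # 2 rows, 3 columns
enough = [[1, 2, 5], [1, 3, 7], [1, 4, 6], [1, 5, 9]]   # 4 rows, 3 columns

print(is_singular(too_few))
print(is_singular(enough))
```

Under these assumptions, `is_singular(too_few)` is true because two rows cannot determine three columns, while `is_singular(enough)` is false. Having more rows than columns is necessary but not sufficient: columns that are exact linear combinations of one another also yield a singular matrix.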