Select Input Source

Teradata Warehouse Miner User Guide - Volume 1: Introduction and Profiling

Teradata Profile users may skip this section.
  • Input Source — Selecting the Table option lets the user select from available databases, tables (or views) and columns in the usual manner. Selecting the Analysis option lets the user select directly from the output of another analysis of qualifying type in the current project.

    Selectable analyses include all of the Analytic Data Set and Reorganization analyses except Refresh, namely Build ADS, Variable Creation, Variable Transformation, Denorm, Join, Merge, Partition and Sample. In addition, a Free Form SQL analysis may be selected directly when certain information has been provided in that analysis. See Free Form SQL for more information.

In place of Available Databases, the user may select from Available Analyses, which lists all “qualifying” analyses of the above types in the current project. An analysis does not qualify for inclusion in the pull-down list if it does not create a table or view and the referencing analysis is an algorithm, matrix, scoring or Data Explorer analysis, because those referencing analyses cannot access a volatile table for input.
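The qualification rule above can be written as a small predicate. This is only an illustrative sketch: the function name, the parameter names and the string labels for analysis kinds are hypothetical and are not part of Teradata Warehouse Miner. It also assumes the referenced analysis is already one of the selectable types.

```python
# Hypothetical labels for the referencing analyses that cannot read a volatile table.
VOLATILE_INCOMPATIBLE = {"algorithm", "matrix", "scoring", "data_explorer"}


def qualifies(creates_table_or_view: bool, referencing_kind: str) -> bool:
    """Return True if a selectable analysis may be listed under Available Analyses.

    creates_table_or_view: whether the referenced analysis stores its output
        in a table or view (if not, a volatile table is substituted).
    referencing_kind: hypothetical label for the type of the referencing analysis.
    """
    needs_volatile_table = not creates_table_or_view
    return not (needs_volatile_table and referencing_kind in VOLATILE_INCOMPATIBLE)


# An analysis without an output table is excluded only for the four listed kinds.
print(qualifies(False, "scoring"))  # False
print(qualifies(False, "join"))     # True
print(qualifies(True, "scoring"))   # True
```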

Available Tables offers a list of all the output tables that will eventually be produced by the selected analysis, or a single entry with the name of the analysis under the label Volatile Table if the analysis does not create an output table or view.

Selecting an analysis as input rather than a table or view has several advantages.
  1. It ties together the analyses needed to create an analytic data set. This is a prerequisite for refreshing a data set with a possibly different target date, anchor table or output table, and for publishing the SQL to create an analytic data set in the Model Manager application.
  2. It can be used to automatically substitute a volatile table for the output of an analysis when an output table or view is not created, thus saving time and permanent space in creating data sets, as well as removing the need to name intermediate tables or views.
  3. When an analysis produces an output table or view, the analysis that reads it does not have to know the name of the table or view it is using as input.

When an analysis is executed, either individually or within the context of an entire project, the application automatically executes any referenced analyses first. Additional considerations apply when adding or deleting an existing analysis that refers to other analyses.
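The "referenced analyses first" behavior amounts to a depth-first, dependencies-first ordering. The sketch below illustrates that idea only; it is not the application's actual scheduler. The function name, the dictionary layout and the example project are assumptions, and the references are assumed to contain no cycles.

```python
def execution_order(target, references):
    """Return analysis names in run order, with referenced analyses first.

    references maps an analysis name to the list of analyses it reads from.
    Hypothetical structure; assumes the references contain no cycles.
    """
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        # Run everything this analysis references before the analysis itself.
        for referenced in references.get(name, []):
            visit(referenced)
        order.append(name)

    visit(target)
    return order


# Hypothetical project: a Build ADS analysis reads a Join analysis,
# which in turn reads a Variable Creation and a Sample analysis.
references = {
    "Build ADS": ["Join"],
    "Join": ["Variable Creation", "Sample"],
}
print(execution_order("Build ADS", references))
# ['Variable Creation', 'Sample', 'Join', 'Build ADS']
```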

One disadvantage of selecting for input an analysis that does not create a table or view (and therefore creates a volatile table) is that any values wizard designed to select column values out of the input table will typically not work. If values retrieval fails in this case, a message appears that explains why and provides suggestions. Another disadvantage is that using the output option to create a view (in the referencing analysis) is inappropriate, since the referenced volatile table will not be available if the view is accessed later in another context. A third disadvantage is that analyses that do not create a table or view cannot be selected for input by algorithm, matrix, scoring or Data Explorer analyses.

Referencing an analysis for input can also be used to embed the SQL generated by another analysis as a derived table in the referencing analysis. To achieve this result, both the referencing and the referenced analysis must be a Denorm, Join, Merge, Partition, Sample, Variable Creation or Build ADS analysis (i.e., one of the “selectable” analyses other than Variable Transformation). A Free Form SQL analysis may also serve as the referenced analysis, as described in more detail in Free Form SQL. Further, in the referenced analysis, the Output Storage option “Generate the SQL for this analysis, but do not execute it” must be set to true, while the option “Store the tabular output of this analysis in the database” must not be set to true. Although the net effect of executing the resulting derived table should be the same as having the referenced analysis create a volatile table, it may be preferable to generate the SQL with a derived table instead (possibly to use the SQL in another context). With this technique it is even possible to create nested derived tables.
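To make the difference between the two forms concrete, the sketch below builds SQL text both ways: wrapping a referenced query as a derived table, and materializing it as a volatile table. The helper functions and the table and column names (customers, transactions, cust_id, income, amount) are hypothetical, and the SQL that Teradata Warehouse Miner actually generates may look different.

```python
def as_derived_table(select_sql: str, alias: str) -> str:
    """Wrap a SELECT statement so it can appear in a FROM clause."""
    return f"({select_sql}) AS {alias}"


def as_volatile_table(select_sql: str, name: str) -> str:
    """Materialize the same SELECT as a session-scoped volatile table."""
    return (f"CREATE VOLATILE TABLE {name} AS ({select_sql}) "
            "WITH DATA ON COMMIT PRESERVE ROWS")


# Hypothetical SQL of an innermost referenced analysis.
inner_sql = "SELECT cust_id, income FROM customers"

# A referencing analysis embeds it as a derived table ...
join_sql = ("SELECT s.cust_id, t.amount FROM "
            + as_derived_table(inner_sql, "s")
            + " JOIN transactions t ON s.cust_id = t.cust_id")

# ... and a further referencing analysis embeds that one, giving a nested derived table.
nested_sql = ("SELECT j.cust_id, SUM(j.amount) AS total FROM "
              + as_derived_table(join_sql, "j")
              + " GROUP BY j.cust_id")

print(nested_sql)
print(as_volatile_table(inner_sql, "vt_inner"))
```

The nested statement is self-contained and can be run in another context, whereas a statement that reads vt_inner works only in a session where that volatile table has been created.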