Variable Creation - INPUT - Analysis Parameters - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 2ADS Generation

Product
Teradata Warehouse Miner
Release Number
5.4.4
Published
July 2017
Language
English (United States)
Last Update
2018-05-03
dita:mapPath
fcf1492702067123.ditamap
dita:ditavalPath
ft:empty
dita:id
B035-2301
Product Category
Software
  1. On the Variable Creation dialog, click on INPUT.
  2. Click on analysis parameters.
  3. When the Basic Options tab is selected, the following screen appears.
    Variable Creation > Input > Analysis Parameters

  4. On this screen, select:
    • Group By Style
      • Group by anchor columns — Causes the anchor columns to be used as the Group By columns when one or more variables contain an aggregate function. When this is the case, all variables that don’t already contain an aggregate function are automatically changed to an aggregate by adding the MIN (minimum) function.
      • Group by all non-aggregate columns — Provides more control over the grouping characteristics of the request with the following effects.
        • A group-by clause is generated whether or not aggregation is present.
        • Every non-aggregate column is included in the group-by clause.
        • An order-by clause is generated to match the group-by clause if the output style is select or explain.
        • Aggregation is not forced on non-anchor columns, but aggregation is forced on new-style windowed OLAP functions.
        • If new windowed OLAP functions are present, positional numbers are used in the group by clause for correct syntax.
        • Old-style Teradata OLAP functions may not be used with this option.
      • User specified — Inhibits all automatic generation of a Group By clause, relying instead on that specified on the expert options tab.
    • Analysis Represents
      • Table (or derived table) — Indicates that this analysis represents either a table or derived table (a Select statement embedded in the FROM clause of another Select statement).
      • Subquery — Indicates that this analysis is referenced in another Variable Creation analysis in a Subquery element in an expert clause. Therefore, the output style of this analysis must be to select the data and not store it in the database. The analysis that references this one as a subquery may store the output, however.

        If the output option to Generate the SQL for this analysis but do not execute it is checked on the Output - storage tab, then the SQL generated for this analysis is placed directly into the SQL generated for the analysis that references this one. If however this option is not set, and the analysis is not being refreshed or published, the subquery in the referencing analysis will consist of selecting all columns from the volatile table created by this analysis.

        If this is to be a correlated subquery, the Generate the SQL for this analysis but do not execute it option must be set.

      • With seed query — Indicates that a column (any column) created by this analysis is referenced in another Variable Creation analysis in the folder under a With (Recursive) Query SQL element under the With (Recursive) Clause on the expert options tab. As a seed query, it may represent part or all of a With Query or select some or all of the starting rows for a With Recursive Query (typically from the table being analyzed, in either case). Note that if this is the first or only seed query referenced, it establishes with its analysis name and selected columns the name and select list of the With Query or With Recursive Query.

        If the output option to Generate the SQL for this analysis but do not execute it is checked on the Output - storage tab, then the SQL generated for this analysis is placed directly into the SQL generated for the analysis that references this one. If, however, this option is not set, and the base query is not being refreshed or published, the seed query in the referencing analysis will consist of selecting all columns from the volatile table created by this analysis.

      • With recursive query (Applicable to the Teradata Database) — Indicates that this analysis is referenced in another Variable Creation analysis in a With (Recursive) Query SQL element under the With (Recursive) Clause on the expert options tab. As a recursive query it forms part of a With Recursive Query. It should join the first seed query with one or more tables being analyzed by referencing a column from the seed query analysis. Note that if the join condition or conditions are specified in the Where Clause, the join style(s) under Join Paths on the anchor table tab may be set to Omit for a simpler looking query.

        As a recursive query, it must have the output option checked to Generate the SQL for this analysis but do not execute it on the Output - storage tab.

    • Special Options
      • Select Distinct — When checked, causes the Select statement to begin with ‘SELECT DISTINCT’ rather than ‘SELECT’ or ‘SELECT ALL’ (the default). Use of the Select Distinct option results in only one of each duplicate row being returned. For example, the statement ‘SELECT DISTINCT product_code FROM table_A’ will return one each of every distinct value of product_code rather than the product_code for every row in the table.

        With this option, if the Output storage option to create an output table or view is not selected, an ORDER BY clause including all non-aggregate columns will be used unless Group By Style is set to User Specified or an ORDER BY clause is specifically provided by the user.

      • Include Embedded Services Database (Applicable to Teradata Database) — Embedded Services functions are system-supplied User Defined Functions. They are made available in the “Embedded Services” category of SQL elements. By default, the database wherein a function resides is not used to qualify the function name. When this option is checked, however, a reference to one of these functions is qualified with the name of the database that contains it, usually “TDSYSFNLIB”.

        Some functions that are not in the “Embedded Services” category of SQL elements also use this option, such as ARRAY_TO_JSON, GeoJSONFromGeom, GeomFromGeoJSON, JSON_CHECK, JSON_AGG and JSON_COMPOSE.

      • Creates Score Table — When checked, identifies this analysis when refreshing or publishing as an analysis that creates a score table. Also, additional validation is performed to ensure that one and only one input table is present in this analysis, and to ensure that it does not appear in a chain of refreshed or published analyses unless it is the last analysis in the chain (i.e., the one directly referenced).

        When using this option, if input is taken from a referenced analysis that does not create a permanent output table, and this analysis is published using the Publish analysis, users of the Model Manager application will not be able to successfully execute just the scoring SQL without also executing the SQL to build the Analytic Data Set. Note further that if this scoring analysis is published, it must not contain references to a target date or to literal parameters, because the Model Manager application does not expect to find these elements in an analysis that builds a score table.

      • Normalize Option (applicable to the Teradata Database) — The following is a table of possible values for this option. Note that this option is initially available in Teradata 14.10 and will not be visible on the analysis parameters tab if the Teradata release in use is prior to 14.10.
        Normalize Options
        NORMALIZE Description
        (none) The option is not used. This is equivalent to SELECT ALL, or SELECT DISTINCT if that option is in use.
        NORMALIZE This is a modifier of the SELECT keyword that causes the returned rows to be collapsed or combined according to the first selected Period Date or Period Timestamp column, while partitioning by the other selected columns. Rows are combined only when the periods “meet” or “overlap” and the value of the combined period is akin to the union of the date or timestamp periods.
        NORMALIZE ON MEETS OR OVERLAPS This is the same as just specifying NORMALIZE.
        NORMALIZE ON OVERLAPS In this case, combining occurs only when the periods “overlap”.
        NORMALIZE ON OVERLAPS OR MEETS This is the same as just specifying NORMALIZE.
      • Table Alias Prefix — A Table Alias Prefix should be specified in a Variable Creation analysis that contains a subquery intended to be a correlated subquery. Since table alias prefixes are ordinarily assigned automatically, using the Table Alias Prefix option makes it possible to uniquely reference an input table in this analysis from another analysis containing a subquery.

        When a Table Alias Prefix is specified, the tables selected will have table aliases formed by adding an integer starting at 0 to the Table Alias Prefix. For example, if a prefix of ‘T’ is entered, the first table that a column is selected from will have a table alias of ‘T0’ (as opposed to the usual ‘_twmVC0’).

        Finally, a Table Alias Prefix may be specified solely for aesthetic purposes if desired, even if a subquery is not present. Or, if a correlated subquery is present, it may be desirable to assign a Table Alias Prefix in the subquery analysis as well as in the containing analysis where it is required, again for aesthetic purposes.

    • Analysis References
      • Depends On Analysis — When an analysis is selected from this pull-down list, it is executed before the current analysis. This allows one analysis to refer to another one without selecting an output column from the referenced analysis. One use of this option might be when the Run Units defined in one Variable Creation analysis must be executed before the current Variable Creation analysis. Another use of this option might be to merge the SQL of two Variable Creation analyses with a Set Operator, most typically a UNION operator. Refer to the Set Operator description below.
      • Set Operator — When this option is utilized, the SQL generated by the current Variable Creation analysis is combined with the SQL generated by the “Depends On Analysis” described above by way of a Set Operator such as UNION. This is an alternative to the use of the Merge analysis in the Reorganization category of functions.

        The choices for Set Operator include UNION, UNION ALL, INTERSECT, INTERSECT ALL, EXCEPT and EXCEPT ALL, where the keyword ALL simply means to retain duplicate rows in the result set. The UNION keyword indicates that rows output from both the first and second analysis should be included in the answer set. INTERSECT is used to indicate that only rows included in both input data sets should be included in the output data set. And finally, EXCEPT is used to indicate that all rows from the first data set should be included in the output data set except those occurring in the second data set.

        When a Set Operator is specified with this option, the “Depends On Analysis” must be a Variable Creation analysis with Output options to “Select” data and “Generate SQL Only”, as well as not including a WITH query. When this option is used, not only on the current analysis but also on the “Depends On Analysis”, a chain of SELECT statements separated by Set Operators can be created.
        Set Operators have precedence rules such that INTERSECT operators are executed before UNION and EXCEPT operators, and the use of parentheses that can override this are not supported by this feature.
  5. When the Temporal Qualifiers tab is selected, the following screen appears.
    Variable Creation > Input > Analysis Parameters: Temporal Qualifiers tab

    • Temporal Qualifiers (applicable to the Teradata Database) — This sub-panel is displayed only when Teradata version 13.10 or later is in use. It is useful only when one or more of the input tables are temporal tables and it is desired to specify temporal qualifiers when selecting from them. Depending on the qualifiers used, the resulting table may or may not be a temporal table, though generally it will not be. If it is, an appropriate column will be added to the result set to make it a temporal table.

      This is the only way to create a temporal table with a Variable Creation analysis due to a limitation of the Create-Table-As syntax that a VALIDTIME or TRANSACTIONTIME attribute cannot be specified using this syntax.

      At the bottom of the Temporal Qualifiers sub-panel is a selector ranging from 1 to the number of input tables plus 1. When this selector is at 1, the label to the right will indicate Select Qualifier, meaning that the qualifiers selected above it will precede the SELECT key word in the query built by the Variable Creation analysis.

      When this selector is at 2 or higher, the label to the right will indicate Table Qualifier and the name and location of the input table or the analysis name associated with a derived table. The placement of the above qualifiers in this case will be in the FROM clause of the generated query, with the exact placement dependent on whether it is the first input table, a joined table or a derived table.

      When using any of the selectors described below, if a selection is made that contains an expression, the Edit button may be used to display a replica of the Variables panel for defining an SQL expression, the text of which will be displayed in a read-only area underneath the Edit button. If the Qualify Columns check box is checked, any columns contained in these expressions will be qualified by their containing database and table names.

      • Validtime — When one or more of the input tables is a Validtime temporal table and Select Qualifier is indicated below, one of the following temporal qualifiers may be selected.
        • None
        • CURRENT VALIDTIME
        • VALIDTIME AS OF Date/Timestamp Expression
        • SEQUENCED VALIDTIME Period Expression
        • NONSEQUENCED VALIDTIME Period Expression
        When Table Qualifier is selected at the bottom of the sub-panel, one of the following temporal qualifiers may be selected.
        • None
        • VALIDTIME AS OF Date/Timestamp Expression
      • Transactiontime (applicable to Teradata Database) — When one or more of the input tables is a Transactiontime temporal table and Select Qualifier is indicated below, one of the following temporal qualifiers may be selected.
        • None
        • CURRENT TRANSACTION-TIME
        • TRANSACTIONTIME AS OF Date/Timestamp Expression
        • SEQUENCED TRANSACTIONTIME (only if output is to a View)
        • NONSEQUENCED TRANSACTIONTIME
        When Table Qualifier is selected at the bottom of the sub-panel, one of the following temporal qualifiers may be selected.
        • None
        • TRANSACTIONTIME AS OF Date/Timestamp Expression
      • As Of Date/Timestamp (applicable to the Teradata Database) — Whether Select Qualifier or Table Qualifier is selected at the bottom of the sub-panel, one of the following temporal qualifiers may be selected. Note that if an AS OF qualifier is selected here, neither a Validtime or Transactiontime qualifier may be selected for this Select statement or Table.
        • None
        • AS OF Date/Timestamp Expression