Creating a New or Modifying an Existing Data Explorer Analysis - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 1Introduction and Profiling

Product
Teradata Warehouse Miner
Release Number
5.4.4
Published
July 2017
Language
English (United States)
Last Update
2018-05-03
dita:mapPath
wbc1492033894304.ditamap
dita:ditavalPath
ft:empty
dita:id
B035-2300
Product Category
Software

In order to create a new or modify an existing Data Explorer analysis, the following analysis properties are defined.

Analysis Properties

  • Type — “Data Explorer” (needed only if “new” is “true”)
  • Name — the name of the new Data Explorer analysis or the name of an existing data explorer analysis to modify
  • New — “true” (needed to define a new Data Explorer analysis)
  • Modify — “true” (needed to modify an existing Data Explorer analysis)

InputDataProperties need to be defined if this is a new analysis. InputDataProperties can take a list of tables (multi-table) or a list of columns. If the analysis is being modified, the InputDataProperties can be redefined. They will replace the existing set of tables or columns that were originally defined for the analysis.

If multi-table input is used, the Table Properties need to define the name of the tables and databases and an optional where clause associated with each table. For each table to be defined, the following table attributes are available.

Input Data Properties (multi-table input)

  • Name — the name of the existing table
  • Database — the name of the database that the table is defined in
  • WhereClause (optional) — the value of the where clause associated with the database/table

An XML example to define multi-tables for a new analysis follows.

<InputDataProperties>
	<Tables>
		<Table name="twm_customer" database="twm_source"/>
		<Table name="twm_checking_acct" database="twm_source" 
			WhereClause="cust_id > 0"/>
	</Tables>
</InputDataProperties>

If column input is used, the Column Properties need to define the name of the tables and databases. For each column to be defined, the following column attributes follow.

Input Data Properties (column input)

  • Name — the name of the column
  • Table — the name of the table associated with the column
  • Database — the name of the database that the table is defined in

An XML example to define columns for a new analysis follows.

<InputDataProperties>
	<Columns>
		<Column name="age" table="twm_customer" database="twm_source"/>
		<Column name="income" table="twm_customer" database="twm_source"/>
		<Column name="income" table="twm_customer_analysis" database="twm_source"/>
	</Columns>
</InputDataProperties>

Input Data Analysis properties can be specified to override the default values for a new Data Explorer analysis or to modify the values of an existing Data Explorer analysis.

Input Data Analysis Properties

  • ComputeUniques — “true” to get the unique values for the values analysis. Default value is “false”.
  • StatisticsOptions — “minimumSet” or “all” for the statistics analysis. Default is “minimumSet”.
  • StatisticalMethod — “population” or “sample” used for the statistics analysis. Default is “population”.
  • HistogramStyle — “bins” or “quantiles” used for the histogram analysis. Default is “bins”.

An XML example to define InputDataAnalysis properties follows.

<InputDataAnalysisProperties
	computeUniques="true"
	statisticsOptions="minimumSet"
	statisticalMethod="population"
	histogramStyle="bins"/>

Output properties can be specified to override the default names of the output tables for the Data Explorer analysis, as well as defining or modifying other output properties.

Output Properties

  • OutputDatabase — name of existing database
  • OutputFrequencyTableName — name of frequency output table to be created
  • OutputHistogramTableName — name of histogram output table to be created
  • OutputStatisticsTableName — name of statistics output table to be created
  • OutputValuesTableName — name of values output table to be created
  • Fallback — “true” or “false” (default is false)
  • Restart — “true” or “false” (default is false)
  • StoredProcedureEnabled — “true” or “false” (default is false)
  • ProcedureName — name of stored procedure to be created after execution, if StoredProcedureEnabled is set to true
  • ProcedureComment — comment associated with stored procedure to be created after execution, if StoredProcedureEnabled is set to true
  • AdvertiseOutput — “true” or “false”
  • AdvertiseNote — optional comment or note for advertised output (only used if AdvertiseOutput is “true” or if the Connection property AlwaysAdvertise is “true”)

An XML example to define the OutputProperties follows.

<OutputProperties
	outputDatabase="twm_results"
	outputFrequencyTableName="MyFreq"
	outputHistogramTableName="MyHisto"
	outputStatisticsTableName="MyStats"
	outputValuesTableName="MyValues"/>

Expert Properties

For the Data Explorer analysis, the expert where clauses are defined in a separate XML structure when using column input for the analysis. Where clauses take the following attributes and values:

  • Table — name of table associated with an input table defined for a column in the XML Columns structure
  • Database — name of database that the table is defined in
  • WhereClause — the where clause to be used

An XML example to define the Expert option where clause(s) follows.

<ExpertProperties>
	<WhereClauses>
		<WhereClause table="twm_customer" database="twm_source" where="cust_id >" />
		 <WhereClause table="twm_customer_analysis" database="twm_source" 
			where="age > 50" />
	</WhereClauses>
 </ExpertProperties>

Sample XML Definition for a New Data Explorer Analysis

An XML example defining a new Data Explorer analysis using Multi-table input follows.

<Analysis name="My New DataExplorer" type="Data Explorer" new="true">
	<InputDataProperties>
		<Tables>
			<Table name="twm_customer" database="twm_source"/>
			<Table name="twm_checking_acct" database="twm_source" 
				WhereClause="cust_id > 0"/>
		</Tables>
	</InputDataProperties>
	<InputDataAnalysisProperties
		computeUniques="true"
		statisticsOptions="minimumSet"
		statisticalMethod="population"
		histogramStyle="bins"/>
	<OutputProperties
		outputDatabase="twm_results"
		outputFrequencyTableName="MyFreq"
		outputHistogramTableName="MyHisto"
		outputStatisticsTableName="MyStats"
		outputValuesTableName="MyValues"/>
</Analysis> 

Sample XML Definition to Modify an Existing Data Explorer Analysis

<Analysis name="My data explore Columns" type="Data Explorer" modify="true">
	<InputDataProperties>
		<Columns>
			<Column name="nbr_children" table="twm_customer" database="twm_source"/>
			<Column name="age" table="twm_customer_analysis" database="twm_source"/>
		</Columns>
	</InputDataProperties>
	<InputDataAnalysisProperties
		computeUniques="true"
		statisticsOptions="a;;"
		statisticalMethod="sample"
		histogramStyle="quantiles"/>				
	<OutputProperties
		OutputDatabase="twm_results"
		OutputFrequencyTableName="My Freq5"
		OutputHistogramTableName="M yHisto5"
		OutputStatisticsTableName="My Stats5"
		OutputValuesTableName="My Values5"/>
	<ExpertProperties>
		<WhereClauses>
			 <WhereClause table="twm_customer_analysis" database="twm_source" 
				where="age > 50" />
		 </WhereClauses>
	</ExpertProperties>
</Analysis>