Creating a New or Modifying an Existing Statistics Analysis - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 1: Introduction and Profiling

Product
Teradata Warehouse Miner
Release Number
5.4.4
Published
July 2017
Language
English (United States)
Last Update
2018-05-03
dita:id
B035-2300
Product Category
Software

To create a new Statistics analysis or modify an existing one, define the following analysis properties:

Analysis Properties

  • Type — “Statistics” (needed only if “new” is “true”)
  • Name — the name of the new Statistics analysis or the name of an existing Statistics analysis to modify
  • New — “true” (needed to define a new Statistics analysis)
  • Modify — “true” (needed to modify an existing Statistics analysis)

InputDataProperties must be defined for a “new” analysis. It takes a database, a table, and a list of numeric or date type columns. When an existing analysis is modified, InputDataProperties can be redefined; the new definition replaces the set of columns originally defined for the analysis.
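As a sketch of the modify case, the fragment below redefines the columns of an existing analysis. The attribute spelling modify="true" is an assumption modeled on the new="true" attribute used in the sample definition in this topic, and the analysis name and column are illustrative only.

```xml
<!-- Assumption: modify="true" mirrors the documented new="true" attribute;
     type is omitted because it is needed only for a new analysis. -->
<Analysis name="MyStatistics" modify="true">
	<InputDataProperties
		Database="twm_source"
		Table="twm_customer">
		<Columns>
			<!-- This list replaces the columns originally defined -->
			<Column name="income"/>
		</Columns>
	</InputDataProperties>
</Analysis>
```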

Column Input Data

  • Database — the name of the database
  • Table — the name of the table
  • Columns — a list of numeric or date type column names
    • Name — the name of the column

An XML example to define columns for a new analysis follows.

<InputDataProperties
	Database="twm_source"
	Table="twm_customer">
	<Columns>
		<Column name="age"/>
		<Column name="income"/>
	</Columns>
</InputDataProperties>

GroupBy Column Input Data (optional)

  • GroupBy Columns
    • Name — the name of the group by column

An XML example to define Columns and GroupBy Columns follows.

<InputDataProperties
	Database="twm_source"
	Table="twm_customer">
	<Columns>
		<Column name="age"/>
		<Column name="income"/>
	</Columns>
	<GroupByColumns>
		<GroupByColumn name="nbr_children"/>
	</GroupByColumns>
</InputDataProperties>

Input Data Analysis Properties

  • BasicOptions — “None”, “MinimumOptions”, “AllOptions”. Default is “MinimumOptions”.

    To define specific statistics, one or more of the following options can be selected:

    “Minimum”, “Maximum”, “Mean”, “StandardDeviation”, “Skewness”, “Kurtosis”, “StandardError”, “CoefficientOfVariance”, “Variance”, “UncorrectedSumsOfSquares”, “CorrectedSumsOfSquares”

  • ExtendedOptions — “None”, “Values”, “Quantiles”, “Rank” or “Modes”. Default is “None”.
  • StatisticalMethod — “Population” or “Sample”. Default is “Population”.

An XML example to define InputDataAnalysis properties follows.

<InputDataAnalysisProperties
	basicOptions="allOptions"
	extendedOptions="values"
	statisticalMethod="sample"/>
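For comparison, the documented defaults can be written out explicitly. This is only a sketch: it assumes the values are accepted in the capitalization used in the property list above, which this topic does not confirm (its own example uses a lowercase initial letter).

```xml
<!-- Sketch: the documented default values, stated explicitly.
     Assumption: the capitalization from the property list is accepted. -->
<InputDataAnalysisProperties
	basicOptions="MinimumOptions"
	extendedOptions="None"
	statisticalMethod="Population"/>
```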

Output Properties

For the definition of output properties, see Modifying Output Batch Properties And Post Processing Properties.

Expert Properties

  • WhereClause — the where clause to be defined

An XML example to define Expert properties follows.

<ExpertProperties
	whereClause="age>50"/>
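Because the where clause is carried in an XML attribute, any “<” or “&” in the condition has to be written as an entity (“&lt;” or “&amp;”) for the definition to remain well-formed; “>” may be left as is. A minimal sketch, with a condition chosen only for illustration:

```xml
<!-- Illustrative condition "age < 50"; the "<" is escaped as &lt; -->
<ExpertProperties
	whereClause="age &lt; 50"/>
```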

Sample XML Definition for a New Statistics Analysis

<Analysis name="MyStatistics" type="Statistics" new="true">
	<InputDataProperties
		Database="twm_source"
		Table="twm_customer">
		<Columns>
			<Column name="age"/>
			<Column name="income"/>
		</Columns>
		<GroupByColumns>
			<GroupByColumn name="nbr_children"/>
		</GroupByColumns>
	</InputDataProperties>
	<InputDataAnalysisProperties
		basicOptions="allOptions"
		extendedOptions="values"
		statisticalMethod="sample"/>
	<OutputProperties
		outputStyle="CreateTable"
		outputDatabase="twm_results"
		outputName="MyStatisticsOutput">
	</OutputProperties>
</Analysis>