In order to create a new or modify an existing Histogram analysis, the following analysis properties are defined:
Analysis Properties
- Type — “Histogram” (needed only if “new” is “true”)
- Name — the name of the new Histogram analysis or the name of an existing Histogram to modify
- New — “true” (needed to define a new Histogram analysis)
- Modify — “true” (needed to modify an existing Histogram analysis)
InputDataProperties needs to be defined if this is a “new” analysis. InputDataProperties takes a database, table, and a list of various column definitions, depending on the type of histogram desired. Histogram takes 5 types of column input. They define a collection of bins, quantiles, widths, boundaries, or bins with minimum and maximum values.
Column Input Data
- Database (required for any histogram type) — the name of the database
- Table (required for any histogram type) — the name of the table
The following defines each type of Histogram input:
BinColumns — a list of bin column names and their bin values
-
BinColumn
- Name — the name of the column
- Bins — the number of bins
An XML example to define bin columns follows.
<InputDataProperties> Database="twm_source" Table="twm_customer"> <BinColumns> <BinColumn name="age" bins="5"/> <BinColumn name="income" bins="3"/> </BinColumns> </InputDataProperties>
QuantileColumns — a list of bin column names and their bin values
-
QuantileColumn
- Name — the name of the column
- Quantiles — the number of quantiles
An XML example to define quantile columns follows.
<InputDataProperties table="twm_customer" database="twm_source"> <QuantileColumns> <QuantileColumn name="age" quantiles="5"/> <QuantileColumn name="income" quantiles="3"/> </QuantileColumns> </InputDataProperties>
WidthColumns — a list of column names and their width values
-
WidthColumn
- Name — the name of the column
- Width — the width values
An XML example to define width columns follows.
<InputDataProperties Database="twm_source" Table="twm_customer"> <WidthColumns> <WidthColumn name="age" width="10"/> <WidthColumn name="income" width="10000"/> </WidthColumns> </InputDataProperties>
BoundaryColumns — a list of column names and the list of boundary values for each column
-
BoundaryColumn
- Name — the name of the column
-
Boundaries
-
Boundary
- Value — the value of each boundary
-
Boundary
An XML example to define boundary columns follows.
<InputDataProperties Database="twm_source" Table="twm_customer"> <BoundaryColumns> <BoundaryColumn name="age"> <Boundaries> <Boundary Value="20"/> <Boundary Value="35"/> <Boundary Value="75"/> </Boundaries> </BoundaryColumn> <BoundaryColumn name="nbr_children"> <Boundaries> <Boundary Value="0"/> <Boundary Value="2"/> <Boundary Value="4"/> </Boundaries> </BoundaryColumn> </BoundaryColumns> </InputDataProperties>
BinsMinMaxColumns — a list of column names that includes the bins and minimum and maximum values
-
BinMinMaxColumn
- Name — the name of the column
- Bins — the number of bins
- MinimumValue — the minimum value to be binned for the column
- MaximumValue — the maximum value to be binned for the column
An XML example to define boundary columns follows:
<InputDataProperties Database="twm_source" Table="twm_customer"> <BinsMinMaxColumns> <binsMinMaxColumn name="age" bins="6" minimumValue="10" maximumValue="80"/> <binsMinMaxColumn name="income" bins="5"minimumValue="10000" maximumValue="100000"/> </BinsMinMaxColumns> </InputDataProperties>
An XML example to define BinColumns, StatisticsColumns and OverlayColumns for a new analysis follows.
<InputDataProperties> Database="twm_source" Table="twm_customer"> <BinColumns> <BinColumn name="age" bins="5"/> <BinColumn name="income" bins="3"/> </BinColumns> <StatisticsColumns> <StatisticsColumn name="nbr_children"/> <StatisticsColumn name="years_with_bank"/> </StatisticsColumns> <OverlayColumns> <OverlayColumn name="income"/> </OverlayColumns> </InputDataProperties>
Statistics Column Input Data (optional)
-
Statistics Columns — a list of column names of numeric or date type
- Name — the name of the column to collect statistics on
Overlay Column Input Data (optional)
-
Overlay Columns — a list of column names that cannot include any of the histogram binning columns
- Name — the name of the overlay column
Input Data Analysis Properties
- CrossTabulateBins — “true” or “false”. Default is false.
An XML example to define InputDataAnalysis properties follows.
<InputDataAnalysisProperties crossTabulateBins=”true”/>
Output Properties
For the definition of output properties, see Modifying Output Batch Properties And Post Processing Properties.
Expert Properties
- WhereClause — the where clause to be defined
An XML example to define Expert properties follows.
<ExpertProperties> whereClause=”age>50”/>
Sample XML Definition for a New Histogram Analysis
<Analysis name="MyHistogram" type="Histogram" new="true"> <InputDataProperties> Database="twm_source" Table="twm_customer"> <BinColumns> <BinColumn name="age" bins="5"/> <BinColumn name="income" bins="3"/> </BinColumns> <StatisticsColumns> <StatisticsColumn name="nbr_children"/> <StatisticsColumn name="years_with_bank"/> </StatisticsColumns> <OverlayColumns> <OveralyColumn name="income"/> </OverlayColumns> </InputDataProperties> <InputDataAnalysisProperties CrosstabulateBins=”true”/> <OutputProperties outputStyle="CreateTable" outputDatabase="twm_results" outpuName="MyHistogramOutput" </OutputProperties > </Analysis>