Adaptive Histogram - INPUT - Analysis Parameters - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 1: Introduction and Profiling

Product
Teradata Warehouse Miner
Release Number
5.4.5
Published
February 2018
Language
English (United States)
Last Update
2018-05-03
dita:id
B035-2300
Product Category
Software
  1. On the Adaptive Histogram dialog box, click INPUT.
  2. Click analysis parameters.
    Adaptive Histogram > Input > Analysis Parameters

  3. On this screen, select:
    • Adaptive Histogram Options
      • Spike Threshold — A percentage of rows, expressed as an integer (1 to 100), at or above which an individual value of a variable is identified as a separate bin. The default is 10 (that is, 10% of the total number of rows). A value that accounts for this percentage of rows or more is identified as a Spike.
      • Subdivision Threshold — A percentage of rows, expressed as an integer (0 to 100), at or above which a bin is subdivided into sub-bins. The default is 30 (that is, 30% of the total number of rows). A bin that holds this percentage of rows or more is subdivided into sub-bins using an algorithm that uses means and standard deviations.
    • Subdivision Method
      • Means — Option to subdivide overpopulated bins using means and standard deviations.
      • Quantiles — Option to subdivide overpopulated bins using quantiles.
    • Bin Values for Selected Columns — Each column selected for the Adaptive Histogram analysis appears in this list with its default bin values, which depend on the Bin Style selected. The following appears next to Column Name:
      • Bins — If Bins is selected, the default of 10 appears next to each selected column as the number of bins to generate. Click the Change… button to set a different number of equal-sized data bins. The entry must be an integer greater than 0.
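
The following Python sketch illustrates how the Spike Threshold, Subdivision Threshold, and Bins settings described above might interact. It is not the product's implementation. The function names (`find_spikes`, `equal_width_bins`, `bins_to_subdivide`) are hypothetical, "equal sized" is assumed here to mean equal width, and removing spike values before binning is also an assumption made for the example.

```python
from collections import Counter


def find_spikes(values, spike_threshold=10):
    """Return {value: row_count} for values whose share of all rows
    is at or above spike_threshold percent (default 10)."""
    total = len(values)
    counts = Counter(values)
    return {v: n for v, n in counts.items()
            if 100 * n >= spike_threshold * total}


def equal_width_bins(values, n_bins=10):
    """Count rows in n_bins equal-width bins spanning min..max.
    Equal width is an assumed reading of 'equal sized' bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be an integer greater than 0")
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 when every value is identical.
    width = (hi - lo) / n_bins or 1
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


def bins_to_subdivide(bin_counts, total_rows, subdivision_threshold=30):
    """Return indexes of bins holding at least subdivision_threshold
    percent (default 30) of the total number of rows."""
    return [i for i, n in enumerate(bin_counts)
            if 100 * n >= subdivision_threshold * total_rows]
```

A typical flow under these assumptions would call `find_spikes` on a column, bin the remaining values with `equal_width_bins`, and then pass the resulting counts to `bins_to_subdivide` to see which bins would be candidates for the Means or Quantiles subdivision method.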