Running a Histogram Analysis - Teradata Warehouse Miner

Teradata® Profile Plug-in User Guide

Product
Teradata Warehouse Miner
Release Number
5.4.6
Published
November 2018
Language
English (United States)
Last Update
2018-12-07
dita:id
B035-2304
Product Category
Software
  1. Open a histogram analysis:
    1. In the Profile Project Explorer, right-click a histogram analysis.
    2. Select Open.
  2. Configure the data columns to be included:
    1. Select a database.
    2. Select a table.
    3. Select one or more columns to profile from the Available Columns list.
  3. [Optional] On the Overlay tab, select the columns used to subdivide each bin. An overlay column is typically a categorical variable with only a few values. If an overlay column is specified, frequencies within each bin are calculated for each value of that overlay column. A specific column can be used in either Overlay Columns or Statistic Columns, but not both.
  4. [Optional] On the Statistics tab, select the columns for which to calculate basic statistics. A specific column can be used in either Overlay Columns or Statistic Columns, but not both.
  5. Configure the expert options:
    1. Click the expert options icon.
    2. [Optional] Enter a Where Clause to restrict rows selected by the analysis.
    3. Click OK to apply the Where Clause to the analysis.
  6. Configure the output options:
    1. Click the output options icon.
    2. [Optional] To save the results to the Teradata Database, select Store Analysis in Database.
    3. Enter the Database and Table name of the results table to be created.
    4. Click OK.
  7. Configure the histogram options:
    1. Click the histogram options icon.
    2. Select the Bin Style and enter the bin values for the columns.
      Style                   Values
      Bin                     The number of equal-sized data bins. By default, 10 bins are derived for each column.
      Widths                  The width of each bin.
      Quantiles               The number of bins, each containing a nearly equal number of values. By default, 10 bins are derived for each column.
      Boundaries              The list of boundaries at which each bin starts. The final value is the end of the last bin. Bin 0 is generated if necessary to contain data values less than the first boundary specified. Bin N+1 is generated if necessary for data values greater than the final boundary value.
      Bins with Boundaries    The number of desired equal-sized data bins, along with minimum and maximum values. By default, 10 bins are derived for each column. Bin 0 is generated if necessary to contain data values less than the minimum specified. Bin N+1 is generated if necessary for data values greater than the maximum.

    3. Click OK.
  8. Run the analysis by clicking the Run icon. The results appear in the Results View.
  9. Load the saved results of the analysis by clicking the Load icon. To ensure that the results match the analysis, the Load icon is enabled only if the analysis was executed after it was saved.
  10. Save the analysis by clicking the Save icon. The analysis definition is saved into the metadata database for this connection.
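
The bin styles in step 7 and the overlay option in step 3 can be illustrated with a short Python sketch. This is not how Teradata Warehouse Miner computes its histograms (the product generates SQL that runs in the database); it only mirrors the behavior described above. All function names are hypothetical. How values that fall exactly on a boundary are assigned, and how ties are split between quantile bins, are assumptions made for this example.

```python
from bisect import bisect_right
from collections import Counter


def equal_width_edges(lo, hi, n_bins=10):
    """Edges for n_bins equal-sized bins spanning lo..hi
    (the Bin and Bins with Boundaries styles)."""
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins)] + [hi]


def width_edges(lo, hi, width):
    """Edges for bins of a fixed width, starting at lo and covering hi
    (the Widths style)."""
    if width <= 0:
        raise ValueError("width must be positive")
    edges = [lo]
    i = 1
    while edges[-1] < hi:
        edges.append(lo + i * width)
        i += 1
    return edges


def quantile_bins(values, n_bins=10):
    """Split the sorted values into n_bins groups of nearly equal size
    (the Quantiles style). Tie handling is an assumption."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // n_bins:(i + 1) * n // n_bins] for i in range(n_bins)]


def bin_index(value, edges):
    """Return the 1-based bin number for value.

    Bin 0 holds values below the first edge and bin N+1 holds values
    above the last edge, as described for the boundary-based styles.
    """
    n_bins = len(edges) - 1
    if value < edges[0]:
        return 0
    if value > edges[-1]:
        return n_bins + 1
    # Assumption: each bin includes its lower edge, and a value equal to
    # the final edge is placed in the last bin.
    return min(bisect_right(edges, value), n_bins)


def histogram(values, edges, overlay=None):
    """Count values per bin; with an overlay, count per (bin, overlay value)."""
    if overlay is None:
        return Counter(bin_index(v, edges) for v in values)
    return Counter((bin_index(v, edges), o) for v, o in zip(values, overlay))


# Made-up sample data: a numeric column and a categorical overlay column.
ages = [5, 12, 18, 25, 33, 41, 47, 58, 64, 90]
segment = ["A", "A", "B", "A", "B", "B", "A", "B", "A", "B"]

# Bins with Boundaries: 5 bins between a minimum of 10 and a maximum of 60.
edges = equal_width_edges(10, 60, n_bins=5)
counts = histogram(ages, edges)
overlay_counts = histogram(ages, edges, overlay=segment)
```

In this sketch, the value 5 falls into bin 0 because it is below the minimum of 10, and the values 64 and 90 fall into bin 6 (N+1) because they are above the maximum of 60. With the overlay supplied, each bin's count is split by the overlay value, which corresponds to the per-value frequencies described for the Overlay tab.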