Data Explorer

Teradata Warehouse Miner User Guide - Volume 1 Introduction and Profiling

brand: Software
prodname: Teradata Warehouse Miner
vrm_release: 5.4.4
category: User Guide
featnum: B035-2300-077K

The Data Explorer performs basic statistical analysis on a set of selected tables or on selected columns from selected tables in one or more databases. It stores results from four fundamental types of analysis based on simplified versions of the Descriptive Statistics analyses:
  • Values
  • Statistics
  • Frequency
  • Histogram

One answer table is produced for each requested type of analysis. The output includes the requested table names and column names, so that results from multiple tables can be stored in the same answer table.

Each analysis can be selected individually, with the following exceptions:
  1. If Frequency is selected, Values must be selected.
  2. If Histogram is selected, Values must be selected, and Statistics must be selected with Count, Minimum, Maximum, Mean, and Standard Deviation included.
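
The two dependency rules above can be sketched as a small validation routine. This is only an illustration written in Python; the function name `validate_selection`, the lowercase option names, and the returned message strings are hypothetical and are not part of Teradata Warehouse Miner.

```python
# Illustrative sketch only: all names here are assumptions, not a product API.

# Statistics that rule 2 requires when Histogram is selected.
REQUIRED_STATS_FOR_HISTOGRAM = {
    "count", "minimum", "maximum", "mean", "standard_deviation",
}


def validate_selection(analyses, statistics=frozenset()):
    """Return a list of rule violations; an empty list means the selection is valid.

    analyses   -- names of the selected analysis types, for example
                  {"values", "statistics", "frequency", "histogram"}
    statistics -- names of the individual statistics selected for Statistics
    """
    analyses = set(analyses)
    statistics = set(statistics)
    problems = []

    # Rule 1: Frequency requires Values.
    if "frequency" in analyses and "values" not in analyses:
        problems.append("Frequency requires Values")

    # Rule 2: Histogram requires Values and Statistics, with five specific statistics.
    if "histogram" in analyses:
        if "values" not in analyses:
            problems.append("Histogram requires Values")
        if "statistics" not in analyses:
            problems.append("Histogram requires Statistics")
        missing = REQUIRED_STATS_FOR_HISTOGRAM - statistics
        if missing:
            problems.append(
                "Histogram requires statistics: " + ", ".join(sorted(missing))
            )

    return problems


if __name__ == "__main__":
    # Histogram selected without Values or Statistics: every rule 2 check fails.
    for problem in validate_selection({"histogram"}):
        print(problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every unmet requirement at once.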

Data Explorer includes intelligence about which functions should be performed on which columns, with decisions based partly on column type and partly on results obtained. It also includes performance enhancements resulting in minimal passes on the input data. You may also specify a separate SQL Where Clause to apply to each of the input tables selected for analysis.

The normal Data Explorer processing scheme is outlined below. Note that underlined values in the following topics are thresholds that the user can set. The program first builds up to four output tables, then applies the steps below to each requested input table, one at a time. If parallel processing is requested, however, the tables are in effect processed n at a time, where n is the number of tables to process in parallel: the program establishes n threads and performs the steps below for each input table in a separate thread until all tables are processed.
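
The parallel scheme described above (n threads, each applying the per-table steps to one input table at a time until none remain) resembles a fixed-size thread pool. The following Python sketch illustrates only that general pattern; `explore_table` and `explore_all` are hypothetical names, the per-table work is a placeholder, and nothing here reflects how the product is actually implemented.

```python
# Illustrative sketch only: a fixed-size thread pool working through a list of tables.
from concurrent.futures import ThreadPoolExecutor


def explore_table(table_name):
    """Placeholder for the per-table steps (hypothetical)."""
    return f"{table_name}: explored"


def explore_all(tables, n_parallel=1):
    """Process every table, at most n_parallel at a time.

    With n_parallel=1 the tables are handled one at a time, matching the
    normal scheme; a larger value runs that many worker threads.
    """
    with ThreadPoolExecutor(max_workers=max(1, n_parallel)) as pool:
        # map() hands each table to a free worker thread and yields the
        # results in the same order as the input list.
        return list(pool.map(explore_table, tables))


if __name__ == "__main__":
    for line in explore_all(["customer", "orders", "accounts"], n_parallel=2):
        print(line)
```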

For general information about output, see OUTPUT Tab.