The descriptive statistics available with Teradata Warehouse Miner provide a variety of functions to statistically analyze and explore a Teradata database. Descriptive statistical analysis is valuable for the following reasons.
 It can provide business insight in its own right.
 It uncovers data quality issues which, if not corrected or compensated for, can jeopardize the accuracy of any analytic models that are based on the data.
 It isolates the data that should be used in building analytic models. For example, outlying values should sometimes be excluded from a model; in other cases, these values might be required to solve a particular business problem. Further, some statistical processes used in analytic modeling require a certain type of distribution of data.
In the case of the Descriptive Statistics, NULL values are handled through the generated SQL’s aggregate functions. In this case, SQL ignores the NULL value and adjusts the number of observations in its calculation. This effectively provides a listwise deletion of NULL values.
The following are the descriptive statistical functions currently available in Teradata Warehouse Miner:
 Adaptive Histogram — Determine the distribution of a numeric column(s) giving counts, subbinning column(s) with higher counts and determining data spikes.
 Correlation Matrix — Build and view a correlation matrix.
 Data Explorer — Automated exploration of any number of tables or views within an entire database.
 Frequency — Compute frequency of column values or multicolumn combined values. Optionally, compute frequency of values for pairs of columns in a single column list or two column lists, and generate simple statistics for any other column within a table.
 Histogram — Determine the distribution of a numeric column(s) giving counts with optional overlay counts and statistics.
 Overlap — Count overlapping column values in combinations of tables (i.e., find “key” values in common between tables).
 Scatter Plot — Plot sampled values of two to three variables in 2D or 3D.

Statistical Analysis — Determine any of the following descriptive statistics for numeric column(s):
 Minimum Value
 Maximum Value
 Mean Value
 Standard Deviation
 Skewness
 Kurtosis
 Standard Mean Error
 Coefficient of Variance
 Variance
 Sum
 Uncorrected Sums of squares
 Corrected Sums of squares
 Values Count
 Modal Value
 Percentiles
 Top 5/Bottom 5 Rank and Values
 Text Field Analyzer — Analyze character data and help distinguish whether the field is a numeric type, a date, a time, a timestamp, or character data.

Values Analysis — Count the number of values of various kinds for a given column or columns, including:
 Number of Rows
 Rows with NonNULL Values
 Rows with NULL Values
 Unique Values
 Rows with Value ‘0’
 Rows with a Positive Value
 Rows with a Negative Value
 Rows Containing Blank Values
In order to add a Descriptive Statistical analysis to a Teradata Warehouse Miner Data Mining Project, create a new analysis with any of the mechanisms described in Using Teradata Warehouse Miner. This will produce the following dialog.
Add New Analysis dialog
Doubleclick or highlight the desired analysis, optionally change the default name and click OK. Each of these specific analyses are described in detail in the subsequent sections.