5.4.4 - Teradata Profiler Plug-In - Teradata Warehouse Miner

Teradata Profiler Plug-in User Guide

Teradata Warehouse Miner
Release Number: 5.4.4
July 2017
English (United States)
The Profiler's descriptive statistics offer a variety of functions to analyze and explore data tables in a Teradata database:
  • The Profiler's functions provide business insight.
  • The Profiler uncovers data quality issues that can jeopardize the accuracy of any models that are based on the data.
  • The Profiler isolates the data used in building analytic models. For example, outlying values may sometimes be excluded from a model; in other cases, these values might be required to solve a particular business problem.
  • Some processes used in analytic modeling may require the data to follow a particular type of distribution.
Through descriptive statistical analyses, the Profiler can determine the suitability of various data elements for model input and can suggest transformations required for these data elements.

In the Profiler's Descriptive Statistics analyses, NULL values are handled by the aggregate functions in the generated SQL. These functions ignore NULL values and adjust the number of observations used in each calculation accordingly, so NULL values are effectively deleted from the analysis.
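The NULL handling described above can be illustrated with a minimal sketch. It uses an in-memory SQLite database as a stand-in for Teradata, on the assumption that both follow the standard SQL rule that aggregates such as COUNT(column), SUM, and AVG skip NULLs. The `accounts` table, its `balance` column, and the values are hypothetical, and the query is not the SQL that the Profiler actually generates.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Teradata (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (balance REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO accounts (balance) VALUES (?)",
    [(100.0,), (None,), (300.0,), (None,), (200.0,)],
)

# COUNT(*) counts every row; COUNT(balance), SUM and AVG skip NULLs,
# so the mean is computed over 3 observations rather than 5.
total_rows, non_null, total, mean = conn.execute(
    "SELECT COUNT(*), COUNT(balance), SUM(balance), AVG(balance) FROM accounts"
).fetchone()

print(total_rows, non_null, total, mean)
conn.close()
```

Here the mean is 600 / 3 rather than 600 / 5, which is the adjustment to the number of observations that the paragraph refers to.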

Teradata Profiler provides the following statistical functions to analyze data.

Statistical Function: Description
Values analysis: Counts the following kinds of values for a given column or columns.
  • Number of rows
  • Rows with non-null values
  • Rows with NULL values
  • Unique values
  • Rows with the value 0
  • Rows with a positive value
  • Rows with a negative value
  • Rows containing blank values
Statistical analysis: Determines the following statistics for numeric columns.
  • Minimum value
  • Maximum value
  • Mean value
  • Standard deviation
  • Skewness
  • Kurtosis
  • Standard mean error (standard error of the mean)
  • Coefficient of variance (coefficient of variation)
  • Variance
  • Sum
  • Uncorrected sum of squares
  • Corrected sum of squares
  • Values count
Frequency: Computes the frequency of column values and the frequency of values for columns in a single column list. Generates simple statistics for any other column within the table.
Histogram: Determines the distribution of a numeric column, giving counts with optional overlay counts and statistics.
Text Field Analyzer: Analyzes raw data to determine its actual data type.
Scatter Plot: Plots sampled values of two or three variables in 2-D
Overlap: Counts overlapping column values in combinations of tables. Finds key values in common between tables.
Data Explorer: Automates exploration of tables or views within an entire database.
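As a rough illustration of what the Values analysis and Statistical analysis entries compute, the following sketch derives most of the listed measures in plain Python. It is not the Profiler's implementation, which generates SQL. The function names and sample data are hypothetical. The sketch assumes sample (n - 1) formulas for variance and standard deviation, a coefficient of variation expressed as a percentage, and a unique-value count that excludes NULLs; the Profiler's exact conventions are not stated here. Blank-value counts apply to character columns and are left out, as are skewness and kurtosis, whose definitions vary between tools.

```python
import math

def values_analysis(column):
    """Count the value categories for a numeric column (None stands for NULL)."""
    non_null = [v for v in column if v is not None]
    return {
        "rows": len(column),
        "non_null": len(non_null),
        "null": len(column) - len(non_null),
        "unique": len(set(non_null)),  # assumption: NULL is not counted as a value
        "zero": sum(1 for v in non_null if v == 0),
        "positive": sum(1 for v in non_null if v > 0),
        "negative": sum(1 for v in non_null if v < 0),
    }

def statistical_analysis(column):
    """Basic statistics over the non-null values.

    Requires at least two non-null values and a non-zero mean.
    """
    x = [v for v in column if v is not None]
    n = len(x)
    total = sum(x)
    mean = total / n
    uncorrected_ss = sum(v * v for v in x)          # sum of x^2
    corrected_ss = sum((v - mean) ** 2 for v in x)  # sum of (x - mean)^2
    variance = corrected_ss / (n - 1)               # sample variance (assumption)
    std_dev = math.sqrt(variance)
    return {
        "count": n,
        "min": min(x),
        "max": max(x),
        "sum": total,
        "mean": mean,
        "uncorrected_ss": uncorrected_ss,
        "corrected_ss": corrected_ss,
        "variance": variance,
        "std_dev": std_dev,
        "std_error_of_mean": std_dev / math.sqrt(n),
        "coeff_of_variation_pct": 100 * std_dev / mean,
    }

# Hypothetical column with two NULLs.
balances = [2, None, 4, 4, 0, -2, None, 10]
counts = values_analysis(balances)
stats = statistical_analysis(balances)
print(counts)
print(stats)
```

Both functions drop NULLs before computing anything, mirroring the way the SQL aggregate functions adjust the number of observations.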