Text Field Analyzer

Teradata Warehouse Miner User Guide - Volume 1: Introduction and Profiling

brand: Software
prodname: Teradata Warehouse Miner
vrm_release: 5.4.4
category: User Guide
featnum: B035-2300-077K

When working with character data, it is sometimes helpful to examine the values and determine which data type they could actually be stored as in the database. The Text Field Analyzer analysis examines character data and helps determine whether a field holds a numeric type, a date, a time, a timestamp, or genuinely character data. Text field analysis can be applied to any type of character data. Columns of non-character data types are not processed; they are passed along to the output just as they are defined in the input table.
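As a rough illustration of the kind of distinction described above, the following Python sketch classifies a single character value as numeric, date, time, timestamp, or character. It is not the product's implementation: the function name `classify_value`, the numeric pattern, and the ISO-style formats are all assumptions made for this example, since the formats the analysis actually recognizes are not listed here.

```python
import re
from datetime import datetime

# Assumed numeric syntax: optional sign, digits, optional fraction/exponent.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")

# Assumed (hypothetical) date/time layouts, tried in this order.
_FORMATS = [
    ("timestamp", "%Y-%m-%d %H:%M:%S"),
    ("date", "%Y-%m-%d"),
    ("time", "%H:%M:%S"),
]


def classify_value(text):
    """Return the most specific category that a character value fits."""
    s = text.strip()
    if not s:
        return "character"
    if _NUMERIC.fullmatch(s):
        return "numeric"
    for label, fmt in _FORMATS:
        try:
            # strptime rejects impossible values such as February 30.
            datetime.strptime(s, fmt)
            return label
        except ValueError:
            continue
    return "character"
```

For example, `classify_value("12.50")` falls into the numeric category, while a value such as `"2024-02-30"` matches the date layout but is not a real calendar date, so it is treated as character data.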

Given a table name and the name of a column, the Text Field Analyzer analysis runs a series of tests to determine what the correct underlying type should be.
  1. A MIN and MAX test, in which the MIN and MAX values of the column are retrieved from the database and tested to determine what type they are.
  2. A sample test, which retrieves a small sample of data for each column and again assesses what type the values should be.
  3. A test applied to fields that have already been determined to be numeric, which tries to classify them into a more specific category if possible. For instance, a field that is considered a FLOAT type after the first two tests might really be a DECIMAL type with 2 decimal places.

    A date type is validated to make sure all values in that column are truly dates.
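The numeric refinement and the date validation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the product's actual logic: the helper names `refine_numeric` and `all_valid_dates`, the rule that exponent notation implies FLOAT, and the default date layout are all hypothetical.

```python
from datetime import datetime
from decimal import Decimal


def refine_numeric(values):
    """Suggest INTEGER, DECIMAL(p,s), or FLOAT for strings already known to be numeric."""
    max_int_digits = 0
    max_scale = 0
    for raw in values:
        s = raw.strip()
        # Assumption: exponent notation is treated as a true floating-point value.
        if "e" in s.lower():
            return "FLOAT"
        _sign, digits, exponent = Decimal(s).as_tuple()
        scale = max(0, -exponent)                  # digits after the decimal point
        int_digits = max(0, len(digits) - scale)   # digits before the decimal point
        max_scale = max(max_scale, scale)
        max_int_digits = max(max_int_digits, int_digits)
    if max_scale == 0:
        return "INTEGER"
    return f"DECIMAL({max_int_digits + max_scale},{max_scale})"


def all_valid_dates(values, fmt="%Y-%m-%d"):
    """Return True only if every value parses as a real calendar date."""
    for raw in values:
        try:
            datetime.strptime(raw.strip(), fmt)
        except ValueError:
            return False
    return True
```

In this sketch, a column containing `"12.50"`, `"3.1"`, and `"100"` would be suggested as `DECIMAL(5,2)`: the widest integer part has three digits and the longest fraction has two. A single invalid value is enough for `all_valid_dates` to reject the date classification for the whole column.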

For general information about output, see OUTPUT Tab.