Text Field Analyzer - Teradata Warehouse Miner

Teradata® Warehouse Miner™ User Guide - Volume 1: Introduction and Profiling

Product
Teradata Warehouse Miner
Release Number
5.4.6
Published
November 2018
Language
English (United States)
Last Update
2018-12-07
dita:id
B035-2300
Product Category
Software

When working with character data, it is sometimes helpful to examine the data and determine which data type it could actually be stored as in the database. The Text Field Analyzer analysis examines character data and helps determine whether a field holds numeric, date, time, timestamp, or genuinely character data. Text field analysis can be applied to any character column. Columns of non-character data types are not processed and are passed along to the output just as they are defined in the input table.
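
As a rough illustration of this kind of type detection, the following Python sketch classifies a single character value into one of the categories named above. It is not the product's implementation: the function name `classify_value`, the category labels, and the accepted date, time, and timestamp formats are all assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed formats for illustration only; the formats the product
# actually recognizes are not stated in this topic.
DATE_FORMATS = ("%Y-%m-%d",)
TIME_FORMATS = ("%H:%M:%S",)
TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S",)

# Plain decimal or scientific notation, with an optional sign.
NUMERIC_PATTERN = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def _matches(text, formats):
    """Return True if text parses under any of the given strptime formats."""
    for fmt in formats:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def classify_value(text):
    """Guess a type category for one character value (hypothetical helper)."""
    value = text.strip()
    if not value:
        return "CHARACTER"
    if _matches(value, TIMESTAMP_FORMATS):
        return "TIMESTAMP"
    if _matches(value, DATE_FORMATS):
        return "DATE"
    if _matches(value, TIME_FORMATS):
        return "TIME"
    if NUMERIC_PATTERN.fullmatch(value):
        return "NUMERIC"
    return "CHARACTER"
```

A value that matches none of the assumed formats, such as `"2018-13-40"`, falls through to `"CHARACTER"`, which mirrors the idea that a field is only reported as a more specific type when its contents support it.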

Given a table name and the name of a column, the Text Field Analyzer analysis runs a series of tests to determine what the correct underlying type should be:
  1. A MIN and MAX test, in which the MIN and MAX values of the column are retrieved from the database and tested to determine what type they are.
  2. A sample test, which retrieves a small sample of data for each column and again assesses what type the values should be.
  3. A refinement test for fields that have already been determined to be numeric, which tries to classify them in a more specific category if possible. For instance, a field that is considered a FLOAT type after the first two tests might really be a DECIMAL type with 2 decimal places.

    A field classified as a date type is validated to make sure that all values in the column are truly dates.
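
The numeric refinement in the third test and the date validation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the product's algorithm: the function names `refine_numeric` and `all_dates`, the `INTEGER` category, the rule that scientific notation stays FLOAT, the 38-digit precision limit, and the `YYYY-MM-DD` date format are all hypothetical choices made for the example.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def refine_numeric(values, max_precision=38):
    """Suggest a narrower type for strings already judged to be numeric.

    Assumption: values written in scientific notation, or needing more
    than max_precision digits, are left as FLOAT.
    """
    if not values:
        return "FLOAT"  # nothing to refine
    max_int_digits = 1
    max_scale = 0
    for text in values:
        cleaned = text.strip()
        if "e" in cleaned.lower():
            return "FLOAT"
        try:
            number = Decimal(cleaned)
        except InvalidOperation:
            return "FLOAT"
        if not number.is_finite():
            return "FLOAT"
        _, digits, exponent = number.as_tuple()
        # Digits after the decimal point, and digits before it (at least 1).
        max_scale = max(max_scale, -exponent)
        max_int_digits = max(max_int_digits, len(digits) + exponent)
    precision = max_int_digits + max_scale
    if precision > max_precision:
        return "FLOAT"
    if max_scale == 0:
        return "INTEGER"
    return f"DECIMAL({precision},{max_scale})"


def all_dates(values, fmt="%Y-%m-%d"):
    """Return True only if every value parses as a date in the assumed format."""
    for text in values:
        try:
            datetime.strptime(text.strip(), fmt)
        except ValueError:
            return False
    return True
```

For example, `refine_numeric(["12.50", "3.1", "100"])` suggests `DECIMAL(5,2)`, because the widest value needs three digits before the decimal point and two after it, while `all_dates(["2018-11-01", "2018-02-30"])` returns `False` because the second value is not a real calendar date.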

For general information about output, see OUTPUT Tab.