Text Field analysis examines character data and determines whether the data can be stored in the database as a numeric type, a date, a time, or a timestamp, or whether it should remain character data.
You can apply Text Field analysis to columns of any data type. If a column contains noncharacter data, Text Field analysis does not analyze it and simply copies it from the input table to the output table.
Given an input table name and one or more column names, the Vantage Analytics Library textfieldanalyzer function follows this procedure for each column:
- Retrieves the minimum and maximum values of the column and tests the values to determine their data type.
- Retrieves a small sample of values and tests the values to determine their data type.
- Tries to further classify columns classified as numeric.
  For example, if extendednumericanalysis=true, a field that the first two tests classified as FLOAT might be a DECIMAL type with two decimal places.
- Checks that all values in columns classified as date types are valid dates.
- If extendedunicodeanalysis=true, tests Unicode columns to determine whether they contain only Latin characters.
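
The procedure above can be illustrated with a small Python sketch. This is not the textfieldanalyzer implementation, which runs in the database. It is a simplified model that works on an in-memory list of strings. The function names (`analyze_column`, `classify_value`, `decimal_scale`, `is_latin_only`), the result labels, the ISO-only date format, and the reading of "Latin" as the Latin-1 range are all assumptions made for illustration.

```python
import random
import re
from datetime import datetime

INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][+-]?\d+)?")
FIXED_RE = re.compile(r"[+-]?(\d*)(?:\.(\d*))?")  # fixed-point, no exponent


def is_date(text):
    # Assumption: only ISO dates (YYYY-MM-DD) count as dates in this sketch.
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def classify_value(text):
    """Classify one string as INTEGER, FLOAT, DATE, or CHAR."""
    s = text.strip()
    if INT_RE.fullmatch(s):
        return "INTEGER"
    if FLOAT_RE.fullmatch(s):
        return "FLOAT"
    if is_date(s):
        return "DATE"
    return "CHAR"


def merge_types(types):
    """Combine per-value guesses into one column-level guess."""
    kinds = set(types)
    if kinds == {"INTEGER"}:
        return "INTEGER"
    if kinds and kinds <= {"INTEGER", "FLOAT"}:
        return "FLOAT"
    if kinds == {"DATE"}:
        return "DATE"
    return "CHAR"


def decimal_scale(values):
    """Return the largest count of decimal places if every value is
    fixed-point, otherwise None."""
    scale = 0
    for v in values:
        m = FIXED_RE.fullmatch(v)
        if not m or not (m.group(1) or m.group(2)):
            return None
        scale = max(scale, len(m.group(2) or ""))
    return scale


def is_latin_only(values):
    # Assumption: "Latin" is modeled as code points U+0000 through U+00FF.
    return all(ord(ch) <= 0xFF for v in values for ch in v)


def analyze_column(values, sample_size=5, extended_numeric=False, seed=0):
    values = [v.strip() for v in values]
    if not values:
        return "CHAR"
    # Test the minimum and maximum values (string ordering).
    probes = [min(values), max(values)]
    # Test a small sample of values.
    rng = random.Random(seed)
    probes += rng.sample(values, min(sample_size, len(values)))
    guess = merge_types(classify_value(v) for v in probes)
    # Optionally refine a FLOAT guess into a DECIMAL with a fixed scale.
    if guess == "FLOAT" and extended_numeric:
        scale = decimal_scale(values)
        if scale is not None:
            guess = f"DECIMAL(scale={scale})"
    # Confirm that every value in a date column is a valid date.
    if guess == "DATE" and not all(is_date(v) for v in values):
        guess = "CHAR"
    return guess
```

For instance, under these assumptions `analyze_column(["12.50", "3.75", "100.00"], extended_numeric=True)` refines the initial FLOAT guess to a DECIMAL with two decimal places, mirroring the example given for extendednumericanalysis=true. The sketch tests only a few values first and scans the whole column only for the refinement and validation steps, which is the same ordering the procedure describes.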