5.4.5 - Clustering and Data Problems - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3: Analytic Functions

Product: Teradata Warehouse Miner
Release Number: 5.4.5
Published: February 2018
Language: English (United States)
Last Update: 2018-05-04

Common data problems for cluster analysis include too few rows for the number of clusters requested and constants in the data, which result in singular covariance matrices. When these problems occur, warning messages and recommendations are provided. An option for handling null values during processing is described in Clustering and Null Values.
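To illustrate why constants in the data lead to a singular covariance matrix, the following is a minimal sketch in Python with NumPy. It is not part of Teradata Warehouse Miner. The helper names `constant_columns` and `covariance_is_singular`, the column names, and the sample values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def constant_columns(data, names):
    """Return the names of columns whose values never vary (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    return [name for name, col in zip(names, data.T) if col.max() == col.min()]


def covariance_is_singular(data):
    """Return True when the sample covariance matrix is rank deficient."""
    data = np.asarray(data, dtype=float)
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    return bool(np.linalg.matrix_rank(cov) < cov.shape[0])


# Illustrative data: the third column is a constant.
names = ["income", "age", "region_code"]
rows = np.array([
    [1.0, 10.0, 5.0],
    [2.0, 12.0, 5.0],
    [3.0, 9.0, 5.0],
    [4.0, 15.0, 5.0],
])

print(constant_columns(rows, names))
print(covariance_is_singular(rows))          # constant column included
print(covariance_is_singular(rows[:, :2]))   # constant column dropped
```

A constant column has zero variance and zero covariance with every other column, so its row and column in the covariance matrix are all zeros and the matrix cannot be inverted. Dropping such columns before clustering is one way to avoid the problem in this sketch.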

Additionally, Teradata errors can occur for non-normalized data with more than 15 significant digits. In this case, a preprocessing step that multiplies small values or divides large values by a constant may correct the overflow or underflow condition. Because rescaling only changes the unit of measure, the clusters remain the same.
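The claim that rescaling leaves the clusters unchanged can be checked with a small sketch. It assumes the same constant is applied to every column, and it uses a plain k-means loop in Python with NumPy as a stand-in, not the clustering algorithm implemented in Teradata Warehouse Miner. The function `kmeans_labels`, the sample values, and the divisor are hypothetical. The sketch does not reproduce the Teradata error itself; it only shows that cluster membership is unaffected by the change of unit.

```python
import numpy as np


def kmeans_labels(data, k, iterations=20):
    """Plain Lloyd's k-means; the first k rows seed the centroids."""
    data = np.asarray(data, dtype=float)
    centroids = data[:k].copy()
    labels = np.zeros(len(data), dtype=int)
    for _ in range(iterations):
        # Squared Euclidean distance from every row to every centroid.
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return labels


# Illustrative large-magnitude values forming two well-separated groups.
raw = np.array([
    [1.0e17, 2.0e17],
    [9.0e17, 8.0e17],
    [1.2e17, 1.8e17],
    [8.8e17, 8.3e17],
    [0.9e17, 2.1e17],
    [9.1e17, 7.9e17],
])

# Preprocessing step: divide every value by the same constant.
scaled = raw / 1.0e15

labels_raw = kmeans_labels(raw, 2)
labels_scaled = kmeans_labels(scaled, 2)
print(np.array_equal(labels_raw, labels_scaled))
```

Dividing every value by the same constant scales all pairwise distances by that constant, so each row stays closest to the same centroid. In a database setting the equivalent step would be applied to the input columns before the clustering analysis is run.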