5.4.5 - Optimizing Performance of Clustering - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Product
Teradata Warehouse Miner
Release Number
5.4.5
Published
February 2018
Language
English (United States)
Last Update
2018-05-04

Parallel execution of SQL is an important feature of the cluster analysis algorithm in Teradata Warehouse Miner, as it is of Teradata itself. The number of variables clustered in parallel is determined by the ‘width’ parameter. The optimum value of width depends on the size of the Teradata system, its memory, and similar system characteristics. Experience has shown that when clustering on a large number of variables, the optimum value of width is in the range 20 to 25. At run time, the width is set to the lesser of the specified Width option (default = 25) and the number of columns, and it can never exceed 118. If SQL errors indicate insufficient memory, reducing the width parameter may alleviate the problem.
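
The rule for the width actually used can be illustrated with a short Python sketch. This is not Teradata Warehouse Miner code: the function name `effective_width` and its parameter names are hypothetical, chosen only for illustration. The default of 25 and the ceiling of 118 are the values stated above.

```python
# Illustrative sketch only; not part of Teradata Warehouse Miner.
# effective_width and its parameter names are hypothetical.

DEFAULT_WIDTH = 25   # default value of the Width option, as stated above
MAX_WIDTH = 118      # the width can never exceed this value, as stated above


def effective_width(num_columns, specified_width=DEFAULT_WIDTH):
    """Return the width that would be used for a clustering run.

    The result is the lesser of the specified Width option and the
    number of columns being clustered, capped at MAX_WIDTH.
    """
    if num_columns < 1 or specified_width < 1:
        raise ValueError("num_columns and specified_width must be positive")
    return min(specified_width, num_columns, MAX_WIDTH)


if __name__ == "__main__":
    print(effective_width(10))        # fewer columns than the default width
    print(effective_width(200))       # default width applies
    print(effective_width(200, 150))  # specified width above the ceiling
```

With 10 columns and the default option the width is 10; with 200 columns it is 25; and a specified width of 150 on 200 columns is capped at 118. If memory errors occur, the same reasoning suggests passing a smaller specified width, for example 20.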