5.4.5 - Cluster - INPUT - Expert Options - Teradata Warehouse Miner

Teradata Warehouse Miner User Guide - Volume 3 Analytic Functions

Teradata Warehouse Miner
Release Number
February 2018
English (United States)
Last Update

This screen does not apply to the Fast K-Means algorithm.

  1. On the Clustering dialog box, click INPUT.
  2. Click Expert Options.
    Clustering > Input > Expert Options

  3. On this screen, select:
    • Width — Number of variables to process in parallel (dependent on system limits).
    • Input Sample Fraction — Fraction of input dataset to cluster on.
    • Scale Factor Exponent — If a nonzero value “s” is entered, this option overrides automatic scaling and scales by 10^s.
    • Minimum Probability Exponent — If “e” is entered, the Clustering analysis uses 10^e as the smallest nonzero number in SQL calculations.
    • Minimum Variance Exponent — If “v” is entered, the Clustering analysis uses 10^v as the minimum variance in SQL calculations.
    • Use single cluster covariance — Simplified model that uses the same covariance table for all clusters.
    • Use Random Seeding — When enabled (the default), this option seeds the initial clustering answer matrix by randomly selecting one row per cluster as the seed. According to the literature, this is the most commonly used seeding method in other clustering systems. As a byproduct, successive clustering runs produce slightly different solutions, and convergence may be quicker because fewer iterations may be required.
    • Seed Sample Percentage — If Use Random Seeding is disabled, the previous seeding method of Teradata Warehouse Miner Clustering is used, in which every row is assigned to one of the clusters and the cluster averages are then used as the seeds. Enter a percentage (1-100) of the input dataset to use as the starting seed.
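
The two seeding options and the exponent options above can be illustrated with a small Python sketch. This is not the product's SQL implementation; all function names are hypothetical, and the round-robin assignment of rows to clusters is an assumption, since the guide does not say how rows are assigned when random seeding is disabled. The sketch also assumes the input has at least as many rows as clusters.

```python
import random


def floor_from_exponent(exponent):
    """Return 10**exponent, e.g. the smallest nonzero number or minimum variance."""
    return 10.0 ** exponent


def apply_floor(value, exponent):
    """Clamp a computed value so it never falls below 10**exponent."""
    return max(value, floor_from_exponent(exponent))


def random_seeds(rows, k, rng):
    """Random seeding: one randomly selected row per cluster becomes its seed."""
    return [list(row) for row in rng.sample(rows, k)]


def averaged_seeds(rows, k, seed_sample_pct, rng):
    """Averaged seeding: assign every sampled row to a cluster, then average.

    The round-robin assignment below is an assumption made for illustration.
    """
    # Take the requested percentage of rows, but never fewer than k rows.
    n = min(len(rows), max(k, round(len(rows) * seed_sample_pct / 100)))
    sample = rng.sample(rows, n)
    groups = [sample[i::k] for i in range(k)]
    # Each seed is the per-column average of the rows assigned to that cluster.
    return [[sum(col) / len(group) for col in zip(*group)] for group in groups]


rows = [(float(i), float(2 * i)) for i in range(20)]
rng = random.Random(0)  # fixed seed only to make this sketch repeatable
print(random_seeds(rows, 3, rng))        # three actual input rows
print(averaged_seeds(rows, 3, 50, rng))  # three averaged points
print(apply_floor(0.0, -6))              # a zero variance raised to 10^-6
```

With random seeding, each seed is an actual input row, so different runs start from different points. With averaged seeding, each seed is a mean over many rows, so the starting points tend to lie closer to the center of the data.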