When Reference Data Does Not Fit in Memory - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17
dita:mapPath: uce1497542673292.ditamap
dita:ditavalPath: AA-notempfilter_pdf_output.ditaval
dita:id: B700-1022
lifecycle: previous
Product Category: Software

Use this syntax when neither input fits in memory. Partition both inputs by a categorical attribute (for example, age range) so that records are distributed to workers by that attribute and each source record is compared only with the reference records in the same partition. If the categorical attribute has n values and each value accounts for the same number of records, the number of comparisons is reduced to 1/n of the unpartitioned total.
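The 1/n figure can be checked with a short calculation. The symbols N_a and N_b (row counts of the source and reference inputs) are introduced here only for illustration, and the derivation assumes that both inputs are spread evenly across the n attribute values.

```latex
% Without partitioning, every source row is compared with every reference row:
C_{\text{all}} = N_a \, N_b

% With n equal-sized partitions, each partition holds N_a/n source rows
% and N_b/n reference rows, and rows are compared only within a partition:
C_{\text{part}} = n \cdot \frac{N_a}{n} \cdot \frac{N_b}{n} = \frac{N_a \, N_b}{n}

% Ratio of partitioned to unpartitioned comparisons:
\frac{C_{\text{part}}}{C_{\text{all}}} = \frac{1}{n}
```

For example, with N_a = N_b = 1,000,000 and n = 10, the comparison count drops from 10^12 to 10^11. If the attribute values are unevenly distributed, the reduction is smaller than 1/n.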

Version 1.1
SELECT * FROM IdentityMatch (
  ON source_input_table AS a PARTITION BY key 
  ON reference_input_table AS b PARTITION BY key 
  IDColumn ('a.id_column: b.id_column')
  { NominalMatchColumns ('a.columnX: b.columnY' [,...]) |
    FuzzyMatchColumns ('a.columnX: b.columnY, match_metric,
      match_weight [, synonym_file ]' [,...]) }
  [ Accumulate ('{a|b}.accumulate_column' [,...])]
  [ Threshold ('threshold') ]
);
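
The following call is a minimal sketch of how the syntax above might be filled in, not an example from the product documentation. The table names (customers_internal, customers_external) and column names (id, lastname, age_range) are hypothetical. The metric JARO-WINKLER, the weight 3, and the threshold 0.5 are illustrative assumptions; check the argument reference of the function for the supported match_metric values.

```sql
-- Hypothetical tables and columns; both inputs are partitioned by the same
-- categorical attribute (age_range), so rows are compared only within a range.
SELECT * FROM IdentityMatch (
  ON customers_internal AS a PARTITION BY age_range
  ON customers_external AS b PARTITION BY age_range
  IDColumn ('a.id: b.id')
  -- Assumed metric and weight, shown only to illustrate the argument layout
  FuzzyMatchColumns ('a.lastname: b.lastname, JARO-WINKLER, 3')
  Accumulate ('a.lastname', 'b.lastname')
  Threshold ('0.5')
);
```

Both ON clauses name the same partitioning column, so each worker receives the source and reference rows for the same attribute value together.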