IdentityMatch Function Syntax: When Reference Data Does Not Fit in Memory

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:id
B700-4003
Product Category
Teradata Vantage™

Use this syntax when neither input fits in memory. Partition both inputs by a categorical attribute (for example, age range) so that records are distributed to workers by that attribute and each source record is compared only with reference records in the same partition. If the categorical attribute has n values and the partitions are of equal size, the number of comparisons falls to 1/n of the unpartitioned total. In the syntax, a and b refer to the SourceTable and ReferenceTable, respectively.
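
The 1/n figure can be checked with a short count. This is a sketch under stated assumptions: the symbols S and R (row counts of the SourceTable and ReferenceTable) are introduced here for illustration and do not appear in the original text, and every partition is assumed to hold exactly S/n source rows and R/n reference rows.

```latex
% Unpartitioned: every source row is compared with every reference row.
C_{\text{full}} = S \cdot R

% Partitioned on an attribute with n equally sized values:
% each of the n partitions compares S/n source rows with R/n reference rows.
C_{\text{part}} = n \cdot \frac{S}{n} \cdot \frac{R}{n} = \frac{S \cdot R}{n}

% Ratio of partitioned to unpartitioned work.
\frac{C_{\text{part}}}{C_{\text{full}}} = \frac{1}{n}
```

When the attribute values are unevenly distributed, the largest partitions dominate and the reduction is smaller than 1/n.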

Version 1.13

SELECT * FROM IdentityMatch (
  ON source_input_table AS SourceTable PARTITION BY key 
  ON reference_input_table AS ReferenceTable PARTITION BY key
  USING
  IDColumn ('a.id_column: b.id_column')
  { NominalMatchColumns ('a.columnX: b.columnY' [,...]) |
    FuzzyMatchColumns ('a.columnX: b.columnY, match_metric, match_weight [, synonym_file ]' [,...])
  }
  [ NullHandling ({ 'mismatch' | 'match-if-null' | 'match-if-both-null' }) ]
  [ Accumulate ('{a|b}.accumulate_column' [,...]) ]
  [ ThresholdScore (threshold) ]
) AS alias;
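
A call that follows this syntax might look like the sketch below. Everything specific in it is an assumption made for illustration: the table names (applicant_source, applicant_reference), the column names (id, age_range, lastname), the metric name JARO-WINKLER with weight 3, and the threshold value 0.5. Check the metric names and argument values against the function's argument reference before use.

```sql
-- Hypothetical tables and columns; both inputs are partitioned on the same
-- categorical attribute so matching runs within each age range only.
SELECT * FROM IdentityMatch (
  ON applicant_source AS SourceTable PARTITION BY age_range
  ON applicant_reference AS ReferenceTable PARTITION BY age_range
  USING
  IDColumn ('a.id: b.id')
  -- Assumed metric name and weight; a is SourceTable, b is ReferenceTable.
  FuzzyMatchColumns ('a.lastname: b.lastname, JARO-WINKLER, 3')
  NullHandling ('mismatch')
  Accumulate ('a.lastname', 'b.lastname')
  ThresholdScore (0.5)
) AS dt;
```

Both ON clauses use the same partition column, so a source row and a reference row can only be compared when they share an age_range value. Records that belong together but carry different values of the partitioning attribute are never compared.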