15.00 - EDITDISTANCE - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)

EDITDISTANCE

Purpose  

Returns the minimum number of edit operations (insertions, deletions, substitutions and transpositions) required to transform string1 into string2.

Syntax  

where:

 

Syntax element…

Specifies…

TD_SYSFNLIB

the name of the database where the function is located.

string1

a character string or string expression.

string2

a character string or string expression.

ci

the relative cost of an insert operation.

ci cannot be a negative value. If not specified, a default value of 1 is used.

cd

the relative cost of a delete operation.

cd cannot be a negative value. If not specified, a default value of 1 is used.

cs

the relative cost of a substitute operation.

cs cannot be a negative value. If not specified, a default value of 1 is used.

ct

the relative cost of a transpose operation.

ct cannot be a negative value. If not specified, a default value of 1 is used.

ANSI Compliance

This is a Teradata extension to the ANSI SQL:2011 standard.

Invocation

EDITDISTANCE is an embedded services system function. For information on activating and invoking embedded services functions, see “Embedded Services System Functions” on page 24.

Argument Types and Rules

Expressions passed to this function must have the following data types:

  • string1 = CHAR, VARCHAR, or CLOB
  • string2 = CHAR, VARCHAR, or CLOB
  • ci = INTEGER
  • cd = INTEGER
  • cs = INTEGER
  • ct = INTEGER
  • You can also pass arguments with data types that can be converted to the above types using the implicit data type conversion rules that apply to UDFs.

    Note: The UDF implicit type conversion rules are more restrictive than the implicit type conversion rules normally used by Teradata Database. If an argument cannot be converted to the required data type following the UDF implicit conversion rules, it must be explicitly cast.

    For details, see “Compatible Types” in SQL External Routine Programming.

    Result Type

    The result data type is INTEGER.

    Usage Notes  

    EDITDISTANCE measures the similarity between two strings. A low number of deletions, insertions, substitutions or transpositions implies a high similarity. The insertions, deletions, substitutions, and transpositions are based on the Damerau-Levenshtein Distance algorithm with modifications for costed operations.

    If either string1 or string2 is NULL, the function returns NULL.

    Example  

    The following query returns a result of 9.

       SELECT EDITDISTANCE('Jim D. Swain', 'John Smith'); 

    Example  

    The following query returns a result of 0 since the strings are the same.

       SELECT EDITDISTANCE('John Smith', 'John Smith'); 

    Example  

    The following query returns a result of 9.

       SELECT EDITDISTANCE('Jim D. Swain', 'John Smith', 2, 1, 1, 2); 

    Example  

    The following query returns a result of 11.

       SELECT EDITDISTANCE('John Smith', 'Jim D. Swain', 2, 1, 1, 2);